Data augmentation for Multi-Classification of Non-Functional Requirements - Dataset
Description
There are four dataset files, described below in three groups:
1. Dataset_structure indicates the structure of the datasets, such as column name, type, and value.
2. Spanish_promise_exp_nfr_train and Spanish_promise_exp_nfr_test contain the non-functional requirements of the Promise_exp [1] dataset, translated into Spanish.
3. Balanced_promise_exp_nfr_train is a balanced version of Spanish_promise_exp_nfr_train. Data augmentation with ChatGPT was applied to increase the number of requirements in categories with little data, and random undersampling was used to remove requirements from over-represented categories.
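As an illustration of the random undersampling step mentioned in item 3, the following is a minimal sketch and not the procedure actually used to build Balanced_promise_exp_nfr_train. The function name `random_undersample`, the column names `text` and `label`, the per-class cap, and the seed are all assumptions made for the example; the real column names are listed in Dataset_structure.

```python
import random
from collections import defaultdict


def random_undersample(rows, label_key, max_per_class, seed=0):
    """Keep at most max_per_class randomly chosen rows for each label."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable

    # Group the rows by their category label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    balanced = []
    for group in by_label.values():
        # Only classes above the cap are reduced; smaller ones are kept whole.
        if len(group) > max_per_class:
            group = rng.sample(group, max_per_class)
        balanced.extend(group)
    return balanced


# Toy rows; "text" and "label" are hypothetical column names.
rows = (
    [{"text": f"usability requirement {i}", "label": "US"} for i in range(5)]
    + [{"text": f"legal requirement {i}", "label": "L"} for i in range(2)]
)
balanced = random_undersample(rows, "label", max_per_class=3)
```

In this toy input the US class is cut down to the cap while the smaller L class is left untouched; categories below the cap would be the ones that need augmentation instead.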
The labeling schema, similar to PROMISE NFR, includes the following categories: A: Availability, PO: Portability, L: Legal, FT: Fault tolerance, SC: Scalability, MN: Maintainability, LF: Look and feel, PE: Performance, O: Operational, US: Usability, and SE: Security.
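The label codes above can be turned into a lookup table when checking the class distribution of a file. The sketch below is illustrative only: the header names `requirement` and `label` and the sample rows are assumptions, so consult Dataset_structure for the actual column names before adapting it.

```python
import csv
import io
from collections import Counter

# Category codes as listed in the labeling schema.
NFR_LABELS = {
    "A": "Availability",
    "PO": "Portability",
    "L": "Legal",
    "FT": "Fault tolerance",
    "SC": "Scalability",
    "MN": "Maintainability",
    "LF": "Look and feel",
    "PE": "Performance",
    "O": "Operational",
    "US": "Usability",
    "SE": "Security",
}


def label_counts(csv_file, label_column):
    """Count requirements per category name in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(NFR_LABELS[row[label_column]] for row in reader)


# In-memory stand-in for a real file; header names and rows are hypothetical.
sample = io.StringIO(
    "requirement,label\n"
    "El sistema debe cifrar las credenciales.,SE\n"
    "El sistema debe registrar cada acceso.,SE\n"
    "La interfaz debe ser intuitiva.,US\n"
)
counts = label_counts(sample, "label")
```

To run the same count on a downloaded file, pass an open file handle instead of the in-memory sample, for example one for Balanced_promise_exp_nfr_train.csv opened with `newline=""` and the file's actual encoding.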
Files
Balanced_promise_exp_nfr_train.csv
Additional details
References
- [1] Lima, M., Valle, V., Costa, E., Lira, F., & Gadelha, B. (2019, September). Software engineering repositories: expanding the promise database. In Proceedings of the XXXIII Brazilian Symposium on Software Engineering (pp. 427-436).