Supplementary Material of "NoRBERT: Transfer Learning for Requirements Classification"

doi:10.5281/zenodo.3874137

Published May 21, 2020 | Version v3

Other Open


1. Karlsruhe Institute of Technology (KIT)

This is the supplementary material of the paper "NoRBERT: Transfer Learning for Requirements Classification" at RE20.

In this paper, we explore the performance of transfer learning (with Google's language model BERT) on different requirements classification tasks. The paper focuses in particular on the performance on projects that were completely unseen during training.
Additionally, we developed a new dataset based on the Promise NFR dataset that includes a more fine-grained labeling of functional requirements based on their concerns (Function, Data, Behavior).
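
Evaluating on projects that are completely unseen during training can be illustrated with a leave-one-project-out split, in which every requirement of one project is held out for testing. The following is a minimal sketch under assumptions, not the code from the supplied notebooks: the record fields `project`, `text`, and `label` and the toy requirements are hypothetical.

```python
from collections import defaultdict


def leave_one_project_out(records):
    """Yield (held_out_project, train, test) so that no project appears in both sets."""
    by_project = defaultdict(list)
    for record in records:
        by_project[record["project"]].append(record)

    for held_out in sorted(by_project):
        test = by_project[held_out]
        # Training data: every requirement from all other projects.
        train = [
            record
            for project, group in by_project.items()
            if project != held_out
            for record in group
        ]
        yield held_out, train, test


# Hypothetical toy data; the real datasets ship with the notebooks.
records = [
    {"project": 1, "text": "The system shall export reports as PDF.", "label": "F"},
    {"project": 1, "text": "The system shall respond within 2 seconds.", "label": "NFR"},
    {"project": 2, "text": "Users shall be able to reset their password.", "label": "F"},
    {"project": 3, "text": "The product shall be available 99% of the time.", "label": "NFR"},
]

splits = list(leave_one_project_out(records))
for held_out, train, test in splits:
    print(f"held-out project {held_out}: {len(train)} train / {len(test)} test")
```

A classifier trained on `train` never sees a requirement from the held-out project, which is the property that an ordinary random split does not guarantee.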

This repository contains the datasets and code used in the paper, as well as additional results:

Dataset contains the labeled dataset for the classification of functional requirement concerns (based on the Promise NFR dataset) as well as information about our labeling (results of each annotator and Krippendorff's alpha, KALPHA).
Code contains the Python notebooks and datasets used for:
- Task 1: Binary F/NFR classification (on Promise NFR dataset)
- Task 2: Classification of most frequent NFR subclasses (on Promise NFR dataset)
- Task 3: Classification of all NFR subclasses (on Promise NFR dataset)
- Task 4: Functional and Quality aspects classification (on relabeled Promise NFR dataset)
- Task 5: Classification of functional requirement concerns (on functional concerns dataset)
- Notebooks that apply the pretrained models for each task to an input requirement, plus the pretrained models for each task
Results contains the results of all tested hyperparameter configurations for each task.
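
Krippendorff's alpha, mentioned above for the labeling, measures inter-annotator agreement as 1 minus the ratio of observed to expected disagreement. The sketch below computes it for nominal labels from a coincidence matrix. It is an illustration under assumptions: how the reported values were actually computed is not described here, and the function name and the example ratings are hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: one list of labels per rated item; missing ratings are simply omitted.
    """
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # items with fewer than two ratings cannot be paired
        # Every ordered pair of ratings from different annotators counts 1/(m-1).
        for a, b in permutations(ratings, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b)
    if expected == 0:
        return float("nan")  # undefined when only one label is ever used
    return 1.0 - (n - 1) * observed / expected


# Hypothetical ratings: two annotators label five requirements.
example_units = [
    ["Function", "Function"],
    ["Data", "Data"],
    ["Behavior", "Function"],
    ["Behavior", "Behavior"],
    ["Data", "Data"],
]
print(krippendorff_alpha_nominal(example_units))
```

A value of 1 indicates perfect agreement, and values near 0 indicate agreement no better than chance.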

Note that we cannot provide the actual models that were used to produce the results of the paper.
We used cross-validation experiments, which would result in a huge number of model files per experiment run on each task.
As the model files are quite large, this is not feasible.
The results can still be reproduced with the supplied notebooks.

Attribution (of datasets used):

The Promise Dataset can be attributed to Jane Cleland-Huang and was provided for the RE'17 Data Challenge.
Jane Cleland-Huang, Sepideh Mazrouee, Huang Liguo, & Dan Port. (2007). nfr [Data set]. Zenodo. Available: http://doi.org/10.5281/zenodo.268542
RE'17 Data Challenge: http://ctp.di.fct.unl.pt/RE2017/pages/submission/data_papers/
See also: Sayyad Shirabad, J. and Menzies, T.J. (2005) The PROMISE Repository of Software Engineering Databases. School of Information Technology and Engineering, University of Ottawa, Canada. Available: http://promise.site.uottawa.ca/SERepository

The relabeled dataset can be attributed to Dalpiaz et al.: F. Dalpiaz, D. Dell’Anna, F. B. Aydemir, and S. Çevikol, “explainable-re/re-2019-materials,” Jul. 2019. https://doi.org/10.5281/zenodo.3309669

Files (3.6 GB)

Name	MD5	Size
NoRBERT_pretrained_models.zip	abb275d0a09f53f4f3bb1c5c61f9b675	3.6 GB
NoRBERT_RE20_Paper65.zip	cc004263612ab3c5b02507d19b113202	239.9 kB
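
The MD5 checksums listed for the two archives can be used to check that a download is complete and uncorrupted. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_downloads` are hypothetical, and the sketch assumes the archives were saved under their original file names.

```python
import hashlib
from pathlib import Path

# Checksums as listed for this record.
EXPECTED_MD5 = {
    "NoRBERT_pretrained_models.zip": "abb275d0a09f53f4f3bb1c5c61f9b675",
    "NoRBERT_RE20_Paper65.zip": "cc004263612ab3c5b02507d19b113202",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected archive name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads(".").items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

Reading in chunks matters here because the pretrained-models archive is 3.6 GB and should not be loaded into memory at once.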

Additional details

References: Dataset: 10.5281/zenodo.3309582 (DOI); Dataset: 10.5281/zenodo.268542 (DOI)

