Natural Language Inference Dataset for Software Engineering
Authors/Creators
Description
Active research in requirements engineering and software engineering relies on Natural Language Processing (NLP) techniques to address domain-specific challenges and improve software quality. However, there is a dearth of effective Natural Language Inference (NLI) datasets for training neural network models to generate distributed sentence representations and tackle diverse NLP tasks. In this paper, we present an NLI dataset tailored specifically to software engineering, which enables neural network models to handle NLP tasks in this domain effectively. The dataset was created through meticulous annotation of text drawn from diverse sources, including software documentation, user guides, app reviews, and articles related to software systems. Our dataset maintains compatibility with existing NLI datasets such as Stanford Natural Language Inference (SNLI), so models built for those datasets can be adapted without additional preprocessing.
Files
| Name | Size |
|---|---|
| TrainNLI.txt (md5:c0f7548839d2bbdbd6e6f650d13c5848) | 376.6 MB |
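
Because the file is large, it can be worth checking a download against the MD5 checksum listed above before using it. The following is a minimal sketch using only Python's standard `hashlib` module; the local path `TrainNLI.txt` and the helper names `md5_of_file` and `verify_download` are assumptions made for illustration and are not part of the dataset record.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the Files table of this record.
EXPECTED_MD5 = "c0f7548839d2bbdbd6e6f650d13c5848"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for a file of several hundred MB.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical download location; adjust to wherever the file was saved.
    local_path = Path("TrainNLI.txt")
    if local_path.exists():
        print("checksum OK" if verify_download(local_path) else "checksum MISMATCH")
    else:
        print(f"{local_path} not found")
```

A mismatch usually indicates a truncated or corrupted download, in which case fetching the file again is the simplest remedy.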
Additional details
Related works
- Is cited by
- 10.5281/zenodo.8025053 (DOI)