Published March 15, 2019 | Version v1.0.0
Dataset
Open
Security Bug Conversations
- 1. Rochester Institute of Technology
- 2. Boston College
Description
This dataset will be released as part of the following publication.
- Benjamin S. Meyers, Nuthan Munaiah, Andrew Meneely, and Emily Prud'hommeaux. Pragmatic Characteristics of Security Conversation: An Exploratory Linguistic Analysis. Forthcoming. Proceedings of the 12th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE 2019). Montréal, QC, Canada.
Files:
security_bug_conversations.csv
The full dataset: over 2.1 million comments posted by developers discussing bugs in the Chromium project, together with the values we calculated for the five pragmatic features (described in Section 3 of the paper cited above).
CSV Fields:
- Organizational:
  - Bug ID: Unique identifier of a bug discussion in the Chromium project. The URL https://bugs.chromium.org/p/chromium/issues/detail?id=<Bug ID> may be used to access the bug online.
  - Comment ID: Unique identifier of a comment in a bug discussion
- Classification:
  - Is Security: Binary indicator of whether the comment belongs to a bug that is about security
- Natural Language:
  - Comment Text: The raw natural language text of the bug comment
- Linguistic Metrics:
  - Min. Formality: Minimum of the formality of sentences in the bug comment
  - Max. Formality: Maximum of the formality of sentences in the bug comment
  - Max. Informativeness: Maximum of the informativeness of sentences in the bug comment
  - Max. Implicature: Maximum of the implicature of sentences in the bug comment
  - Min. Politeness: Minimum of the politeness of sentences in the bug comment
  - Max. Politeness: Maximum of the politeness of sentences in the bug comment
  - Number of Tokens: Number of tokens in the bug comment
  - Number of Sentences: Number of sentences in the bug comment
  - Has Doxastic Uncertainty: Binary indicator of presence of a sentence with doxastic uncertainty in the bug comment
  - Has Epistemic Uncertainty: Binary indicator of presence of a sentence with epistemic uncertainty in the bug comment
  - Has Conditional Uncertainty: Binary indicator of presence of a sentence with conditional uncertainty in the bug comment
  - Has Investigational Uncertainty: Binary indicator of presence of a sentence with investigational uncertainty in the bug comment
  - Has Uncertainty: Binary indicator of presence of a sentence with any uncertainty in the bug comment
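
The fields above can be read row by row without loading the whole 1.2 GB file into memory. The following is a minimal sketch, not the authors' tooling. It assumes that the CSV header names match the field labels listed above (for example `Bug ID` and `Is Security`) and that binary indicators are stored as `1`/`0` or `true`/`false`; neither is confirmed by this page. The helper names `bug_url`, `is_truthy`, and `summarize` are hypothetical, and the three sample rows are invented purely for illustration.

```python
import csv
import io

# URL pattern taken from the Bug ID field description.
BUG_URL = "https://bugs.chromium.org/p/chromium/issues/detail?id={bug_id}"


def bug_url(bug_id):
    """Build the online URL for a Chromium bug (hypothetical helper)."""
    return BUG_URL.format(bug_id=bug_id)


def is_truthy(value):
    """Interpret a binary indicator; the accepted encodings are an assumption."""
    return str(value).strip().lower() in {"1", "true", "t", "yes"}


def summarize(rows, id_col="Bug ID", flag_col="Is Security"):
    """Count comments and security comments, streaming one row at a time.

    The column names are assumed to match the field labels in the list above.
    """
    total = 0
    security = 0
    security_bugs = set()
    for row in rows:
        total += 1
        if is_truthy(row[flag_col]):
            security += 1
            security_bugs.add(row[id_col])
    return total, security, sorted(security_bugs)


# Invented in-memory sample standing in for the real file.
sample = io.StringIO(
    "Bug ID,Comment ID,Is Security,Comment Text\n"
    "101,1,1,Possible use-after-free here\n"
    "101,2,1,I think this might be exploitable\n"
    "202,1,0,Button is misaligned\n"
)
total, security, bugs = summarize(csv.DictReader(sample))
print(total, security, [bug_url(b) for b in bugs])
```

To run this against the downloaded file, open it with `open(path, newline="", encoding="utf-8")` and pass `csv.DictReader(handle)` to `summarize`; the encoding is also an assumption. If the header names differ, pass the actual names through `id_col` and `flag_col`.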
Files
Name | Size | MD5 |
---|---|---|
security_bug_coversations.csv | 1.2 GB | bd04e9a1c4eeede6d75a44cba283f0c4 |
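
The MD5 value listed for the file can be used to check that a download completed intact. The following is a minimal sketch under the assumption that the listed checksum covers the CSV exactly as downloaded; the function names `md5_of_file` and `matches_published_checksum` are hypothetical. The file is hashed in chunks so that the 1.2 GB CSV never has to fit in memory.

```python
import hashlib

# Checksum as listed in the file table on this page.
EXPECTED_MD5 = "bd04e9a1c4eeede6d75a44cba283f0c4"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Compare a local file's MD5 digest against the published value."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_published_checksum(path)` with the location of the downloaded CSV returns `True` only if the digests agree. A mismatch usually points to a truncated or corrupted download, so fetching the file again is the first thing to try.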