Published December 10, 2024 | Version v1
Dataset Open

Data for Common Sense-Violating Bugs Empirical Study

Authors/Creators

  • 1. Beijing Institute of Technology

Description

Research Context

Paper Title: An Empirical Study on Common Sense-Violating Bugs in Mobile Apps

Authors: Fu Fan, Yanjie Jiang, Tianyi Chen, Hengshun Zhang, Yuxia Zhang, Nan Niu, Hui Liu

Publication: ACM Transactions on Software Engineering and Methodology

Data Description

We release the following data from our empirical study:

The automatically collected issue dataset

  • File: collected_issues.json

  • The dataset consists of 33,650 issue reports retrieved from open-source Android applications.

  • The JSON data is structured as a list of entries, where each entry includes the issue report's URL, title, body, and additional metadata.
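A minimal sketch of reading this file, assuming the top-level JSON value is a list of objects and that the title and body are stored under the keys `title` and `body`. These key names and the helper names `load_issues` and `find_issues` are assumptions for illustration; inspect one entry of the file to confirm the actual keys.

```python
import json
from typing import Any, Dict, List, Sequence


def load_issues(path: str) -> List[Dict[str, Any]]:
    """Load the issue list from a JSON file such as collected_issues.json."""
    with open(path, encoding="utf-8") as fh:
        issues = json.load(fh)
    if not isinstance(issues, list):
        raise ValueError("expected a top-level JSON list of issue entries")
    return issues


def find_issues(
    issues: List[Dict[str, Any]],
    keyword: str,
    fields: Sequence[str] = ("title", "body"),  # assumed key names
) -> List[Dict[str, Any]]:
    """Return entries whose text fields contain `keyword`, ignoring case."""
    needle = keyword.lower()
    return [
        issue
        for issue in issues
        # Missing or null fields are treated as empty strings.
        if any(needle in str(issue.get(f) or "").lower() for f in fields)
    ]
```

For example, `find_issues(load_issues("collected_issues.json"), "crash")` would return every entry that mentions "crash" in either field, provided the assumed keys match the file.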

The bug reports analyzed manually in our study

  • File: analyzed_issues.csv

  • This file contains the 5,342 bug reports that we manually analyzed.

  • The "label" column takes one of three values:

    • "Invalid" (excluded, not a valid bug report);

    • "Negative" (the bug does not violate common sense);

    • "Positive" (common sense-violating bug).

  • The first 130 issues were selected from the Andror2+ dataset using the same filtering strategy that we applied when constructing our dataset.
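The label distribution can be tallied with a short script. This is a sketch that assumes the CSV has a header row containing a column named "label", as described above; the helper names `count_labels` and `positive_rate` are hypothetical.

```python
import csv
from collections import Counter
from typing import Iterable


def count_labels(csv_lines: Iterable[str]) -> Counter:
    """Count rows per value of the "label" column.

    `csv_lines` is an open text file or any iterable of CSV lines
    whose first line is a header that includes "label".
    """
    counts: Counter = Counter()
    for row in csv.DictReader(csv_lines):
        counts[(row.get("label") or "").strip()] += 1
    return counts


def positive_rate(counts: Counter) -> float:
    """Share of "Positive" among valid reports (Positive + Negative)."""
    valid = counts["Positive"] + counts["Negative"]
    return counts["Positive"] / valid if valid else 0.0
```

To run it on the released file, open it with `open("analyzed_issues.csv", newline="", encoding="utf-8")` and pass the file object to `count_labels`; `newline=""` lets the `csv` module handle quoted fields that span several lines.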

The common sense principles violated by the bugs with taxonomy

  • File: common-sense-principles.json

  • 358 common sense principles derived from our study. Each principle is categorized by its level-1 and level-2 categories and includes links to the corresponding violating bugs.
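One way to explore the taxonomy is to count principles per level-1 and level-2 category. The sketch below assumes that the file parses to a list of objects and that each object stores its categories under the keys `level1` and `level2`. Both key names are hypothetical placeholders; pass the real ones after inspecting the file.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def count_by_category(
    principles: Iterable[Dict[str, Any]],
    level1_key: str = "level1",  # hypothetical key name
    level2_key: str = "level2",  # hypothetical key name
) -> Dict[str, Dict[str, int]]:
    """Return {level-1 category: {level-2 category: principle count}}."""
    tree: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for principle in principles:
        tree[principle[level1_key]][principle[level2_key]] += 1
    # Convert the nested defaultdicts to plain dicts for printing.
    return {l1: dict(l2_counts) for l1, l2_counts in tree.items()}
```

With the file loaded through `json.load`, `count_by_category(principles)` gives a quick overview of how the principles are spread across the two category levels.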

Application Error Message Classification Dataset

  • File: error_messages_dataset.csv

  • This dataset includes 202 text samples with labels, some of which are derived from real-world bug reports.

The issue reports were retrieved in November 2022 using the GitHub API. The remaining data were generated through manual analysis.

Last Update: 2024-12-10

Files

Files (59.2 MB)

Checksum                                Size
md5:950154cef142bf78ab77bfcc3cfde638    360.3 kB
md5:b533424f2dba9dc58d6711d0a0f8e878    58.6 MB
md5:07eb67bdfbbe822d7897ba79cf2a5ac5    177.8 kB
md5:169e6690cfd3a581f3fff247cdf16289    19.9 kB
md5:a978a750f851c27da16ba81c1fd92065    1.9 kB
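
A downloaded file can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and because the listing does not state which checksum belongs to which file, the file and checksum have to be paired by the reader.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat even for large files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a checksum given as "md5:<hex>" or "<hex>"."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("collected_issues.json", "md5:<hex>")` returns `True` only if the local copy hashes to the given value.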