Dataset Open Access

Rediscovery Datasets: Connecting Duplicate Reports of Apache, Eclipse, and KDE

Sadat, Mefta; Bener, Ayse Basar; Miranskyy, Andriy V.

We present three defect rediscovery datasets mined from Bugzilla. The datasets cover three groups of open source software projects: Apache, Eclipse, and KDE. They contain information about approximately 914 thousand defect reports filed over a period of 18 years (1999-2017) and capture the inter-relationships among duplicate defects.

File Descriptions

  • apache.csv - Apache Defect Rediscovery dataset
  • eclipse.csv - Eclipse Defect Rediscovery dataset
  • kde.csv - KDE Defect Rediscovery dataset

  • apache.relations.csv - Inter-relations of rediscovered defects of Apache
  • eclipse.relations.csv - Inter-relations of rediscovered defects of Eclipse
  • kde.relations.csv - Inter-relations of rediscovered defects of KDE

  • create_and_populate_neo4j_objects.cypher - Populates a Neo4j graph database by importing all the data from the CSV files. Note: to load the CSV files, the dbms.import.csv.legacy_quote_escaping configuration setting must be set to false; see https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.import.csv.legacy_quote_escaping
  • create_and_populate_mysql_objects.sql - Populates a MySQL database by importing all the data from the CSV files
  • rediscovery_db_mysql_dump.sql.zip - For your convenience, we also provide a full backup of the MySQL database

  • neo4j_examples.txt - Sample Neo4j queries
  • mysql_examples.txt - Sample MySQL queries
  • rediscovery_eclipse_6325.png - Output of Neo4j example #1

  • distinct_attrs.csv - Distinct values of bug_status, resolution, priority, severity for each project
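
Before importing the CSV files into a database, it can help to look at their header and first few rows. The following is a minimal sketch rather than part of the dataset: the helper name peek is hypothetical, and it assumes the files are comma-delimited, UTF-8 encoded, and begin with a header row. None of these is stated in this description, and the column names are not listed here, so the sketch makes no assumption about them. Python's csv module treats a backslash as an ordinary character by default, which appears consistent with the quote-escaping note for Neo4j above.

```python
import csv


def peek(path, n_rows=3):
    """Return (header, first n_rows data rows) of a CSV file.

    Assumes a comma-delimited, UTF-8 file whose first row is a header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)  # default dialect: quotes escaped by doubling
        header = next(reader, [])
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


if __name__ == "__main__":
    # File name taken from the list above; adjust the path to your download location.
    header, rows = peek("apache.csv")
    print(header)
    for row in rows:
        print(row)
```

The same call works for the *.relations.csv files and for distinct_attrs.csv, provided the assumptions above hold for them.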

Files (180.5 MB)

Name                                      Size       MD5
apache.csv                                6.6 MB     78103a94cc781d8fac2a50b248aada90
apache.relations.csv                      42.0 kB    4bfa0caeabbc218bd363a9f7f0ae6a39
create_and_populate_mysql_objects.sql     3.5 kB     4dd326605ce046fe457ab9c52eeb4aba
create_and_populate_neo4j_objects.cypher  2.6 kB     2648fd5b4130e771253966d21a853bc1
distinct_attrs.csv                        959 Bytes  96f549466cfed59c9bfd434146081976
eclipse.csv                               77.3 MB    e9d105f93fc450e2d6f865f64d9da0ce
eclipse.relations.csv                     698.4 kB   66beef2b156ad622bbb39900c0899f5b
kde.csv                                   55.8 MB    811fcb9a018683b4d7d148e90be2ed2a
kde.relations.csv                         1.1 MB     3d80c8e6242a8633e4c89d66ce2b7028
mysql_examples.txt                        6.8 kB     9c026c9a79ca3bc643a3528fb95762cd
neo4j_examples.txt                        9.4 kB     33052e62fc595249b2f2de06bd6a360b
README.txt                                1.4 kB     8093c55d75748f787118518abd91bae3
rediscovery_db_mysql_dump.sql.zip         39.0 MB    7fc3bd8cceee140be05b8e93bcda0a7d
rediscovery_eclipse_6325.png              56.7 kB    fecb7579e4bad010cc1c6b7302b8749a
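
The MD5 checksums in the file listing can be used to confirm that a download completed without corruption. The sketch below is one possible way to do this and is not shipped with the dataset: the names EXPECTED_MD5, md5_of, and verify are hypothetical, and the code assumes the files were saved under their original names in a single directory. The checksum values are copied from the listing.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing on this page.
EXPECTED_MD5 = {
    "apache.csv": "78103a94cc781d8fac2a50b248aada90",
    "apache.relations.csv": "4bfa0caeabbc218bd363a9f7f0ae6a39",
    "create_and_populate_mysql_objects.sql": "4dd326605ce046fe457ab9c52eeb4aba",
    "create_and_populate_neo4j_objects.cypher": "2648fd5b4130e771253966d21a853bc1",
    "distinct_attrs.csv": "96f549466cfed59c9bfd434146081976",
    "eclipse.csv": "e9d105f93fc450e2d6f865f64d9da0ce",
    "eclipse.relations.csv": "66beef2b156ad622bbb39900c0899f5b",
    "kde.csv": "811fcb9a018683b4d7d148e90be2ed2a",
    "kde.relations.csv": "3d80c8e6242a8633e4c89d66ce2b7028",
    "mysql_examples.txt": "9c026c9a79ca3bc643a3528fb95762cd",
    "neo4j_examples.txt": "33052e62fc595249b2f2de06bd6a360b",
    "README.txt": "8093c55d75748f787118518abd91bae3",
    "rediscovery_db_mysql_dump.sql.zip": "7fc3bd8cceee140be05b8e93bcda0a7d",
    "rediscovery_eclipse_6325.png": "fecb7579e4bad010cc1c6b7302b8749a",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each listed file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.exists():
            results[name] = "missing"
        elif md5_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    for name, status in verify().items():
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use low for the larger files such as eclipse.csv (77.3 MB).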