Published June 22, 2021 | Version v1
Dataset | Open

A Static-Based Approach to Detect SQL Semantic Bugs Dataset

  • 1. Delft University of Technology

Description

This is the dataset used in our study, "A Static-Based Approach to Detect SQL Semantic Bugs".

This dataset contains more than 172,000 queries extracted from StackOverflow posts. It was built for analysing the prevalence of semantic bugs in SQL queries.

For more information about our study and tools, see our GitHub repository: https://github.com/SERG-Delft/sql-bug-finder

Description of included files:

  • sql_db.png: database ER diagram
  • homedb_queries.sql: contains queries extracted from StackOverflow posts
  • homedb_questions.sql: contains SQL-related question posts extracted from StackOverflow
  • homedb_answers.sql: contains the answers to SQL-related question posts extracted from StackOverflow
  • homedb_bugs.sql: contains queries with semantic bugs extracted from StackOverflow posts
  • homedb_owners.sql: contains data related to the owners (users) of SQL-related StackOverflow posts
  • homedb_pages.sql: artifact of the bookkeeping script; tracks the StackOverflow pages from which SQL queries were extracted (SQL-tagged pages, ordered by votes in descending order)

Files

Files (148.9 MB)

  • md5:6bfc9a9ffbf373b59f4026fac3282613 (26.9 MB)
  • md5:dfa90a14eda74d79f5d4aab81c5122b2 (2.8 MB)
  • md5:4192659be9321894681abfe8136c95b2 (13.1 MB)
  • md5:3ba2663ef4043c92e61d77cb59841cd4 (22.8 kB)
  • md5:d82a307e8b89a2e91e6a50822e3a308f (78.2 MB)
  • md5:27ab6ed9565d0b867d0f29dce5cedbcc (27.9 MB)
  • md5:466ca3d40322da5448c985d2275695da (40.5 kB)
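
The MD5 checksums listed above can be used to check that a download arrived intact. The following is a minimal sketch, not part of the dataset or its tooling: the helper names md5_of and check_download are hypothetical, and it assumes the files have been saved to a local directory of your choosing. Because this listing does not pair checksums with file names, the sketch only tests whether each file's digest appears among the published values.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing on this page.
# The listing does not pair them with file names, so they are kept as a set.
PUBLISHED_MD5 = {
    "6bfc9a9ffbf373b59f4026fac3282613",
    "dfa90a14eda74d79f5d4aab81c5122b2",
    "4192659be9321894681abfe8136c95b2",
    "3ba2663ef4043c92e61d77cb59841cd4",
    "d82a307e8b89a2e91e6a50822e3a308f",
    "27ab6ed9565d0b867d0f29dce5cedbcc",
    "466ca3d40322da5448c985d2275695da",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_download(directory):
    """Map each regular file in `directory` to True if its MD5 is a published one."""
    return {
        p.name: md5_of(p) in PUBLISHED_MD5
        for p in sorted(Path(directory).iterdir())
        if p.is_file()
    }
```

Calling check_download on the download directory returns a dictionary such as {"homedb_bugs.sql": True, ...}; a False entry suggests a corrupted or incomplete download, or a file that is not part of this record.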