gunzip *.gz
. On macOS, please use the built-in "Archive Utility" instead (see this issue description).
Execute 1_create_database.sql in your database client (tested on MySQL 5.7) to create the database and tables for the SO dump.
Edit 2_create_sotorrent_user.sql to choose a password for the sotorrent user, then execute the script to create the user.
Execute 3_load_so_from_xml.sql to import the SO dump from the XML files (please use the XML files provided by us; they are processed to be compatible with MySQL).
Execute 4_create_indices.sql to create the indices for the SO tables.
Execute 5_create_sotorrent_tables.sql to add the SOTorrent tables to the SO database.
Execute 6_load_sotorrent.sql to import the SOTorrent tables from the CSV files.
Execute 7_load_postreferencegh.sql to import the references from GitHub projects to Stack Overflow questions, answers, or comments.
Execute 8_load_ghmatches.sql to import the matched source code lines with Stack Overflow references from GitHub projects.
Execute 9_create_sotorrent_indices.sql to create the indices for the SOTorrent tables.

The Stack Overflow data has been extracted from the official Stack Exchange data dump released 2018-12-02.
The GitHub references have been retrieved from the Google BigQuery GitHub data set on 2018-12-09.
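
The numbered SQL scripts above are meant to be run in order, so the sequence can be wrapped in a small shell helper. The following is only a sketch under several assumptions that do not come from this README: it uses the mysql command-line client (the README only says "your database client"), it expects the SQL scripts in the current directory, and the names run_scripts, DRY_RUN, and MYSQL_ARGS as well as the default "-u root -p" credentials are made up for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: run the numbered import scripts in order.
# Assumptions (not from the README): mysql CLI client, scripts in the
# current directory, credentials passed via the hypothetical MYSQL_ARGS.
set -eu

# Script names as listed above, in execution order.
# Note: 2_create_sotorrent_user.sql must be edited first to set a password.
SCRIPTS="
1_create_database.sql
2_create_sotorrent_user.sql
3_load_so_from_xml.sql
4_create_indices.sql
5_create_sotorrent_tables.sql
6_load_sotorrent.sql
7_load_postreferencegh.sql
8_load_ghmatches.sql
9_create_sotorrent_indices.sql
"

# DRY_RUN=1 (the default) prints each command instead of executing it.
run_scripts() {
  for script in $SCRIPTS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "mysql ${MYSQL_ARGS:--u root -p} < $script"
    else
      if [ ! -f "$script" ]; then
        echo "missing: $script" >&2
        return 1
      fi
      # MYSQL_ARGS is intentionally unquoted so it splits into options.
      mysql ${MYSQL_ARGS:--u root -p} < "$script"
    fi
  done
}
```

After sourcing the file, calling run_scripts lists the nine commands, and DRY_RUN=0 run_scripts executes them, stopping at the first script that is missing or fails. With "-p" the client prompts for the password once per script; depending on how the load scripts reference the XML and CSV files, further client options may be needed.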