There is a newer version of the record available.

Published September 18, 2022 | Version 0.0.1
Dataset Open

GitTables dataset for SemTab 2022

  • 1. University of Amsterdam

Description

Note: this record contains only a subset of tables; the entire GitTables corpus is published separately. Visit https://gittables.github.io for more background and contact details.

This dataset represents a subset of tables from GitTables curated for benchmarking column type detection methods in round 3 of SemTab 2022.

This dataset consists of the following files:

  • “GitTables_SemTab_2022_dataset.zip”: tables from GitTables used in Round 3 of SemTab 2022. Filenames correspond to table IDs. Column names are replaced with "col_0", "col_1", etc., which match the targets and labels (semantic types) provided on the main website.
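As a rough illustration of the layout described above, the following sketch walks the archive and yields each table together with its ID. It rests on assumptions the record does not state: that the tables are stored as UTF-8 CSV files and that the first row of each file holds the "col_0", "col_1", ... names. The function name `iter_tables` is hypothetical.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def iter_tables(zip_path):
    """Yield (table_id, header, rows) for every CSV member of the archive.

    Assumption: tables are UTF-8 CSV files whose first row is the header
    ("col_0", "col_1", ...). The table ID is taken from the file name
    without its extension, following the record's description.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Skip directories and anything that is not a CSV file.
            if not member.lower().endswith(".csv"):
                continue
            table_id = PurePosixPath(member).stem
            with archive.open(member) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text)
                header = next(reader, [])
                rows = list(reader)
            yield table_id, header, rows
```

Calling `iter_tables` on the downloaded archive and looking up each `(table_id, "col_N")` pair in the targets file is one way to line the tables up with their labels. If the archive uses a different file format, only the reader inside the loop would need to change.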

Files

Files (9.9 MB)

  • GitTables_SemTab_2022_tables.zip (9.9 MB), md5:75603ae4404a6ec2ba62c075ec1f2dd5
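
The md5 value listed for the archive can be used to check that a download is intact. A minimal sketch using only the Python standard library follows; the helper names `md5_of_file` and `matches_published_checksum` are hypothetical, and the local path of the downloaded file is up to the reader.

```python
import hashlib

# Checksum as listed in this record's file table.
EXPECTED_MD5 = "75603ae4404a6ec2ba62c075ec1f2dd5"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

A `False` result from `matches_published_checksum` suggests a truncated or corrupted download, or a file from a different version of the record.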