Published September 18, 2022 | Version 0.0.1
Dataset
Open
GitTables dataset for SemTab 2022
Description
Note: this record holds only a subset; the entire GitTables corpus is published separately. Visit https://gittables.github.io for more background and contact details.
This dataset is a subset of tables from GitTables, curated for benchmarking column type detection methods in Round 3 of SemTab 2022.
This dataset consists of the following files:
- "GitTables_SemTab_2022_tables.zip": tables from GitTables used in Round 3 of SemTab 2022. Filenames correspond to table IDs, and column names are replaced with "col_0", "col_1", etc., which match the targets and labels (semantic types) provided on the main website.
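The naming scheme described above (filename as table ID, anonymized "col_N" headers) can be illustrated with a minimal Python sketch. It assumes the archive members are UTF-8 CSV files with a header row, which this page does not state; the helper names `list_table_columns` and `build_demo_archive` and the demo table are hypothetical and exist only for illustration.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def list_table_columns(archive, suffix=".csv"):
    """Map each table ID (the filename stem) to its header row.

    Assumption: tables are stored as UTF-8 CSV files with a header row.
    """
    columns = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.endswith(suffix):
                continue  # skip directory entries and non-table members
            table_id = PurePosixPath(name).stem
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                columns[table_id] = next(csv.reader(text), [])
    return columns


def build_demo_archive():
    """Build a tiny in-memory zip that mimics the described layout (hypothetical data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("tables/demo_table_1.csv", "col_0,col_1\n1,Amsterdam\n2,Utrecht\n")
    buffer.seek(0)
    return buffer


demo_columns = list_table_columns(build_demo_archive())
print(demo_columns)  # {'demo_table_1': ['col_0', 'col_1']}
```

To inspect the real archive, pass the path of the downloaded zip to `list_table_columns` instead of the demo buffer; the resulting table IDs and "col_N" names are the keys needed to join against the targets and labels.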
Files
Name | Size | Checksum |
---|---|---|
GitTables_SemTab_2022_tables.zip | 9.9 MB | md5:75603ae4404a6ec2ba62c075ec1f2dd5 |
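A download can be checked against the MD5 value listed for the archive. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the path passed in is assumed to be wherever the archive was saved locally.

```python
import hashlib

# MD5 value copied from the file listing for the archive.
EXPECTED_MD5 = "75603ae4404a6ec2ba62c075ec1f2dd5"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected=EXPECTED_MD5):
    """True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected
```

For example, `matches_listing("GitTables_SemTab_2022_tables.zip")` should return `True` for an intact download saved under that name in the working directory.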