Dataset Open Access
Madelon Hulsebos; Çağatay Demiralp; Paul Groth
Note: the download page of the entire GitTables corpus is here: https://zenodo.org/record/4943312.
This dataset represents a small subset of tables from GitTables curated for benchmarking column type detection methods. This benchmark evaluates systems that match table columns to semantic types from the DBpedia and Schema.org ontologies. It is featured in the SemTab 2021 challenge (CTA task).
This dataset consists of the following files:
The labels (semantic types) from each ontology come from:
For the entire GitTables corpus, please refer to https://zenodo.org/record/4943312. Visit https://gittables.github.io for more background and contact details.
Name | MD5 | Size
---|---|---
dbpedia_gt.csv | 47914a0569d1bfa7dd2391f6172f2267 | 176.7 kB
dbpedia_labels.csv | 40f5ee3776222a9dfb21cf8f29838862 | 5.7 kB
dbpedia_targets.csv | e8ad9a2ae93b92ccb4ad516c2d19dda5 | 75.2 kB
schema_gt.csv | 9baf387427a8f543efc71288fcefe170 | 36.1 kB
schema_labels.csv | 86503c8c6ee1e5d513a9d2c51f9dbb3d | 1.5 kB
schema_targets.csv | 9c0c6de86c97632536230180dda6e425 | 20.1 kB
tables.zip | c61792bc7b77dcf29c165d505a02dea1 | 3.3 MB
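
The MD5 checksums listed above can be used to confirm that each file downloaded intact. The following is a minimal Python sketch, not part of the dataset itself. The function names `md5_of` and `verify_downloads` and the directory name `gittables_benchmark` are assumptions made for illustration. Only the file names and checksums are taken from the listing.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "dbpedia_gt.csv": "47914a0569d1bfa7dd2391f6172f2267",
    "dbpedia_labels.csv": "40f5ee3776222a9dfb21cf8f29838862",
    "dbpedia_targets.csv": "e8ad9a2ae93b92ccb4ad516c2d19dda5",
    "schema_gt.csv": "9baf387427a8f543efc71288fcefe170",
    "schema_labels.csv": "86503c8c6ee1e5d513a9d2c51f9dbb3d",
    "schema_targets.csv": "9c0c6de86c97632536230180dda6e425",
    "tables.zip": "c61792bc7b77dcf29c165d505a02dea1",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == checksum:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # "gittables_benchmark" is a hypothetical local download directory.
    for name, status in verify_downloads("gittables_benchmark").items():
        print(f"{name}: {status}")
```

A file reported as `mismatch` was most likely truncated or corrupted in transit and should be downloaded again.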