Published January 24, 2020 | Version 1.0
Dataset
Open
Resources for reproducing experiments in "Novel Entity Discovery from Web Tables"
- 1. Bloomberg
- 2. University of Stavanger
Description
This repository contains resources developed for the paper: "S. Zhang, E. Meij, K. Balog, and R. Reinanda. Novel Entity Discovery from Web Tables. In: Proceedings of The Web Conference 2020 (WWW ’20), April 2020".
It includes the three test collections for novel entity discovery from Web tables and for entity type and mention resolution, as well as the mention-entity and heading-property correspondences for 3M tables. The cited datasets were used in this work.
Files to recreate the entity linking experiments:
- training_el.csv
- training_el_type.csv
- training_el_type_wiki.csv
- training_el_wiki.csv
- training_schema.csv
Files to recreate the table matching experiments:
- me_corres.csv - textual cells algorithmically linked to Wikipedia entities
- hp_corres.csv - the same, but for table headings only
Files to recreate the entity resolution experiments:
- ec_golden.csv - 20K unlinked mentions (textual cells), manually linked to Wikipedia
- er_sf_golden.csv - 1K cell values, manually clustered
- er_type_golden.csv - 1K cell values, manually linked to DBpedia types
Files
Name | Size |
---|---|
www2020-webtables-v1.0.zip (md5:f389e09f86080d83f76ff24d777d9b7f) | 466.1 MB |
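Because the record publishes an MD5 checksum for the archive, a download can be checked before the experiments are rerun. The following is a minimal sketch, not part of the original resources: the function names `md5_of_file` and `verify_archive` are hypothetical, and it assumes the archive has been saved locally under its published name.

```python
import hashlib

# Checksum as listed on this record for www2020-webtables-v1.0.zip.
EXPECTED_MD5 = "f389e09f86080d83f76ff24d777d9b7f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so the 466.1 MB archive does not
    have to be held in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Hypothetical helper: True if the file's MD5 matches `expected`."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_archive("www2020-webtables-v1.0.zip")` on a local copy should return `True` for an intact download; a `False` result suggests the file is incomplete or corrupted and should be fetched again.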