Published February 22, 2018 | Version v1
Dataset
Open
Luminoso Input Data for SemEval-2018 Task 10: "Capturing Discriminative Attributes"
Description
This is the data required to run Luminoso's entry to the SemEval-2018 task on Capturing Discriminative Attributes.
This data includes:
- A recently computed version of the ConceptNet Numberbatch word embeddings
- The output of an implementation of Semantic Matching Energy over ConceptNet
- A SQLite database containing the lead sections of all English Wikipedia articles as of 2017-12-20
- The text file from which that database is constructed
- A SQLite database of words that co-occur in Google Books 2-grams
- The text file containing total counts of 2-grams in the Google Books data, which that database is constructed from
For more information, see the paper "Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge", by Robyn Speer and Joanna Lowry-Duda, to appear in the proceedings of the SemEval workshop at NAACL 2018.
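Two of the items listed above are SQLite databases, and this page does not document their schemas. As a minimal sketch, the tables and columns of such a file can be inspected with Python's standard `sqlite3` module. The function name `list_tables` is hypothetical and not part of Luminoso's code, and the database file names inside the archive are not stated here, so the path in the usage comment is only a placeholder.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: [column_name, ...]} for a SQLite database file.

    The file is opened read-only, so a mistyped path raises an error
    instead of silently creating an empty database.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for name in names:
            quoted = '"' + name.replace('"', '""') + '"'
            # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[name] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()


# Usage (placeholder path; substitute a database extracted from the archive):
# print(list_tables("path/to/extracted/database.db"))
```

Inspecting the schema first avoids guessing at table or column names before writing queries against the Wikipedia or Google Books data.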
Files
Name | Size |
---|---|
luminoso-semeval2018-task10-data.zip (md5:ac4dc9c7df25068f47fe8e97cf6f4a37) | 8.8 GB |
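Because the archive is 8.8 GB, it is worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch using Python's standard `hashlib`. The function names `md5_of_file` and `verify_download` are illustrative, and the usage comment assumes the zip file was saved under its listed name in the current directory.

```python
import hashlib

# Checksum as listed for luminoso-semeval2018-task10-data.zip on this page.
EXPECTED_MD5 = "ac4dc9c7df25068f47fe8e97cf6f4a37"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use constant, which matters for a
    multi-gigabyte archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Usage (assumes the archive is in the current directory):
# print(verify_download("luminoso-semeval2018-task10-data.zip"))
```

A mismatch usually indicates an incomplete or corrupted download. MD5 is adequate for detecting that kind of accidental damage, though not for guarding against deliberate tampering.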