Dataset Open Access

Luminoso Input Data for SemEval-2018 Task 10: "Capturing Discriminative Attributes"

Speer, Robyn; Lowry-Duda, Joanna

This is the data required to run Luminoso's entry to the SemEval-2018 task on Capturing Discriminative Attributes.

This data includes:

  • A recently computed version of the ConceptNet Numberbatch word embeddings
  • The output of an implementation of Semantic Matching Energy over ConceptNet
  • A SQLite database containing the lead section of all articles on the English Wikipedia on 2017-12-20
  • The text file from which that database is constructed
  • A SQLite database of words that co-occur in Google Books 2-grams
  • The text file of total 2-gram counts in the Google Books data, from which that database is constructed
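
Two of the items above are SQLite databases, and this page does not document their table layout. As a minimal sketch, assuming only that the files are ordinary SQLite databases, the helper below lists the table names so the schema can be inspected before any query is written. The function name `list_tables` and the path in the usage note are hypothetical and are not taken from the archive.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the sorted names of all tables in the SQLite file at db_path.

    Hypothetical helper: the schema of the databases in this dataset is not
    described on this page, so this only reads SQLite's own catalog.
    """
    # closing() ensures the connection is actually closed when we are done.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [row[0] for row in rows]
```

Calling `list_tables("path/to/database.db")` with the real location of one of the extracted databases (the path here is a placeholder) returns its table names as a list of strings.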

For more information, see the paper "Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge", by Robyn Speer and Joanna Lowry-Duda, to appear in the proceedings of the SemEval workshop at NAACL 2018.

Files (8.8 GB)
Name: luminoso-semeval2018-task10-data.zip
Size: 8.8 GB
md5: ac4dc9c7df25068f47fe8e97cf6f4a37
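
Because the archive is large, it may be worth checking the download against the md5 checksum listed above before unpacking it. The following is a minimal sketch using Python's standard `hashlib`; the function name `md5_of_file` and the chunk size are illustrative choices, and the file is assumed to have been saved under its listed name.

```python
import hashlib

# Checksum as listed in the file entry on this page.
EXPECTED_MD5 = "ac4dc9c7df25068f47fe8e97cf6f4a37"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of the file at path, read in chunks.

    Reading in 1 MiB chunks (an arbitrary choice) keeps memory use flat
    even for a multi-gigabyte archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If `md5_of_file("luminoso-semeval2018-task10-data.zip")` does not equal `EXPECTED_MD5`, the download is likely incomplete or corrupted and should be repeated.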

Cite as