Dataset Open Access

Luminoso Input Data for SemEval-2018 Task 10: "Capturing Discriminative Attributes"

Speer, Robyn; Lowry-Duda, Joanna

This is the data required to run Luminoso's entry to the SemEval-2018 task on Capturing Discriminative Attributes.

This data includes:

  • A recently computed version of the ConceptNet Numberbatch word embeddings
  • The output of an implementation of Semantic Matching Energy over ConceptNet
  • A SQLite database containing the lead section of all articles on the English Wikipedia on 2017-12-20
  • The text file from which that database is constructed
  • A SQLite database of words that co-occur in Google Books 2-grams
  • The text file containing total counts of 2-grams in the Google Books data, from which that database is constructed
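
The record does not describe the table layout of the two SQLite databases, so a reasonable first step after downloading is to inspect them. The following is a minimal sketch using only Python's standard sqlite3 module; the function name list_tables is a hypothetical helper, not part of Luminoso's code, and no file or table names are assumed.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: [column names]} for every table in a SQLite file.

    Hypothetical helper for exploring a database whose schema is unknown.
    """
    # Open read-only via a URI so a mistyped path fails instead of
    # silently creating a new, empty database file.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for name in names:
            # PRAGMA table_info returns one row per column;
            # index 1 of each row is the column name.
            cols = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
            schema[name] = [col[1] for col in cols]
        return schema
    finally:
        conn.close()
```

Calling list_tables on the path of either downloaded database should return the table and column names, which can then guide any further queries. The path passed in is whatever name the file has locally; none is assumed here.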

For more information, see the paper "Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge", by Robyn Speer and Joanna Lowry-Duda, to appear in the proceedings of the SemEval workshop at NAACL 2018.

Files (8.8 GB)