Published February 22, 2018 | Version v1
Dataset Open

Luminoso Input Data for SemEval-2018 Task 10: "Capturing Discriminative Attributes"

  • Luminoso

Description

This is the data required to run Luminoso's entry to the SemEval-2018 task on Capturing Discriminative Attributes.

This data includes:

  • A recently computed version of the ConceptNet Numberbatch word embeddings
  • The output of an implementation of Semantic Matching Energy over ConceptNet
  • A SQLite database containing the lead section of every article on the English Wikipedia as of 2017-12-20
  • The text file from which that database is constructed
  • A SQLite database of words that co-occur in Google Books 2-grams
  • The text file of total 2-gram counts in the Google Books data, from which that database is constructed
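The table and column layout of the two SQLite databases is not described here, so a reasonable first step after unpacking is to inspect them. The following is a minimal sketch, not part of Luminoso's code: the function name `list_tables` is illustrative, and the path you pass in must be replaced with wherever the database file lands after extraction. It opens the file read-only and reports each table with its column names.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: [column names]} for a SQLite database file.

    The database is opened read-only so that inspecting it cannot
    modify or accidentally create the file.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for name in names:
            # PRAGMA table_info yields one row per column; index 1 is its name.
            columns = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
            schema[name] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()
```

Calling `list_tables` on either database file gives the names needed to write queries against it, without assuming anything about the schema in advance.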

For more information, see the paper "Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge", by Robyn Speer and Joanna Lowry-Duda, to appear in the proceedings of the SemEval workshop at NAACL 2018.

Files

luminoso-semeval2018-task10-data.zip

  • Size: 8.8 GB
  • md5: ac4dc9c7df25068f47fe8e97cf6f4a37
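
Because the archive is large, it is worth checking the download against the md5 checksum listed above before unpacking it. The sketch below uses only the Python standard library. The helper names `md5_of_file` and `verify_archive` are illustrative, and it assumes the archive was saved under its original filename in the current directory.

```python
import hashlib

# Checksum published alongside the archive on this page.
EXPECTED_MD5 = "ac4dc9c7df25068f47fe8e97cf6f4a37"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that an 8.8 GB archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive sits in the current working directory.
    archive = "luminoso-semeval2018-task10-data.zip"
    print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
```

A mismatch most likely means the download was truncated or corrupted and should be repeated.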