Published April 3, 2020 | Version 5.8
Dataset Open

ConceptNet 5.x Raw Data

  • 1. Luminoso
  • 2. MIT
  • 3. Kyoto University

Description

This archive contains the raw data that ConceptNet 5 is built from. More information about ConceptNet is available at http://conceptnet.io.

If you use ConceptNet as part of another work, you must attribute ConceptNet and you must not restrict its license terms. For more license information, see https://creativecommons.org/licenses/by-sa/4.0/

ConceptNet has been developed by:

* The MIT Media Lab, through various groups at different times:

  - Commonsense Computing
  - Software Agents
  - Digital Intuition

* The Commonsense Computing Initiative, a worldwide collaboration with
  contributions from:

  - National Taiwan University
  - Universidade Federal de São Carlos
  - Hokkaido University
  - Tilburg University
  - Nihon Unisys Labs
  - Dentsu Inc.
  - Kyoto University
  - Yahoo Research Japan

* Luminoso Technologies, Inc.

Significant amounts of data were imported from:

* WordNet, a project of Princeton University
* Wikipedia and Wiktionary, collaborative projects of the Wikimedia Foundation
* Luis von Ahn's "Games with a Purpose"
* DBpedia
* OpenCyc
* JMdict, by Jim Breen

ConceptNet takes input from these sources of pre-computed distributional word embeddings:

- GloVe: Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. https://nlp.stanford.edu/projects/glove/

- word2vec: Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. In Computing Research Repository. http://dblp.org/rec/bib/journals/corr/abs-1301-3781

- fastText: Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching Word Vectors with Subword Information. http://fasttext.cc
 

Here is a short, incomplete list of people who have made significant
contributions to the development of ConceptNet as a data resource, roughly in
order of appearance:

* Push Singh
* Catherine Havasi
* Hugo Liu
* Hyemin Chung
* Robyn Speer
* Ken Arnold
* Yen-Ling Kuo
* Naoki Otani
 

Files

conceptnet-raw-data-5.7.zip

Files (47.1 GB)

Size     Checksum
24.6 GB  md5:4ee99c44d916a49effddafbf12a72593
22.5 GB  md5:97c03f9ed09feb688e658f54b373527e
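
Because the archives are tens of gigabytes, it can be worth checking a finished download against the MD5 values listed above. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_md5` are illustrative assumptions, not part of ConceptNet, and the listing does not show which file name belongs to which checksum.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's MD5 with a listed value.

    `listed` may be written as 'md5:<hex>' (as in the file listing)
    or as the bare hex string.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("downloaded-archive.zip", "md5:4ee99c44d916a49effddafbf12a72593")` returns `True` only if the local file hashes to that value; the path here is a placeholder for wherever the archive was saved.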