Don't count, predict! Semantic vectors

Baroni, Marco; Dinu, Georgiana; Kruszewski, Germán

doi:10.5281/zenodo.2635544

Published June 1, 2014 | Version v1

Dataset Open

1. University of Trento

Semantic vectors associated with the paper "Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors".

Abstract: Context-predicting models (more commonly known as embeddings or neural language models) are the new kids on the distributional semantics block. Despite the buzz surrounding these models, the literature is still lacking a systematic comparison of the predictive models with classic, count-vector-based distributional semantic approaches. In this paper, we perform such an extensive evaluation, on a wide range of lexical semantics tasks and across many parameter settings. The results, to our own surprise, show that the buzz is fully justified, as the context-predicting models obtain a thorough and resounding victory against their count-based counterparts.

Files (2.8 GB)

Name	MD5	Size
additional.tar.gz	18225022479202eedcaa83175a85b69d	6.3 kB
EN-wform.w.2.ppmi.svd.500.txt.gzaa	a3152a86b1cf462241a943e26582a849	500.0 MB
EN-wform.w.2.ppmi.svd.500.txt.gzab	05ac800eb85f5017cdaefc2597552e64	500.0 MB
EN-wform.w.2.ppmi.svd.500.txt.gzac	f1c2bd158db8755510edfd0ef90b1861	428.6 MB
EN-wform.w.2.ppmi.txt.gzaa	213e19cf9c6a5f266650b2ffe0fca0c5	500.0 MB
EN-wform.w.2.ppmi.txt.gzab	66b31c5f2032a96bfefd63d32fe2e6ee	331.7 MB
EN-wform.w.5.cbow.neg10.400.subsmpl.txt.gz	91523ff5c4d31c8e24b0f4c79a541800	562.9 MB
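
Several files in the listing end in aa, ab, ac, which suggests they are pieces of one larger gzip archive. The record does not say how to reassemble them. The sketch below assumes the pieces are plain byte-wise splits (as produced by a tool like split) that must be concatenated in alphabetical order. It checks each piece against its MD5 checksum before joining. The helper names md5_of and join_parts and the output file name are hypothetical and not part of the dataset.

```python
import hashlib
import shutil
from pathlib import Path

# MD5 checksums copied from the file listing for one of the split archives.
EXPECTED_MD5 = {
    "EN-wform.w.2.ppmi.txt.gzaa": "213e19cf9c6a5f266650b2ffe0fca0c5",
    "EN-wform.w.2.ppmi.txt.gzab": "66b31c5f2032a96bfefd63d32fe2e6ee",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def join_parts(parts, target, expected=None):
    """Verify each part (when a checksum is known), then concatenate them.

    Assumption: the parts are byte-wise splits of a single file and are
    passed in the order in which they should be joined.
    """
    expected = expected or {}
    parts = [Path(p) for p in parts]

    # Verify everything first so a bad download never produces a partial file.
    for part in parts:
        want = expected.get(part.name)
        if want is not None and md5_of(part) != want:
            raise ValueError(f"checksum mismatch for {part.name}")

    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(target)
```

Assuming the two pieces have been downloaded to the working directory, a call such as join_parts(sorted(EXPECTED_MD5), "EN-wform.w.2.ppmi.txt.gz", EXPECTED_MD5) would write the reassembled archive, which can then be opened with Python's gzip module.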

Additional details

European Commission
COMPOSES - Compositional Operations in Semantic Space 283554

	All versions	This version
Views	2,402	2,390
Downloads	1,751	1,743
Data volume	1.0 TB	1.0 TB
