Published November 4, 2019 | Version v1
Dataset Open

Monthly word embeddings for Twitter random sample (English, 2012-2018)

Description

This dataset contains monthly word embeddings created from the tweets available via the statuses/sample endpoint of the Twitter Streaming API from 2012 to 2018. Full details of how the dataset was created are given in the paper "Room to Glo: A Systematic Comparison of Semantic Change Detection Approaches with Word Embeddings".

The md5sum of the gzipped tarball is a76888ffec8cc7aebba09d365ca55ace.
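As a minimal sketch of how the checksum above might be used, the following Python snippet computes the MD5 digest of a downloaded file and compares it with the published value. The function names and the chunk size are illustrative assumptions and are not part of the dataset; the name of the tarball is not given here, so any path passed in is a placeholder.

```python
import hashlib

# Checksum published in this record for the gzipped tarball.
EXPECTED_MD5 = "a76888ffec8cc7aebba09d365ca55ace"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading avoids loading a multi-gigabyte archive into memory.
    The 1 MiB chunk size is an arbitrary choice.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download` with the local path of the downloaded tarball returns `True` when the download is intact. A mismatch usually indicates an incomplete or corrupted download.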

Files

room2glo-emnlp2019.pdf

Files (2.7 GB)

Size      Checksum
727.8 kB  md5:66b8b9d6a8aa726109e4b3a0e88a5d51
2.7 GB    md5:a76888ffec8cc7aebba09d365ca55ace (gzipped tarball)

Additional details

Related works

Is compiled by
Conference paper: https://www.aclweb.org/anthology/D19-1007/ (URL)

Funding

The Alan Turing Institute EP/N510129/1
UK Research and Innovation