Published November 4, 2019 | Version v1
Dataset
Open
Monthly word embeddings for Twitter random sample (English, 2012-2018)
Creators
- 1. Alan Turing Institute
Description
This dataset contains monthly word embeddings created from the tweets available via the statuses/sample endpoint of the Twitter Streaming API from 2012 to 2018. Full details of the creation of the dataset are given in Room to Glo: A Systematic Comparison of Semantic Change Detection Approaches with Word Embeddings.
The md5sum of the gzipped tarball file is a76888ffec8cc7aebba09d365ca55ace.
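The checksum above can be used to confirm that the tarball downloaded intact. The following is a minimal sketch, not part of the dataset's own tooling: the function names `md5_of_file` and `verify` and the local file name in the usage comment are hypothetical, since this record does not state the tarball's file name.

```python
import hashlib

# MD5 digest of the gzipped tarball, as stated in the description above.
EXPECTED_MD5 = "a76888ffec8cc7aebba09d365ca55ace"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a multi-gigabyte file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Usage (hypothetical file name; substitute the path of the downloaded tarball):
#     verify("embeddings.tar.gz")
```

Reading in fixed-size chunks keeps memory use flat for the 2.7 GB file; a `False` result from `verify` suggests an incomplete or corrupted download.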
Files
room2glo-emnlp2019.pdf
Total size: 2.7 GB
Checksum | Size |
---|---|
md5:66b8b9d6a8aa726109e4b3a0e88a5d51 | 727.8 kB |
md5:a76888ffec8cc7aebba09d365ca55ace | 2.7 GB |
Additional details
Related works
- Is compiled by: Conference paper, https://www.aclweb.org/anthology/D19-1007/
Funding
- The Alan Turing Institute EP/N510129/1
- UK Research and Innovation