GloVe 6B Vectors

Liebl Bernhard

doi:10.5281/zenodo.4925376

Published June 10, 2021 | Version 1.2

Dataset Open

Liebl Bernhard¹

1. Leipzig University

GloVe 6B word embeddings from https://nlp.stanford.edu/projects/glove/ (Wikipedia 2014 + Gigaword 5: 6B tokens, 400K vocab, uncased; 50d, 100d, 200d, and 300d vectors), split into one file per vector dimension, converted to gensim's binary word2vec format, and zip-compressed (LZMA).

Splitting this data into single files allows for faster downloads and inclusion in memory-restricted environments such as Binder.

To uncompress an archive, use Python's zipfile.ZipFile. To load the extracted vectors, use gensim.models.KeyedVectors.load_word2vec_format(path, binary=True).
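The two steps named above (unzip, then load) can be combined as in the following minimal sketch. The helper names extract_archive and load_glove are hypothetical, and the sketch assumes each archive contains exactly one word2vec-format file; the member name is read from the archive rather than guessed. It also assumes gensim is installed and that the local Python build includes LZMA support.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(zip_path, out_dir):
    """Extract every member of the zip archive and return the extracted file paths."""
    out_dir = Path(out_dir)
    with zipfile.ZipFile(zip_path) as zf:
        # Skip directory entries; keep only regular files.
        names = [n for n in zf.namelist() if not n.endswith("/")]
        zf.extractall(out_dir)
    return [out_dir / n for n in names]


def load_glove(zip_path):
    """Unzip into a temporary directory and load the vectors with gensim."""
    # Imported lazily so extract_archive also works without gensim installed.
    from gensim.models import KeyedVectors

    with tempfile.TemporaryDirectory() as tmp:
        paths = extract_archive(zip_path, tmp)
        # Assumption: the archive holds exactly one word2vec-format file.
        if len(paths) != 1:
            raise ValueError(f"expected one file in archive, found {len(paths)}")
        # The vectors are read fully into memory, so the temporary copy
        # can be deleted once this call returns.
        return KeyedVectors.load_word2vec_format(str(paths[0]), binary=True)
```

A call such as vectors = load_glove("glove.6B.50d.zip") would then return a KeyedVectors object, on which lookups like vectors["king"] or vectors.most_similar("king") are available.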

This data is made available under the Public Domain Dedication and License v1.0 whose full text can be found at: http://www.opendatacommons.org/licenses/pddl/1.0/.

Files

Files (740.6 MB)

Name	MD5	Size
glove.6B.100d.zip	f19871e3053750198004fb2acc1f8d44	115.1 MB
glove.6B.200d.zip	1df2c1a318572f7b9505e1a962b7a052	227.2 MB
glove.6B.300d.zip	dc1af9ca593acdbf870759f0cf9ced99	339.4 MB
glove.6B.50d.zip	a6c8d6e1e52401e913e5f6fa137b1d53	58.9 MB
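The MD5 checksums listed in the file table can be used to check a download for corruption. The following is a minimal sketch, not part of the dataset itself; the names EXPECTED_MD5, md5_of_file, and verify_download are hypothetical, and the checksum values are copied from the table.

```python
import hashlib
import os

# MD5 checksums as listed in the file table of this record.
EXPECTED_MD5 = {
    "glove.6B.50d.zip": "a6c8d6e1e52401e913e5f6fa137b1d53",
    "glove.6B.100d.zip": "f19871e3053750198004fb2acc1f8d44",
    "glove.6B.200d.zip": "1df2c1a318572f7b9505e1a962b7a052",
    "glove.6B.300d.zip": "dc1af9ca593acdbf870759f0cf9ced99",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's MD5 matches the listed checksum for its file name."""
    name = os.path.basename(path)
    if name not in EXPECTED_MD5:
        raise KeyError(f"no checksum listed for {name}")
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, verify_download("glove.6B.50d.zip") returns True only when the local file matches the listed checksum. MD5 is suitable here for detecting transfer errors, not as a defence against deliberate tampering.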

	All versions	This version
Views	3,710	3,703
Downloads	1,649	1,647
Data volume	360.4 GB	359.9 GB
