Neural Language Models for Nineteenth-Century English (dataset; language model zoo)

Hosseini, Kasra; Beelen, Kaspar; Colavizza, Giovanni; Coll Ardanuy, Mariona

doi:10.5281/zenodo.4779091

There is a newer version of the record available.

Published May 21, 2021 | Version 1.0.0

Dataset Open

1. The Alan Turing Institute, London, UK
2. University of Amsterdam, Institute for Logic, Language and Computation, Netherlands

This dataset contains four types of neural language models trained on a large historical dataset of English-language books published between 1760 and 1900, comprising ~5.1 billion tokens. The architectures include static models (word2vec and fastText) and contextualized models (BERT and Flair).

GitHub repository: https://github.com/Living-with-machines/histLM

Files (13.2 GB)

Name	MD5	Size
bert.zip	fea637f1dd685fef5301490ee9cffbb0	2.0 GB
fasttext.zip	f60c2b92ea99e6e2245bbbaca82b427f	8.5 GB
flair.zip	0f29ad54b98a841fe57e7e5b003b180c	71.0 MB
README.md	f436627a9bba8f53174d0975c36fc72a	3.1 kB
word2vec.zip	47f7ff9d77bf61ff2a20d7c641ca38af	2.6 GB
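
Because the archives are large, it can be worth checking each download against the MD5 digests listed above before unpacking it. The following is a minimal sketch using only the Python standard library. It assumes the files were saved under their original names in a single directory; the helper names `md5_of_file` and `verify_downloads` are illustrative and not part of the record or the histLM repository.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "bert.zip": "fea637f1dd685fef5301490ee9cffbb0",
    "fasttext.zip": "f60c2b92ea99e6e2245bbbaca82b427f",
    "flair.zip": "0f29ad54b98a841fe57e7e5b003b180c",
    "README.md": "f436627a9bba8f53174d0975c36fc72a",
    "word2vec.zip": "47f7ff9d77bf61ff2a20d7c641ca38af",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = None
        else:
            results[name] = md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    # Assumption: the downloads sit in the current working directory.
    for name, ok in verify_downloads().items():
        status = "missing" if ok is None else ("OK" if ok else "MISMATCH")
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use flat even for the 8.5 GB fastText archive. A mismatch usually indicates an interrupted or corrupted download, in which case the file should be fetched again.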

Additional details

Funding
UK Research and Innovation: Living with Machines (AH/S01179X/1)
UK Research and Innovation: The Alan Turing Institute (EP/N510129/1)

Usage statistics

	All versions	This version
Views	2,265	309
Downloads	943	112
Data volume	2.6 TB	262.1 GB

Resource type: Dataset
Publisher: Zenodo
Languages: English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: May 21, 2021
Modified: May 23, 2021
