Dataset Open Access

Wikitext-103 and OpenWebText Models

Davis, Forrest

This repository contains 25 LSTM models trained on Wikitext-103 and 25 LSTM models trained on a 100-million-token subset of the OpenWebTextCorpus. The training, validation, and test data are included with the OpenWebText models. By-epoch validation perplexity is given in the logs (within the directory for the models). Please write to me if you have any questions :)

Files (4.4 GB)
Name                                  Size      Checksum
openwebtextcorpus-25-models.tar.gz    2.3 GB    md5:21616173d195a6c1f19fd447fad41c65
wikitext103-25-models.tar.gz          2.1 GB    md5:189990cac92603769d9d2c4531e5aa9b
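
Since each archive is listed with an MD5 checksum, it may be worth verifying a download before unpacking it. The following is a minimal sketch, assuming the archives have been saved to the current working directory under the file names listed above. The helper names `md5_of_file` and `verify_archive` are hypothetical and not part of this record.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing on this record.
EXPECTED_MD5 = {
    "openwebtextcorpus-25-models.tar.gz": "21616173d195a6c1f19fd447fad41c65",
    "wikitext103-25-models.tar.gz": "189990cac92603769d9d2c4531e5aa9b",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the digest listed for its name."""
    path = Path(path)
    if path.name not in expected:
        raise KeyError(f"no checksum listed for {path.name}")
    return md5_of_file(path) == expected[path.name]


if __name__ == "__main__":
    # Assumption: the archives sit in the current working directory.
    for name in EXPECTED_MD5:
        if not Path(name).exists():
            print(f"{name}: not found")
        elif verify_archive(name):
            print(f"{name}: checksum OK")
        else:
            print(f"{name}: checksum MISMATCH")
```

If an archive verifies, it can then be unpacked with any standard tool, for example `tar -xzf wikitext103-25-models.tar.gz`. The directory layout inside the archives is not described on this page, so inspect the contents after extraction.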
