Published December 19, 2018 | Version SPGC-2018-07-18
Dataset
Open
Standardized Project Gutenberg Corpus
Creators
- 1. Department of Chemical and Biological Engineering, Northwestern University
- 2. Center for Complexity and Biosystems, Department of Physics, University of Milan
Description
Standardized Project Gutenberg Corpus
version: SPGC-2018-07-18
number of books: 55905
uncompressed size: 3 GB (counts) + 18 GB (tokens)
Publication
https://arxiv.org/abs/1812.08092
Project Site
https://pgcorpus.github.io/