Published March 11, 2021 | Version v3.0.0
Dataset Open

ChroniclItaly 3.0. A deep-learning, contextually enriched digital heritage collection of Italian immigrant newspapers published in the USA, 1898-1936.

  • 1. Centre for Contemporary and Digital History (C2DH) - University of Luxembourg


This open access collection includes the digitized front pages of 10 Italian-language newspapers published in California, Connecticut, Pennsylvania, Vermont, and West Virginia. It totals 8,653 issues and contains 21,454,455 words. The titles are: L’Italia, Cronaca sovversiva, La libera parola, The patriot, La ragione, La rassegna, La sentinella del West Virginia, L’Indipendente, La Sentinella, and La Tribuna del Connecticut. The material was collected from Chronicling America, an Internet-based, searchable database of newspapers published in the United States from 1789 to 1963, made available by the Library of Congress. The corpus features mainstream (prominenti), anarchic (sovversivi), and independent newspapers, thus providing a nuanced picture of the Italian immigrant community in the United States at the turn of the twentieth century. To promote transparency, the collection includes two versions of ChroniclItaly 3.0: unprocessed (as it was collected from Chronicling America) and processed (with pre-processing interventions). The collection also includes the datasets with the outputs of all the enrichment and post-intervention steps: named entity recognition (NER), geo-coding, sentiment analysis, and network analysis. A readme.txt file helps users navigate the folders, and a metadata file contains relevant information. The code used to perform all the interventions is available on GitHub. Finally, all the enrichment outputs can be explored in the interactive app DeXTER.


Files (592.3 MB)

Size
96.8 MB
84.9 MB
749 Bytes
20.8 kB
28.1 kB
228.7 MB
98.9 MB
57.8 MB
19.0 MB
6.0 MB

Additional details