Published October 15, 2025 | Version 1.0
Dataset Open

LibMovIt: text corpus of travel literature

  • 1. Istituto per il Lessico Intellettuale Europeo e la Storia delle Idee
  • 2. National Research Council

Contributors

  • 1. National Research Council
  • 2. Istituto per il Lessico Intellettuale Europeo e la Storia delle Idee

Description

LibMovIt: text corpus of travel literature is a textual resource created within the LibMovIt project (funded by the European Union – NextGenerationEU under the National Recovery and Resilience Plan (PNRR), Mission 4 "Education and Research", Component 2 "From Research to Business", Investment 1.1, PRIN 2022 call issued with Directorial Decree No. 104 of 2 February 2022; project title: LIBMOVIT – Libraries on the move: scholars, books, ideas traveling in Italy in the 18th century; proposal code 2022CP88KY).

Version 1.0 of the corpus contains 52 works totalling about 7.9 million words: 27 in English (3,590,000 words), 12 in French (2,065,000), 8 in German (1,550,000), 4 in Italian (450,000) and 1 in Spanish (255,000). A detailed description of the corpus and the status of each text (tagged "Revision completed" or "Revision to be completed") is available in a Zotero library at the following link: https://www.zotero.org/groups/5540957/libmovit/library

The texts are published in .txt format. They come either from automatic text recognition or from other projects (the source is indicated by tags in the corpus description). All newly recognised texts have been processed with scripts that clean up the most common errors and delete paratextual elements (page numbers, catchwords, signature marks, etc.). The editors of the corpus made manual corrections to all the texts; however, because of their length, some still need further revision.
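As an illustration of the kind of clean-up described above, the following Python sketch removes lines that look like page numbers or signature marks and drops a catchword when it is repeated at the top of the next page. It is not the project's actual script, which this record does not document: the function name `strip_paratext`, the two regular expressions and the page-as-list-of-lines input format are all assumptions made for the example.

```python
import re

# Hypothetical patterns, chosen for illustration only.
# A line holding just a number, optionally in brackets: "12", "[12]", "(12)".
PAGE_NUMBER = re.compile(r"^\s*[\[(]?\d{1,4}[\])]?\s*$")
# A line holding just a signature mark such as "B2" or "Aa 3".
SIGNATURE_MARK = re.compile(r"^\s*[A-Z][A-Za-z]?\s?\d{1,2}\s*$")


def strip_paratext(pages):
    """Return a cleaned copy of `pages` (a list of pages, each a list of lines).

    Removes blank lines, page numbers and signature marks, then drops a
    trailing one-word line when the next page starts with that word
    (a likely catchword).
    """
    cleaned = []
    for i, page in enumerate(pages):
        lines = [
            ln for ln in page
            if ln.strip()
            and not PAGE_NUMBER.match(ln)
            and not SIGNATURE_MARK.match(ln)
        ]
        if lines and i + 1 < len(pages):
            last = lines[-1].strip()
            stem = last.rstrip("-")  # catchwords are sometimes truncated
            next_lines = [
                ln for ln in pages[i + 1]
                if ln.strip() and not PAGE_NUMBER.match(ln)
            ]
            if stem and " " not in last and next_lines:
                first_word = next_lines[0].split()[0]
                if first_word.startswith(stem):
                    lines.pop()  # drop the catchword line
        cleaned.append(lines)
    return cleaned


if __name__ == "__main__":
    sample = [
        ["12", "We left Rome at dawn and took the road", "towards", "B2"],
        ["13", "towards Naples, where we arrived late."],
    ]
    for page in strip_paratext(sample):
        print(page)
```

Rules of this kind inevitably produce false positives (a short line that happens to look like a signature mark, for instance), which is consistent with the need for the manual revision mentioned above.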

For this reason, minor updates of the corpus (1.1, 1.2, 1.3 etc.) will be released regularly to improve the texts with further corrections, text mark-up and conversion to other formats; a major update of the corpus will be released once a year and will also include new texts.

Additional information about the development of the corpus is given in the papers listed in the References section.

Files

LibMovItCorpus_1.0.zip

Size: 47.5 MB
md5:74ea4ff7b5def77da72bc0d6da278df3
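
The MD5 checksum listed for the archive can be used to confirm that a download is complete and uncorrupted. The sketch below shows one way to do this with Python's standard `hashlib` module. The helper names `md5_of_file` and `archive_is_intact` are made up for this example, and the local path is assumed to be wherever the file was saved.

```python
import hashlib

# Checksum as published on this record for LibMovItCorpus_1.0.zip.
EXPECTED_MD5 = "74ea4ff7b5def77da72bc0d6da278df3"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `archive_is_intact("LibMovItCorpus_1.0.zip")` should return `True` for an unmodified copy of the version 1.0 archive saved in the current directory. A `False` result suggests downloading the file again.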

Additional details

Funding

Ministero dell'università e della ricerca
LIBMOVIT – Libraries on the move: scholars, books, ideas traveling in Italy in the 18th century 2022CP88KY

Dates

Issued
2025-10-16

Software

Development Status
Active

References

  • L. Mancini, La letteratura di viaggio tra corpora e analisi computazionali: primi risultati e prospettive future, in "Biblioteche in movimento: studiosi, idee, libri in viaggio nel XVIII secolo", Milano, Ledizioni, 2025, p. 297-309
  • L. Mancini – S. Congregati, Da lontano e da vicino: un duplice approccio allo studio della letteratura di viaggio nel progetto LibMovIt, in "In viaggio nella città del libro. La storia delle biblioteche veneziane e il progetto PRIN2022 LIBMOVIT", Milano, Ledizioni, 2025, p. 81-98