Published June 2, 2022 | Version 1.0
Dataset Open

PLANTAS Dataset

  • 1. Pattern Recognition and Human Language Research Center

Description

The "PLANTAS" dataset is based on "Historia de las plantas", a work written with a quill pen by Bernardo de Cienfuegos, one of the most outstanding Spanish botanists of the 17th century. The book was written mainly in Spanish, but a significant number of words and full sentences are in Latin and many other languages. The originals of PLANTAS are currently available at the "Biblioteca Nacional de España", and a digital reproduction can be found at the "Biblioteca Digital Hispánica" (http://bdh-rd.bne.es/viewer.vm?id=0000140162). This dataset covers only the first volume of PLANTAS (Mss 3357, with 1,035 pages and around 20,000 handwritten text lines).

Files

Files (295.3 MB)

Size: 295.3 MB
md5:06914281caa0aa26ba7b8588d8161ac3
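
The md5 value listed for the archive can be used to check that a download completed without corruption. The following is a minimal sketch in Python using only the standard library. The function names `md5_of_file` and `matches_record` are illustrative and not part of the dataset, and the local path of the downloaded file is left to the caller because the record does not state a file name here.

```python
import hashlib

# Checksum as listed in this record's Files section.
EXPECTED_MD5 = "06914281caa0aa26ba7b8588d8161ac3"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so a ~300 MB archive is never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record` with the path to the downloaded archive returns `True` when the file is intact. MD5 is suitable here for detecting accidental corruption, not for guarding against deliberate tampering.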

Additional details

Funding

European Commission
READ - Recognition and Enrichment of Archival Documents 674943

References

  • Alejandro H. Toselli, Luis A. Leiva, Isabel Bordes-Cabrera, Celio Hernández-Tornero, Vicent Bosch, Enrique Vidal. Digital Scholarship in the Humanities, Volume 33, Issue 1, April 2018, Pages 173–202. https://doi.org/10.1093/llc/fqw064