Published September 23, 2021 | Version 1.0.0
Dataset Restricted

Euronews XML corpus

Description

The Euronews XML corpus comprises transcriptions and XML encodings of handwritten newsletters dating from 1550 to 1730 and preserved today in the Florence State Archives. These manuscript newsletters, called avvisi in Italian, are a Renaissance invention: usually anonymous sheets, reproduced in multiple copies, which eventually became the basis of the first printed journalism.

The Euronews project team developed a methodology for encoding this type of early modern information source and for creating a corpus usable for data analytics. The transcription and XML encoding guidelines are explained in detail on this page: https://github.com/lallori/euronews-xml-corpus/wiki/transcription-xml-encoding-guidelines

The main language of the documents transcribed and encoded in the corpus is Italian (XVI-XVII century).

The Euronews Project is funded by the Irish Research Council through IRCLA/2019/41 and is hosted by University College Cork in collaboration with the Medici Archive Project.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.