Published May 14, 2017 | Version 1.1
Dataset Open

dhlab-epfl/LinkedBooksReferenceParsing: 1.1

  • 1. Swiss Federal Institute of Technology, Lausanne (EPFL)
  • 2. Swiss

Description

A dataset of annotated references (from both reference lists and footnotes) in journal issues and monographs on the history of Venice, created in the context of the LinkedBooks project (http://dhlab.epfl.ch/page-127959-en.html). The dataset contains annotated reference lists from monographs and annotated references from the footnotes of issues of the following journals (mostly, but not exclusively, in Italian): Ateneo Veneto, Archivio Veneto, Studi Veneziani. The material was digitized, OCRed (using ABBYY FineReader) and annotated (using Brat ADD) from 2014 to 2016. Along with the dataset of annotations, a framework for training your own parsers, based on Conditional Random Fields, is provided. Feel free to use it to build your own parser, and if you improve on our results, please let us know!

Files

dhlab-epfl/LinkedBooksReferenceParsing-1.1.zip

Files (18.9 MB)

Size: 18.9 MB
md5:169911236679f38441c6c96371ebae70
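
The md5 checksum listed for the archive can be used to check that a download completed without corruption. The following is a minimal sketch, not part of the dataset or its framework; the helper names `md5_of_file` and `archive_is_intact` are hypothetical and introduced only for illustration, and the local file name in the usage note is an assumption about where the archive was saved.

```python
import hashlib

# Checksum as listed for the 1.1 archive on this record.
EXPECTED_MD5 = "169911236679f38441c6c96371ebae70"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's md5 digest with the expected one, ignoring case."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved locally as `LinkedBooksReferenceParsing-1.1.zip`, calling `archive_is_intact("LinkedBooksReferenceParsing-1.1.zip")` should return `True` for an uncorrupted download. md5 is suitable here only as an integrity check against accidental corruption, not as protection against tampering.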

Additional details