Presentation Open Access
Slides of a talk given at the Open Citations workshop (3-5 October 2018, Bologna, Italy). Programme of the workshop (link).
Freeing citation data is an issue of essential importance and urgency for scholarly communication, and it has recently been getting the attention it deserves thanks to the Initiative for Open Citations. At the same time, we should be concerned with a complementary issue: the need to "recover" citation data from digitized publications. This applies especially to the Humanities, where fields often have century-long traditions. Otherwise, we risk creating citation indexes that take into account only recent publications, mostly in English, while a gap remains for citations buried in older publications. The creation of a comprehensive citation index for the Arts and Humanities, however, is a titanic endeavour that can only be accomplished with a collaborative, distributed approach in which cultural heritage institutions (e.g. libraries and archives) play a key role.
In this talk we present The Venice Scholar, a citation index of literature on the history of Venice. It indexes nearly 3000 volumes of scholarship from the mid-19th century to 2013, from which some 4 million bibliographic references have been extracted. The Venice Scholar, to be publicly launched in September 2018, is the first running instance of the Scholar Index, a platform aimed at creating a comprehensive citation index for the Arts and Humanities. This platform consists of two applications, the Scholar Library (SL) and the Scholar Index (SI), both to be released soon under an open source license.
The SL is a digital library system where partner institutions can load their digitized scholarly literature. The system embeds the machine learning components needed to recognize text from page images (OCR), extract references and link them to unique identifiers pointing to external resources (e.g. library catalogues). Each partner institution keeps its own instance of the digital library system and its own collection.
The SI is the global citation index. It federates all citations extracted by the different institutions into a single index and provides a rich search interface to navigate the resulting network of citations, with the final aim of interlinking digital archives and digital libraries. The SI is currently being extended, thanks to a Europeana Research Grant, to provide its users with contextual recommendations of related digital objects from Europeana. The citation data underlying the Venice Scholar are modelled using the OpenCitations Data Model and will use the OpenCitations Corpus as their publication platform, thus enriching this corpus with some 4 million references "recovered" from historical and current publications about the history of Venice.
To conclude, we believe that a citation index for the Arts and Humanities can only be created through a collaborative and federated approach, and by leveraging infrastructure synergies such as the one with the OpenCitations Corpus. In this process, libraries and other institutions should take responsibility for specific areas of knowledge (e.g. a journal, a publisher, or a topic) and, at the same time, be supported (e.g. through software) in the task of enriching their digitized collections with citation data.
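The federated design described above, in which each partner institution extracts citations locally and the SI merges them into one index, can be illustrated with a minimal Python sketch. This is not the actual Scholar Index implementation, which the abstract does not describe. The `Citation` record, the `federate` and `cited_by` functions, and the `pub:` identifiers are all hypothetical, and the sketch assumes that every extracted reference has already been linked to a unique identifier.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Citation:
    """One extracted citation (hypothetical record, not the OpenCitations Data Model)."""
    citing_id: str  # identifier of the citing publication
    cited_id: str   # identifier the extracted reference was linked to
    source: str     # partner institution that contributed the citation


def federate(collections: Iterable[Iterable[Citation]]) -> Dict[Tuple[str, str], Set[str]]:
    """Merge per-institution citations into one index keyed by (citing, cited).

    The same citation contributed by several institutions is stored once,
    with the set of institutions that reported it.
    """
    index: Dict[Tuple[str, str], Set[str]] = {}
    for collection in collections:
        for citation in collection:
            key = (citation.citing_id, citation.cited_id)
            index.setdefault(key, set()).add(citation.source)
    return index


def cited_by(index: Dict[Tuple[str, str], Set[str]], cited_id: str) -> List[str]:
    """Return the sorted identifiers of all publications citing `cited_id`."""
    return sorted({citing for (citing, cited) in index if cited == cited_id})


# Hypothetical data from two partner institutions; one citation overlaps.
library_a = [
    Citation("pub:A1", "pub:X", "library_a"),
    Citation("pub:A2", "pub:X", "library_a"),
]
library_b = [
    Citation("pub:A1", "pub:X", "library_b"),
    Citation("pub:B1", "pub:Y", "library_b"),
]

index = federate([library_a, library_b])
```

Keying the index on the pair of identifiers is one simple way to obtain a single global index from overlapping collections. A real system would also need to reconcile references that different institutions linked to different identifiers for the same work.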