Published June 30, 2016 | Version v1
Poster Open

13. Opening digital archives for research: The Cultural Heritage Cluster of the State and University Library, Denmark

Description

The Cultural Heritage Cluster is High Performance Computing (HPC) applied to the digital collections of the State and University Library: Danish newspapers (11 million pages from the 1700s onwards), the Danish Netarchive (content of the .dk domains from 2005 onwards), TV and radio collections (2.5 million hours of broadcasts from the 1980s onwards) and more.

Up until now these digital collections have been open almost exclusively to qualitative analysis (‘close reading’), i.e. analysis of content at page level. HPC enables quantitative analysis methods (‘distant reading’) that work across large numbers of web pages and focus on numerical patterns of content. The Cultural Heritage Cluster can thus be regarded as an expansion of the possibilities for researchers, especially from the humanities but also from the social sciences, to benefit from the growing amount of digital data available. Currently the possibilities are being explored by several pilot projects, among them Probing a Nation’s Web Domain.

First project: Probing a Nation’s Web Domain

This project will analyze the historical development of the Danish web: what the .dk domains looked like in the past and how they have developed. The research infrastructure is being developed concurrently with the project, i.e. the tools and procedures necessary to handle corpus creation, long-term storage, documentation, workspace, and collaborative working tools at the State and University Library.

Bringing Researchers Closer to Data

The Cultural Heritage Cluster is a new service and a new research data infrastructure which will enable researchers to analyze both data drawn ‘on demand’ from the collections of the State and University Library and data generated outside the cluster, such as researchers’ own collected data brought along for analysis. Consequently, the service will include assistance with the use of the library’s own collections, covering protection of personal data, sharing of data, copyright and other data management issues, as well as with the choice of relevant data processing software. The cluster is an attempt to transform the collections from more or less closed archives into collections that are open and useful for researchers, hence the combination of access to the archives, software for data analysis and storage facilities. The development of the cluster is promoted by close and active cooperation with the researchers.

The Cultural Heritage Cluster is established in cooperation with DeIC (Danish e-Infrastructure Cooperation, http://www.deic.dk/node/110?language=en) as the DeIC National Cultural Heritage Cluster (http://en.statsbiblioteket.dk/kulturarvscluster/deic-nationale-kulturarvscluster).

In addition to the above, the poster will also present some of the possibilities, problems and challenges that have emerged since autumn 2015, when the infrastructure was established. A brief description of the conditions for access to and use of the cluster, and of its technical components and capacities, will also be included.

Files

13-Opening-Digital-Archives-for-Research_PRINT.pdf
