Report Open Access

Heritage Connector: A Towards a National Collection Foundation Project Final Report

Winters, Jane; Stack, John; Dutia, Kalyan; Unwin, Jamie; Lewis, Rhiannon; Palmer, Richard; Wolff, Angela

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="" xmlns:oai_dc="" xmlns:xsi="" xsi:schemaLocation="">
  <dc:creator>Winters, Jane</dc:creator>
  <dc:creator>Stack, John</dc:creator>
  <dc:creator>Dutia, Kalyan</dc:creator>
  <dc:creator>Unwin, Jamie</dc:creator>
  <dc:creator>Lewis, Rhiannon</dc:creator>
  <dc:creator>Palmer, Richard</dc:creator>
  <dc:creator>Wolff, Angela</dc:creator>
  <dc:description>Heritage Connector was a Towards a National Collection Foundation Project. The Heritage Connector (HC) project aimed to contribute substantially to realising the ambitions of the AHRC’s Towards a National Collection (TaNC) programme to make collections accessible for research and public engagement.

Bringing multiple cultural heritage collections together is fundamentally about building links. Online, these can take the form of hypertext links that create a rich web of deep and broad user journeys between related content and information. They also lend themselves to computational analysis and visualisation, enabling new forms of digital humanities research into collections.

The project explored three technologies that together have the potential to provide a step-change in access, discoverability, research and public engagement. They do so by augmenting traditional catalogue data and the associated keyword search with a vast number of interlinked resources and content. The three technologies Heritage Connector explored were:

	artificial intelligence (AI) – specifically, natural language processing (NLP), named entity recognition (NER) and entity linking (EL) – to build links at scale from thin collection records;
	linked open data (LOD) as a scalable and flexible structuring methodology;
	knowledge graphs to store links and make them accessible.

The project sought to demonstrate that a rich web of links could be generated and made available using these technologies on the following source datasets:

	Science Museum Group (SMG) Collection catalogue,
	Victoria and Albert Museum (V&amp;A) Collection catalogue,
	Science Museum Group Journal,
	Science Museum blog.

The final web of links (structured in the knowledge graph) has 1,208,256 entities and 53 relations. The techniques used to generate the links were tuned to provide high-quality links. Although the accuracy of these links in some cases falls short of that of manually generated links, a greater wealth of associated material is surfaced, which has practical benefits.

The software developed by the project can be found in the list of GitHub code repositories in the Annexes to this report. The project’s output datasets, including the knowledge graph and embeddings, are available on Zenodo.</dc:description>
  <dc:subject>Artificial intelligence</dc:subject>
  <dc:subject>Natural language processing</dc:subject>
  <dc:subject>Named entity recognition</dc:subject>
  <dc:subject>Entity linking</dc:subject>
  <dc:subject>Linked open data</dc:subject>
  <dc:subject>Knowledge graphs</dc:subject>
  <dc:title>Heritage Connector: A Towards a National Collection Foundation Project Final Report</dc:title>
</oai_dc:dc>


Cite as