Published November 24, 2022 | Version 1.0
Poster Open

Newspapers & photos connected. Exploring the use of computer vision to gain better access to large heritage collections

Authors/Creators

  • 1. KB, national library of the Netherlands

Description

Poster presented at the Clariah-NDE Conference, 24 November 2022, Utrecht. Related to Newspaper and Photos (2021-2022), a project by Picturae B.V., Sioux Technologies, Groninger Archieven, Noord-Hollands Archief and KB, national library of the Netherlands. For more info about the project and the results, please check https://www.krant-en-fotos.nl/ (website in Dutch).

About the poster 

Question: How could we connect press photos to newspaper photos?

Method & techniques: developing an algorithm based on the convolutional neural network VGG16. Training set: pretrained on 1M+ ImageNet images. Test set: 0.5M press photos and 0.25M newspaper photos.
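
The poster does not spell out how image features are compared, so the following is only a minimal sketch of one common approach: each photo is assumed to be represented by a feature vector (for example, the output of a late VGG16 layer), and a newspaper photo is linked to its most similar press photo when the cosine similarity exceeds a threshold. The function names, the toy three-dimensional vectors, and the threshold of 0.9 are all hypothetical and are not taken from the project's software.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def link_photos(newspaper_features, press_features, threshold=0.9):
    """Link each newspaper photo to its most similar press photo.

    Both arguments map a photo id to a feature vector. In a real setup
    the vectors would come from a pretrained CNN such as VGG16; here
    they are passed in directly (assumption for illustration).
    Only links with similarity >= threshold are kept.
    """
    links = {}
    for news_id, news_vec in newspaper_features.items():
        best_id, best_score = None, -1.0
        for press_id, press_vec in press_features.items():
            score = cosine_similarity(news_vec, press_vec)
            if score > best_score:
                best_id, best_score = press_id, score
        if best_id is not None and best_score >= threshold:
            links[news_id] = (best_id, best_score)
    return links


# Toy feature vectors (hypothetical; real VGG16 features have thousands of dimensions).
press = {"press_1": [1.0, 0.0, 0.0], "press_2": [0.0, 1.0, 0.0]}
news = {"news_a": [0.9, 0.1, 0.0], "news_b": [0.5, 0.5, 0.7]}

links = link_photos(news, press, threshold=0.9)
print(links)
```

In this toy example only `news_a` is linked (to `press_1`); `news_b` stays unlinked because no press photo is similar enough. A strict threshold of this kind trades coverage for reliability, which fits a result where a small share of photos is connected with high accuracy. At the scale reported on the poster, the exhaustive pairwise loop would likely need to be replaced by an approximate nearest-neighbour index.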

Results

  • Software for image recognition (GitHub)
  • Dataset with newspaper photos linked to press photos: 4% of photos connected with an accuracy of 99%, giving 30,000 connections
  • Demo www.krant-en-fotos.nl with interactive connections between photos
  • Lessons learned in whitepaper for the cultural heritage sector 
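
The figures in the results can be checked with a little arithmetic. The poster does not state which total the 4% refers to; the sketch below assumes it is the combined test set of 0.5M press photos and 0.25M newspaper photos, since that is the reading under which 4% gives 30,000. It also assumes that "accuracy of 99%" means the share of links that are correct. Both readings are assumptions, not statements from the project.

```python
# Test set sizes as given on the poster.
press_photos = 500_000
newspaper_photos = 250_000

# Assumption: the 4% is taken over the combined test set.
total_photos = press_photos + newspaper_photos

linked_percent = 4     # share of photos connected
accuracy_percent = 99  # assumed to mean: share of links that are correct

# Integer arithmetic avoids floating-point rounding noise.
connections = total_photos * linked_percent // 100
expected_correct = connections * accuracy_percent // 100
expected_incorrect = connections - expected_correct

print(connections, expected_correct, expected_incorrect)
```

Under these assumptions, roughly 300 of the 30,000 links would be expected to be wrong.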

Future work 

  • For higher accuracy of links, further development of the algorithm is needed, among other things by using metadata
  • Scaling up: already 5 archives with press photo collections have shown interest in a follow-up
  • Digitisation of more newspapers is taking place right now
  • Implementing the functionality in existing newspaper and image repositories will make it more sustainable

Poster design: Dorien Haagsma.

Files

Krantenfotos_final_a1.pdf (7.1 MB)
md5:43fc74972a07e5ad8b0e37a658179d4b