Images from Newspaper Navigator predicted as maps, with human corrected labels

van Strien, Daniel

doi:10.5281/zenodo.4156510

Published October 30, 2020 | Version 0.1

Dataset Open

van Strien, Daniel¹

1. British Library

Contributors

Data curator:

van Strien, Daniel¹

1. British Library

This dataset contains images derived from Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/).

[The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project.

source: https://news-navigator.labs.loc.gov/

One of the categories of visual content predicted by the Newspaper Navigator model is 'maps'. The original training data for Newspaper Navigator contained relatively few labelled examples of maps. Predictions for the 'maps' category have an Average Precision of 69.5%, and the validation data contains 34 map images.

This dataset contains a sample of the images that were predicted as 'maps'. It also includes human-corrected labels indicating whether each predicted map image is a 'map' or 'not a map'.
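As a rough illustration of how the corrected labels could be used, the sketch below computes the share of predicted maps that a human confirmed as maps. It assumes the labels are available as plain strings with the values 'map' and 'not a map'. The function name `confirmed_map_share` is hypothetical, and the exact label strings stored in the dataset may differ from the ones named in the description.

```python
from collections import Counter


def confirmed_map_share(labels):
    """Return the fraction of labels equal to 'map' (case-insensitive).

    Assumes every image in the sample was predicted as a map, so this
    fraction is the share of predictions that a human confirmed.
    """
    counts = Counter(label.strip().lower() for label in labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return counts["map"] / total


# Illustrative values only, not taken from the dataset.
example_labels = ["map", "not a map", "map", "map"]
print(f"{confirmed_map_share(example_labels):.2f}")
```

The share computed this way describes only the labelled sample and is not the same measure as the Average Precision reported for the original model.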

The data is organised as follows:

'newspaper_maps.zip' contains the images themselves.
'2020_30_10_13_19_228_sample.json' contains metadata about each image drawn from the Newspaper Navigator dataset.
'map_labels.csv' contains the labels for the images as a CSV file.
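The three files listed above can be inspected with the Python standard library. The following is a minimal sketch that assumes all three files have been downloaded into one directory. Because the description does not document the CSV columns or the JSON structure, the code reports what it finds rather than assuming a schema. The function name `describe_dataset` is hypothetical.

```python
import csv
import json
import zipfile
from pathlib import Path


def describe_dataset(data_dir):
    """Summarise the three data files without assuming their internal schema."""
    data_dir = Path(data_dir)

    # Read column names from the CSV header instead of hard-coding them.
    with open(data_dir / "map_labels.csv", newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        columns = reader.fieldnames or []

    # The top-level JSON type is not documented, so only report it.
    with open(data_dir / "2020_30_10_13_19_228_sample.json", encoding="utf-8") as f:
        metadata = json.load(f)
    entries = len(metadata) if isinstance(metadata, (list, dict)) else 1

    # List archive members without extracting the archive to disk.
    with zipfile.ZipFile(data_dir / "newspaper_maps.zip") as archive:
        image_names = [n for n in archive.namelist() if not n.endswith("/")]

    return {
        "label_columns": columns,
        "label_rows": len(rows),
        "metadata_type": type(metadata).__name__,
        "metadata_entries": entries,
        "image_files": len(image_names),
    }
```

Calling `describe_dataset` on the download directory returns a small dictionary that can be checked before deciding how to join the labels, the metadata, and the images.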

Files

Files (315.5 MB)

Name	MD5	Size
2020_30_10_13_19_228_sample.json	7cee61cbb6536072f196c720eb42f342	1.2 MB
LICENSE.txt	073166a882909625d915271f737ae197	157 Bytes
map_labels.csv	899ce40cf7e19d76b19cd1454a5ecc2c	22.3 kB
newspaper_maps.zip	ea81ef0b964ff9295d19e7ba42b51540	314.3 MB
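The MD5 digests listed with the files can be used to check that a download completed intact. The sketch below assumes the four files sit in a single local directory. The helper names `md5_of` and `verify_downloads` are hypothetical, and the digests are copied from the file listing above.

```python
import hashlib
from pathlib import Path

# MD5 digests as given in the record's file listing.
EXPECTED_MD5 = {
    "2020_30_10_13_19_228_sample.json": "7cee61cbb6536072f196c720eb42f342",
    "LICENSE.txt": "073166a882909625d915271f737ae197",
    "map_labels.csv": "899ce40cf7e19d76b19cd1454a5ecc2c",
    "newspaper_maps.zip": "ea81ef0b964ff9295d19e7ba42b51540",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(data_dir, expected=EXPECTED_MD5):
    """Map each filename to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, digest in expected.items():
        path = Path(data_dir) / name
        results[name] = md5_of(path) == digest if path.is_file() else None
    return results
```

Reading in chunks keeps memory use low for the 314.3 MB image archive. MD5 is used here only because it is the digest the record publishes, and it serves as an integrity check rather than a security guarantee.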

Additional details

Is derived from: Preprint: https://arxiv.org/abs/2005.01583 (URL)

Funding

UK Research and Innovation
Living with Machines AH/S01179X/1

	All versions	This version
Views	621	621
Downloads	784	715
Data volume	195.0 GB	168.9 GB
