Dataset Open Access
This dataset contains images derived from Newspaper Navigator (news-navigator.labs.loc.gov/), a dataset of images drawn from the Library of Congress Chronicling America collection (chroniclingamerica.loc.gov/).
[The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project.
Newspaper Navigator sorts the extracted visual content into categories, one of which is 'photographs'. This dataset contains a sample of those photographs, with additional annotations indicating whether each photograph has one or more of the following labels: "human", "animal", "human-structure" and "landscape".
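Because a photograph can carry several of these labels at once, the annotations describe a multi-label task rather than a single-class one. The following is a minimal Python sketch of turning a label string into a multi-hot vector. The `|` separator and the `encode_labels` helper are assumptions made for illustration, since the exact layout of the label files is not described here.

```python
# The four labels named in the dataset description.
LABELS = ["human", "animal", "human-structure", "landscape"]


def encode_labels(label_string, separator="|"):
    """Turn a separator-joined label string into a multi-hot list.

    The separator is an assumption; check the actual CSV before relying on it.
    """
    present = {part.strip() for part in label_string.split(separator) if part.strip()}
    unknown = present - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    # One position per label, in the fixed order of LABELS.
    return [1 if label in present else 0 for label in LABELS]


print(encode_labels("human|landscape"))  # [1, 0, 0, 1]
print(encode_labels(""))                 # [0, 0, 0, 0]
```

An empty string maps to an all-zero vector, which covers photographs that have none of the four labels.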
The data is organised as follows:
This dataset was created for use in a Programming Historian tutorial that is currently under review (http://programminghistorian.github.io/ph-submissions/lessons/computer-vision-deep-learning-pt2). Its primary aim is to provide a realistic example dataset for teaching computer vision with digitised heritage material. The data is shared here because it may be useful to others. This documentation is a work in progress and will be updated when the Programming Historian tutorial is released publicly.
The metadata CSV file contains the following columns:
- filepath
- pub_date
- page_seq_num
- edition_seq_num
- batch
- lccn
- box
- score
- ocr
- place_of_publication
- geographic_coverage
- name
- publisher
- url
- page_url
- month
- year
- iiif_url
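As a quick sanity check after downloading, the header of the metadata CSV can be compared against the column list above. This is a minimal sketch using only the Python standard library. The `missing_columns` helper is hypothetical, and the in-memory sample is a stand-in header, not real data from the dataset.

```python
import csv
import io

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "filepath", "pub_date", "page_seq_num", "edition_seq_num", "batch",
    "lccn", "box", "score", "ocr", "place_of_publication",
    "geographic_coverage", "name", "publisher", "url", "page_url",
    "month", "year", "iiif_url",
]


def missing_columns(csv_file):
    """Return the expected column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in EXPECTED_COLUMNS if name not in header]


# Stand-in header for demonstration only.
sample = io.StringIO("filepath,pub_date,lccn\n")
print(missing_columns(sample))
```

With a real file, the same function can be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points at whichever downloaded CSV holds the metadata.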
Name | MD5 | Size |
---|---|---|
annotations.csv | d8db11e5a9943884cfc7e48e8c52a581 | 2.3 MB |
images.zip | 6db25b6cbfa4db7c3581bf6004625522 | 880.1 MB |
multi_label.csv | 185eb24258049968531e0ab610cb9550 | 168.5 kB |
photo_tasks.json | d79ed84baafb5c8f089ba37614be9335 | 2.6 MB |
results.csv | c2ef929b5f036db1be35028d3345db85 | 2.2 MB |
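Each file is listed with an MD5 checksum, which can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are illustrative, not part of the dataset.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 with an expected value, with or without an 'md5:' prefix."""
    expected = expected_md5.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected
```

For example, assuming `images.zip` has been saved to the current directory, `matches_checksum("images.zip", "md5:6db25b6cbfa4db7c3581bf6004625522")` should return `True` for an intact download.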