Published October 14, 2021 | Version 0.0.1
Dataset Open

19th Century United States Newspaper Advert images with 'illustrated' or 'non illustrated' labels

  • 1. British Library


Data collector:


The dataset contains images derived from the Newspaper Navigator, a dataset of images drawn from the Library of Congress Chronicling America collection.

[The Newspaper Navigator dataset] consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project.


One of the categories of visual content is 'advertisements'. This dataset contains a sample of the advertisement images, with additional labels indicating whether the advert is 'illustrated' or 'not illustrated'.

The data is organised as follows:

  • The images themselves can be found in ``
  • `newspaper-navigator-sample-metadata.csv` contains metadata about each image drawn from the Newspaper Navigator Dataset.
  • `ads.csv` contains the labels for the images.
  • `sample.csv` contains additional metadata about the images (based on the newspapers those images came from). 
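
As a rough illustration of how the label file might be inspected, the sketch below counts how often each label occurs. The column names `file` and `label` and the label values are assumptions made for this example, since the column layout of `ads.csv` is not documented here; the in-memory sample stands in for the real file.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="label"):
    """Count how often each label value occurs in an open CSV file.

    `label_column` is an assumed column name; adjust it to match ads.csv.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Invented stand-in for ads.csv: the rows, column names and label values
# are placeholders for illustration, not taken from the dataset.
sample = io.StringIO(
    "file,label\n"
    "a.jpg,illustrated\n"
    "b.jpg,not illustrated\n"
    "c.jpg,illustrated\n"
)
label_counts = count_labels(sample)
print(label_counts)
```

For the real file, the same function could be called on `open("ads.csv", newline="", encoding="utf-8")`, after checking the header row and adjusting `label_column` if needed.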

This dataset was created for use in an under-review Programming Historian tutorial. The primary aim of the data was to provide a realistic example dataset for teaching computer vision for working with digitised heritage material. The data is shared here since it may be useful for others. This data documentation is a work in progress and will be updated when the Programming Historian tutorial is released publicly.

The metadata CSV file contains the following columns:

- filepath
- pub_date
- page_seq_num
- edition_seq_num
- batch
- lccn
- box
- score
- ocr
- place_of_publication
- geographic_coverage
- name
- publisher
- url
- page_url
- month
- year
- iiif_url
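
The columns above could be used, for example, to count adverts per year while keeping only rows with a high `score`. This is a minimal sketch under two assumptions that are not stated in this documentation: that `score` parses as a decimal number and that `year` can be used as-is as a grouping key. The sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def adverts_per_year(csv_file, min_score=0.0):
    """Count metadata rows per `year`, keeping rows with `score` >= min_score.

    Assumes `score` is a decimal number stored as text (an assumption).
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        if float(row["score"]) >= min_score:
            counts[row["year"]] += 1
    return counts


# Invented stand-in for the metadata CSV, using three of the listed columns.
sample = io.StringIO(
    "filepath,score,year\n"
    "x/001.jpg,0.95,1860\n"
    "x/002.jpg,0.40,1860\n"
    "x/003.jpg,0.91,1870\n"
)
per_year = adverts_per_year(sample, min_score=0.9)
print(per_year)
```

Reading with `csv.DictReader` keeps every value as text, so any numeric column has to be converted explicitly, as is done for `score` here.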



Files (49.1 MB)

Size
47.4 kB
68.3 kB
48.1 MB
876.4 kB

Additional details

Related works

Is derived from
Journal article: (URL)
Software: 10.5281/zenodo.5537185 (DOI)


Living with Machines AH/S01179X/1
UK Research and Innovation