There is a newer version of the record available.

Published June 26, 2022 | Version v1.3.4
Software | Open

GLAM-Workbench/trove-newspapers

Description

Current version: v1.3.4

This repository contains Jupyter notebooks to work with data from Trove's newspapers zone. For more information see the Trove Newspapers section of the GLAM Workbench.

Notebook topics

Trove newspapers in context
  • Visualise the total number of newspaper articles in Trove by year and state – explore how Trove's newspaper articles are distributed over time, and by state
  • Analyse rates of OCR correction – explore patterns in OCR text correction; how many corrections are there and where have they been made?
  • Finding non-English newspapers in Trove – use automated language detection to identify non-English language newspapers in Trove
  • Beyond the copyright cliff of death – find newspapers with content published after 1954
  • Gathering historical data about the addition of newspaper titles to Trove – find when newspaper titles were added to Trove by extracting lists from web archives
Visualising searches
  • QueryPic – a simple app to visualise newspaper searches over time; this is the latest version, with many new features
  • QueryPic Deconstructed – an older version of QueryPic that lets you build queries using keywords, states, or newspapers
  • Visualise Trove newspaper searches over time – use facets to slice up newspaper search results and visualise over time
  • Map Trove newspaper results by state – create a choropleth map to visualise search results by state
  • Map Trove newspaper results by place of publication – links newspapers to their place of publication and maps the results
  • Map Trove newspaper results by place of publication over time – adds a time dimension to the example above
Harvesting data

See the Trove Newspaper and Gazette Harvester if you want to harvest all the articles from a search.

  • Harvest information about newspaper issues – get information about available issues for each newspaper from the Trove API
  • Harvest the issues of a newspaper as PDFs – harvest available issues of a newspaper as PDFs
  • Harvest Australian Women's Weekly covers (or the front pages of any newspaper) – harvest the front pages of any newspaper, including covers from the Australian Women's Weekly
Useful tools
  • Save a Trove newspaper article as an image – grabs the page on which an article was published, and then crops the page image to the boundaries of the article to create a complete, intact image of the article as it was originally published
  • Download a page image – a simple app that lets you download page images as complete, high-resolution JPG files
  • Generate an article thumbnail – generate a nice square thumbnail image for a newspaper article
  • Upload Trove newspaper articles to Omeka-S – steps through the process of uploading Trove newspaper articles to your own Omeka-S instance via the API
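The cropping idea behind the article image and thumbnail tools above can be sketched in a few lines. This is only an illustration, not the notebooks' actual code: the `Box` type, the function names, and the assumption that an article is described by one or more rectangular zones in page pixel coordinates are all hypothetical.

```python
from typing import Iterable, NamedTuple


class Box(NamedTuple):
    """A rectangle in page pixel coordinates (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int


def enclosing_box(zones: Iterable[Box]) -> Box:
    """Return the smallest box that contains every zone of an article."""
    zones = list(zones)
    if not zones:
        raise ValueError("at least one zone is required")
    return Box(
        left=min(z.left for z in zones),
        top=min(z.top for z in zones),
        right=max(z.right for z in zones),
        bottom=max(z.bottom for z in zones),
    )


def square_crop(box: Box) -> Box:
    """Return a square box aligned to the top of `box` and centred horizontally."""
    width = box.right - box.left
    height = box.bottom - box.top
    side = min(width, height)
    offset = (width - side) // 2
    return Box(box.left + offset, box.top, box.left + offset + side, box.top + side)


# Example: an article printed across two columns of a page.
zones = [Box(100, 200, 400, 600), Box(420, 150, 700, 500)]
article = enclosing_box(zones)
thumbnail = square_crop(article)
```

The resulting tuples use the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so they could be passed to it directly once the page image has been downloaded.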
Tips and tricks
  • Today's news yesterday – uses the date index and the firstpageseq parameter to find articles from exactly 100 years ago that were published on the front page
  • Create a Trove OCR corrections ticker – uses the has:corrections parameter to get the total number of newspaper articles with OCR corrections
  • Get a list of Trove newspapers that doesn't include government gazettes – workaround for a problem with the newspaper/titles endpoint of the API
  • Get the page coordinates of a digitised newspaper article from Trove – demonstrates how to find the coordinates of a newspaper article on a digitised page
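As a rough illustration of the first two tips, the search strings they describe might be assembled as follows. This is a sketch under stated assumptions, not the notebooks' code: the function names are hypothetical, and the exact range syntax (a one-day window ending on the target date) and the `zone`, `encoding`, and `n` request parameters reflect my understanding of version 2 of the Trove API and should be checked against the API documentation.

```python
from datetime import date, timedelta


def years_ago(day: date, years: int = 100) -> date:
    """Return the same calendar day `years` earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return day.replace(year=day.year - years, day=28)


def front_page_query(day: date) -> str:
    """Build a query for front-page articles published on `day`.

    Assumption: the date index takes a range that starts on the previous day.
    """
    start = day - timedelta(days=1)
    return (
        "firstpageseq:1 "
        f"date:[{start.isoformat()}T00:00:00Z TO {day.isoformat()}T00:00:00Z]"
    )


def search_params(query: str) -> dict:
    """Request parameters for a newspaper search that only needs the total.

    Assumption: these parameter names match version 2 of the Trove API.
    """
    return {"q": query, "zone": "newspaper", "encoding": "json", "n": 0}


target = years_ago(date(2022, 6, 26))
front_pages = search_params(front_page_query(target))
corrections = search_params("has:corrections")
```

Neither function touches the network; the dictionaries would be sent along with an API key using whichever HTTP client the notebook uses.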
Get creative
  • Make composite images from lots of Trove newspaper thumbnails – creates thumbnails from a search and compiles them into a mega image
  • Create 'scissors and paste' messages from Trove newspaper articles – snip words out of page images and compile them into the message of your choice
  • Create large composite images from snipped words – harvest multiple versions of a list of words and compile them all into one big image
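Compiling many thumbnails into one large image mostly comes down to working out where each tile goes. The sketch below shows one simple way to do that with a fixed-size grid filled row by row; the function names and the assumption that every thumbnail is a square of `tile` pixels are illustrative, and the notebooks may lay images out differently.

```python
import math
from typing import List, Tuple


def grid_positions(count: int, tile: int, columns: int) -> List[Tuple[int, int]]:
    """Return the top-left (x, y) pixel position of each tile, row by row."""
    if count < 0 or tile <= 0 or columns <= 0:
        raise ValueError("count must be >= 0; tile and columns must be > 0")
    return [((i % columns) * tile, (i // columns) * tile) for i in range(count)]


def canvas_size(count: int, tile: int, columns: int) -> Tuple[int, int]:
    """Return the (width, height) of a canvas just large enough for all tiles."""
    if count < 0 or tile <= 0 or columns <= 0:
        raise ValueError("count must be >= 0; tile and columns must be > 0")
    rows = math.ceil(count / columns)
    return (min(count, columns) * tile, rows * tile)


# Example: five 200-pixel thumbnails arranged three to a row.
positions = grid_positions(5, 200, 3)
size = canvas_size(5, 200, 3)
```

With Pillow, a blank canvas of `size` could be created with `Image.new` and each thumbnail pasted at its position with `Image.paste`.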

See the GLAM Workbench for more details.

Data files

Cite as

See the GLAM Workbench or Zenodo for up-to-date citation details.

This repository is part of the GLAM Workbench.
If you think this project is worthwhile, you might like to sponsor me on GitHub.

Files

Files (25.7 MB)

Name: GLAM-Workbench/trove-newspapers-v1.3.4.zip
Size: 25.7 MB
Checksum: md5:3993d49ff5f87456cbbc4b9b2361c7a7

Additional details

Related works

Is derived from
Software: https://github.com/GLAM-Workbench/trove-newspapers/tree/v1.3.4 (URL)
Is documented by
Software documentation: https://glam-workbench.github.io/trove-newspapers/ (URL)
Is part of
Other: https://glam-workbench.github.io/ (URL)