wragge/trove-newspaper-totals-historical: v1.0.0
Description
This repository contains past harvests of the number of digitised newspaper articles available through Trove. These harvests were created between 2011 and 2022:
- 12 April 2011
- 4 August 2011
- 12 September 2014
- 29 November 2015
- 14 December 2016
- 28 July 2019
- 10 July 2020
- 27 April 2021
- 21 January 2022
It's possible I might find additional harvests and add them to the repository in the future.
Since April 2022, datasets have been automatically created every week and saved in this repository.
Dataset details
The `datapackage.json` file contains a description of all the datasets using the Frictionless Data standard.
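As a rough illustration of how the descriptor could be used, the sketch below lists the datasets it describes. It assumes only what the Data Package specification defines, namely a top-level `resources` array whose entries may carry `name` and `path` properties. The function names `load_descriptor` and `list_resources` are hypothetical helpers and not part of this repository.

```python
import json
from pathlib import Path


def load_descriptor(datapackage_path):
    """Read a Data Package descriptor (a JSON file) into a dict."""
    return json.loads(Path(datapackage_path).read_text(encoding="utf-8"))


def list_resources(descriptor):
    """Return (name, path) pairs for each resource in the descriptor.

    Assumes the standard top-level "resources" array; missing
    properties come back as None rather than raising an error.
    """
    return [
        (resource.get("name"), resource.get("path"))
        for resource in descriptor.get("resources", [])
    ]
```

From the root of a local copy of the repository, `list_resources(load_descriptor("datapackage.json"))` should give one pair per dataset, provided the descriptor follows the layout assumed here.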
Datasets are saved in the `data` directory as CSV files. There are two types of harvest – one captures the total number of articles per year, while the other breaks the totals down by year and state. The harvest date is embedded in the file name (in `YYYYMMDD` format).
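Because the harvest date is part of the file name, it can be recovered without opening the file. The following is a minimal sketch that assumes every file name matches one of the two patterns listed in this section (`total_articles_by_year_YYYYMMDD.csv` or `total_articles_by_year_and_state_YYYYMMDD.csv`). The function name `parse_harvest_filename` is hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern: <harvest type>_<YYYYMMDD>.csv
FILENAME_PATTERN = re.compile(
    r"^(total_articles_by_year(?:_and_state)?)_(\d{8})\.csv$"
)


def parse_harvest_filename(filename):
    """Split a dataset file name into (harvest_type, harvest_date)."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unrecognised dataset file name: {filename!r}")
    harvest_type, stamp = match.groups()
    # strptime also rejects impossible dates such as 20211341.
    return harvest_type, datetime.strptime(stamp, "%Y%m%d").date()
```

Sorting files by the parsed date, rather than by name, keeps the two harvest types from interleaving when building a timeline.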
total_articles_by_year_YYYYMMDD.csv
These datasets are saved as CSV files containing the following columns:
- `year`: year of original publication of newspaper article
- `total`: total number of articles from that year available in Trove
total_articles_by_year_and_state_YYYYMMDD.csv
These datasets are saved as CSV files containing the following columns:
- `state`: state in which newspaper article was originally published
- `year`: year of original publication of newspaper article
- `total`: total number of articles from that year and state available in Trove
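To show how these columns might be used, here is a small sketch that reads a by-year-and-state file with Python's standard `csv` module and sums the state totals for each year. It assumes the `year` and `total` values parse as plain integers, which has not been checked against every harvest. The function names are hypothetical.

```python
import csv
from collections import defaultdict


def read_rows(csv_file):
    """Yield one dict per CSV row, keyed by the column headings."""
    return csv.DictReader(csv_file)


def totals_by_year(rows):
    """Sum the `total` column across states for each `year`.

    Assumes `year` and `total` are plain integers in every row.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[int(row["year"])] += int(row["total"])
    # Return an ordinary dict sorted by year for predictable output.
    return dict(sorted(totals.items()))
```

A file would typically be opened with `open(path, newline="", encoding="utf-8")` and passed to `read_rows`. The summed values are not guaranteed to equal the figures in the by-year harvest from the same date.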
Trove uses the following values for `state`:
- ACT
- International
- National
- New South Wales
- Northern Territory
- Queensland
- South Australia
- Tasmania
- Victoria
- Western Australia
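When combining harvests from different dates, it may be worth checking that the `state` column only contains the values listed above. The sketch below does that. The names `KNOWN_STATES` and `unexpected_states` are hypothetical, and the set simply copies the list in this section.

```python
# Values copied from the list of `state` values above.
KNOWN_STATES = frozenset({
    "ACT",
    "International",
    "National",
    "New South Wales",
    "Northern Territory",
    "Queensland",
    "South Australia",
    "Tasmania",
    "Victoria",
    "Western Australia",
})


def unexpected_states(rows):
    """Return a sorted list of `state` values that are not in KNOWN_STATES."""
    return sorted({row["state"] for row in rows} - KNOWN_STATES)
```

An empty result means every row uses one of the listed values. Anything else points to a spelling variant that needs attention before the datasets are merged.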
Method
The method for harvesting this data has changed over time. Harvests from 2011 were screen scraped from the Trove website. Harvests after 2012 make use of the `year` and `state` facets from the Trove API. The data was stored in a variety of locations, such as this archived page, my Plotly account, and the Trove Newspapers GLAM Workbench repository. To create this repository, I've retrieved the harvested data from these locations and converted the datasets to CSV files. Column headings have been normalised, but none of the values have been changed.
For current examples of harvesting this sort of data, see "Visualise the total number of newspaper articles in Trove by year and state" in the GLAM Workbench.
Files
wragge/trove-newspaper-totals-historical-v1.0.0.zip (93.1 kB)
md5:a6b6c7163f805256c3d986fbc1f3ce69
Additional details
Related works
- Is supplement to: https://github.com/wragge/trove-newspaper-totals-historical/tree/v1.0.0 (URL)