Dataset Open Access

Public tags added to resources in Trove, 2008 to 2022

Sherratt, Tim

This dataset contains details of 2,201,090 unique public tags added to 9,370,614 resources in Trove between August 2008 and July 2022. I harvested the data using the Trove API and saved it as a CSV file with the following columns:

  • `tag` – lower-cased text tag
  • `date` – date the tag was added
  • `zone` – API zone containing the tagged resource
  • `record_id` – the identifier of the tagged resource
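
As a minimal sketch of working with these columns, the snippet below reads rows with Python's standard `csv` module and counts tags per zone. It assumes the CSV has a header row using the four column names listed above. The sample rows are invented for illustration, and the date format shown is an assumption rather than something taken from the dataset.

```python
import csv
import io
from collections import Counter


def read_tags(handle):
    """Yield one dict per row, keeping only the four documented columns."""
    reader = csv.DictReader(handle)
    for row in reader:
        yield {
            "tag": row["tag"],
            "date": row["date"],  # kept as text; the exact format is not assumed here
            "zone": row["zone"],
            "record_id": row["record_id"],
        }


# Invented sample rows standing in for the real CSV file.
sample = io.StringIO(
    "tag,date,zone,record_id\n"
    "bushrangers,2010-03-01,newspaper,1234\n"
    "bushrangers,2011-06-15,book,5678\n"
    "family history,2012-01-20,newspaper,9012\n"
)

rows = list(read_tags(sample))
zone_counts = Counter(row["zone"] for row in rows)
print(zone_counts.most_common())
```

To run this against the real data, replace `sample` with a file handle opened on the extracted CSV, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever you unzipped the file.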

I've documented the method used to harvest the tags in this notebook.

Using the `zone` and `record_id` you can find more information about a tagged item. To create URLs to the resources in Trove:

  • for resources in the 'book', 'article', 'picture', 'music', 'map', and 'collection' zones add the `record_id` to `https://trove.nla.gov.au/work/`
  • for resources in the 'newspaper' and 'gazette' zones add the `record_id` to `https://trove.nla.gov.au/article/`
  • for resources in the 'list' zone add the `record_id` to `https://trove.nla.gov.au/list/`
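
The rules above can be sketched as a small Python function. The function name `trove_url` and the constant names are illustrative choices, not part of the dataset or the Trove API. The zone names and base URLs are the ones listed above.

```python
# Zones whose record_id is a Trove work identifier.
# Note that the 'article' zone belongs here, not with the newspaper zones.
WORK_ZONES = {"book", "article", "picture", "music", "map", "collection"}

# Zones whose record_id is a newspaper or gazette article identifier.
ARTICLE_ZONES = {"newspaper", "gazette"}


def trove_url(zone, record_id):
    """Build a Trove web URL from a row's zone and record_id."""
    if zone in WORK_ZONES:
        return f"https://trove.nla.gov.au/work/{record_id}"
    if zone in ARTICLE_ZONES:
        return f"https://trove.nla.gov.au/article/{record_id}"
    if zone == "list":
        return f"https://trove.nla.gov.au/list/{record_id}"
    raise ValueError(f"Unknown zone: {zone!r}")


# The identifiers below are invented placeholders.
print(trove_url("book", "12345"))
print(trove_url("newspaper", "67890"))
print(trove_url("list", "42"))
```

Raising an error for an unrecognised zone is a design choice for this sketch, so that unexpected values in the data are noticed rather than silently mapped to a wrong URL.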

Notes:

  • Works (such as books) in Trove can have tags attached at either work or version level. This dataset aggregates all tags at the work level, removing any duplicates.
  • A single resource in Trove can appear in multiple zones – for example, a book that includes maps and illustrations might appear in the 'book', 'picture', and 'map' zones. This means that some of the tags will essentially be duplicates – harvested from different zones, but relating to the same resource. Depending on your needs, you might want to remove these duplicates.
  • While most of the tags were added by Trove users, more than 500,000 tags were added by Trove itself in November 2009. I think these tags were automatically generated from related Wikipedia pages. Depending on your needs, you might want to exclude these by limiting the date range or zones.
  • User content added to Trove, including tags, is available for reuse under a CC-BY-NC licence.
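
The two optional clean-up steps mentioned in the notes, removing cross-zone duplicates and excluding the November 2009 tags, might look something like the following. This is only a sketch under stated assumptions: it assumes a work keeps the same `record_id` in every zone it appears in, and that `date` values are ISO-style strings beginning with `YYYY-MM`. Neither assumption is confirmed by the description above, so check the data before relying on this. The function names and sample rows are invented.

```python
# Zones that share the work identifier namespace (as listed for the /work/ URLs).
WORK_ZONES = {"book", "article", "picture", "music", "map", "collection"}


def drop_cross_zone_duplicates(rows):
    """Keep the first row for each (tag, record_id) pair across the work zones.

    Rows from other zones (newspaper, gazette, list) pass through unchanged,
    because their identifiers are not work identifiers.
    """
    seen = set()
    for row in rows:
        if row["zone"] in WORK_ZONES:
            key = (row["tag"], row["record_id"])
            if key in seen:
                continue
            seen.add(key)
        yield row


def exclude_november_2009(rows):
    """Drop every tag dated November 2009 (assumes dates start with YYYY-MM)."""
    for row in rows:
        if not row["date"].startswith("2009-11"):
            yield row


# Invented sample rows.
rows = [
    {"tag": "maps", "date": "2012-05-01", "zone": "book", "record_id": "100"},
    {"tag": "maps", "date": "2012-05-01", "zone": "map", "record_id": "100"},
    {"tag": "maps", "date": "2012-05-01", "zone": "newspaper", "record_id": "100"},
    {"tag": "sydney", "date": "2009-11-20", "zone": "book", "record_id": "200"},
]

deduped = list(drop_cross_zone_duplicates(rows))
filtered = list(exclude_november_2009(deduped))
print(len(deduped), len(filtered))
```

Filtering on the whole month is a blunt instrument: it also discards any tags that users added in November 2009. Combining the date filter with a zone filter may be a better fit, depending on your needs.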

See this notebook for some examples of how you can manipulate, analyse, and visualise the tag data.

Files (139.4 MB)

  • `trove_tags_20220706.zip` (139.4 MB), md5: `d0aeea9d423794ef253defd6b830f2bd`