Published June 6, 2024 | Version v1.2
Dataset
Open
Public tags added to resources in Trove, 2008 to 2024
Creators
Description
This dataset contains details of 2,495,958 unique public tags added to 10,403,650 resources in Trove between August 2008 and June 2024. I harvested the data using the Trove API and saved it as a CSV file with the following columns:
- `tag` – lower-cased text tag
- `date` – date the tag was added
- `zone` – API zone containing the tagged resource
- `record_id` – the identifier of the tagged resource
I've documented the method used to harvest the tags in this notebook.
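As a minimal sketch of working with the file, the following reads the rows straight from the downloaded zip using only the Python standard library. It assumes the archive contains a single UTF-8 CSV with a header row naming the four columns above; the name of the CSV inside the zip is not stated here, so the code simply takes the first `.csv` entry it finds. The function names `iter_tag_rows` and `count_by_zone` are illustrative only.

```python
import csv
import io
import zipfile
from collections import Counter


def iter_tag_rows(zip_path):
    """Yield one dict per tag record from the first CSV found in the zip.

    Assumption: the CSV is UTF-8 encoded and has a header row with the
    columns tag, date, zone, record_id.
    """
    with zipfile.ZipFile(zip_path) as zf:
        csv_names = [n for n in zf.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        with zf.open(csv_names[0]) as raw:
            # Wrap the binary stream so csv can read it as text, row by row.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)


def count_by_zone(rows):
    """Count how many tag records fall in each zone."""
    return Counter(row["zone"] for row in rows)
```

For example, `count_by_zone(iter_tag_rows("trove_tags_20240606.zip"))` would tally the records per zone without loading the whole file into memory at once.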
Using the `zone` and `record_id` you can find more information about a tagged item. To create URLs to the resources in Trove:
- for resources in the 'book', 'article', 'picture', 'music', 'map', and 'collection' zones append the `record_id` to `https://trove.nla.gov.au/work/`
- for resources in the 'newspaper' and 'gazette' zones append the `record_id` to `https://trove.nla.gov.au/article/`
- for resources in the 'list' zone append the `record_id` to `https://trove.nla.gov.au/list/`
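The URL rules above can be sketched as a small lookup function. The name `trove_url` is hypothetical, and raising an error for an unrecognised zone is a design choice of this sketch, not something the dataset prescribes.

```python
# Zones whose record_id is a work identifier.
WORK_ZONES = {"book", "article", "picture", "music", "map", "collection"}
# Zones whose record_id is an article identifier.
ARTICLE_ZONES = {"newspaper", "gazette"}


def trove_url(zone, record_id):
    """Build a Trove web URL from a row's zone and record_id."""
    if zone in WORK_ZONES:
        base = "https://trove.nla.gov.au/work/"
    elif zone in ARTICLE_ZONES:
        base = "https://trove.nla.gov.au/article/"
    elif zone == "list":
        base = "https://trove.nla.gov.au/list/"
    else:
        raise ValueError(f"unrecognised zone: {zone!r}")
    return f"{base}{record_id}"
```

Note that the 'article' zone maps to a `/work/` URL, while newspaper and gazette articles use `/article/`.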
Notes:
- Works (such as books) in Trove can have tags attached at either work or version level. This dataset aggregates all tags at the work level, removing any duplicates.
- A single resource in Trove can appear in multiple zones – for example, a book that includes maps and illustrations might appear in the 'book', 'picture', and 'map' zones. This means that some of the tags will essentially be duplicates – harvested from different zones, but relating to the same resource. Depending on your needs, you might want to remove these duplicates.
- While most of the tags were added by Trove users, more than 500,000 tags were added by Trove itself in November 2009. I think these tags were automatically generated from related Wikipedia pages. Depending on your needs, you might want to exclude these by limiting the date range or zones.
- User content added to Trove, including tags, is available for reuse under a CC-BY-NC licence.
See this notebook for some examples of how you can manipulate, analyse, and visualise the tag data.
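As a rough illustration of the two clean-up steps mentioned in the notes, the sketch below drops cross-zone duplicates and excludes tags added in November 2009. It makes several assumptions that should be checked against the data: that rows are dicts keyed by the column names, that `date` values are ISO-style strings beginning with `YYYY-MM`, and that two rows refer to the same resource when they share a `record_id` within the same URL family (work, article, or list). The function names are illustrative, and filtering by month will also remove any user-added tags from that month.

```python
# Map each zone to the identifier family used in its Trove URL, so that
# work 123 and newspaper article 123 are not treated as the same resource.
ID_NAMESPACE = {
    "book": "work", "article": "work", "picture": "work",
    "music": "work", "map": "work", "collection": "work",
    "newspaper": "article", "gazette": "article",
    "list": "list",
}


def drop_cross_zone_duplicates(rows):
    """Yield the first row seen for each (tag, resource) pair."""
    seen = set()
    for row in rows:
        namespace = ID_NAMESPACE.get(row["zone"], row["zone"])
        key = (row["tag"], namespace, row["record_id"])
        if key not in seen:
            seen.add(key)
            yield row


def exclude_month(rows, month="2009-11"):
    """Skip rows whose date starts with the given YYYY-MM prefix.

    Assumption: dates are ISO-style strings such as 2009-11-05...
    """
    return (row for row in rows if not row["date"].startswith(month))
```

The two generators can be chained, for example `drop_cross_zone_duplicates(exclude_month(rows))`, so the data is still processed one row at a time.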
Files
Name | Size | Checksum |
---|---|---|
trove_tags_20240606.zip | 154.1 MB | md5:721e07660e9d56b8bff96411d6a41aab |
Additional details
Related works
- Is compiled by
- Software: https://github.com/GLAM-Workbench/trove-lists (URL)
- Is documented by
- Software documentation: https://glam-workbench.net/trove-lists/ (URL)