GLAM-Workbench/trove-web-archives
Creators
Description
CURRENT VERSION: v1.0.0
This repository includes information on finding, understanding, and using Pandora's collections of archived web pages.
Pandora has been selecting websites and online resources for preservation since 1996. It has assembled a collection of more than 80,000 titles, organised into subjects and collections. The archived websites are now part of the Australian Web Archive (AWA), which combines the selected titles with broader domain harvests and is searchable through Trove. However, Pandora's curated collections remain a useful entry point for researchers trying to find websites relating to particular topics or events.
The Web Archives section of the GLAM Workbench provides documentation, tools, and examples to help you work with data from a range of web archives, including the Australian Web Archive. The title URLs obtained through Pandora can be used to retrieve additional data from the AWA for analysis.
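As a rough illustration of how title URLs might be prepared for further AWA queries, the sketch below groups a few titles by domain and builds a capture-style address for each URL. The sample rows, the `title` and `url` keys, the helper names, and the `{base}/{timestamp}/{url}` address pattern are all assumptions made for this example; they are not taken from the repository's notebooks, so check the GLAM Workbench documentation for the actual dataset fields and endpoints.

```python
from urllib.parse import urlsplit

# Hypothetical sample rows; the real title datasets may use different field names.
titles = [
    {"title": "Example title A", "url": "http://www.example.org.au/"},
    {"title": "Example title B", "url": "https://example.org.au/news/"},
    {"title": "Example title C", "url": "http://another.example.com.au"},
]


def domain_of(url):
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def group_by_domain(rows):
    """Group title names by domain so related titles can be handled together."""
    groups = {}
    for row in rows:
        groups.setdefault(domain_of(row["url"]), []).append(row["title"])
    return groups


def capture_url(base, url, timestamp):
    """Build a '{base}/{timestamp}/{url}' address (assumed pattern, not confirmed here)."""
    return f"{base.rstrip('/')}/{timestamp}/{url}"


if __name__ == "__main__":
    # The base address below is an assumption used only for illustration.
    base = "https://webarchive.nla.gov.au/awa"
    for domain, names in group_by_domain(titles).items():
        print(domain, names)
    for row in titles:
        print(capture_url(base, row["url"], "20100101000000"))
```

Grouping by domain first is only one possible design choice; it makes it easier to see when several Pandora titles point at the same site before requesting anything from the archive.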
For more information and documentation, see the Trove web archive collections (Pandora) section of the GLAM Workbench.
Notebooks
- Create title datasets from collections and subjects
- Harvest Pandora subjects and collections
- Harvest the full collection of Pandora titles
Associated datasets
Created by Tim Sherratt for the GLAM Workbench
Files
Name | Size |
---|---|
GLAM-Workbench/trove-web-archives-v1.0.0.zip (md5:4f8d11853e28897f2dabde9278c76b68) | 60.7 kB |
Additional details
Related works
- Is derived from
- Software: https://github.com/GLAM-Workbench/trove-web-archives/tree/v1.0.0 (URL)
- Is documented by
- Software documentation: https://glam-workbench.net/trove-web-archives/ (URL)
- Is part of
- Other: https://glam-workbench.net/ (URL)