Pooch v1.0.0: A friend to fetch your data files
Authors/Creators
- 1. Polar Science Center, University of Washington Applied Physics Lab, USA
- 2. Department of Earth, Ocean and Ecological Sciences, School of Environmental Sciences, University of Liverpool, UK
Description
Does your Python package include sample datasets? Are you shipping them with the code? Are they getting too big?
Pooch is here to help! It will manage a data registry by downloading your data files from a server only when needed and storing them locally in a data cache (a folder on your computer).
Here are Pooch's main features:
- Pure Python and minimal dependencies.
- Download a file only if necessary (it's not in the data cache or needs to be updated).
- Verify download integrity through SHA256 hashes (also used to check if a file needs to be updated).
- Designed to be extended: plug in custom download functions (FTP, scp, etc.) and post-processing functions (unzip, decompress, rename).
- Includes utilities to unzip/decompress the data upon download to save loading time.
- Can handle basic HTTP authentication (for servers that require a login) and print download progress bars.
- Easily set up an environment variable to override the data cache location.
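To make the "download only if necessary" and hash-verification features concrete, here is a minimal standard-library sketch of that logic. It is not Pooch's implementation or API: the names `sha256_of`, `cache_dir`, and `fetch`, along with their parameters, are hypothetical and chosen only for illustration. The downloader is any callable you supply, which mirrors the idea of plugging in custom download functions.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path):
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_dir(default, env=None):
    """Pick the cache folder; a set environment variable overrides the default."""
    if env and os.environ.get(env):
        return Path(os.environ[env])
    return Path(default)


def fetch(name, registry, cache, downloader):
    """Return the local path to `name`, downloading only when needed.

    `registry` maps file names to expected SHA256 digests (hypothetical layout).
    `downloader` is any callable taking (name, destination_path).
    """
    cache = Path(cache)
    cache.mkdir(parents=True, exist_ok=True)
    target = cache / name
    expected = registry[name]
    # Download if the file is missing or its hash no longer matches the registry.
    if not target.exists() or sha256_of(target) != expected:
        downloader(name, target)
        # Verify the fresh download before handing it to the caller.
        if sha256_of(target) != expected:
            target.unlink()
            raise ValueError(f"SHA256 mismatch for {name}")
    return target
```

Because the hash check doubles as the "needs update" test, changing a digest in the registry is enough to trigger a fresh download on the next call.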
Are you a scientist or researcher? Pooch can help you too!
- Automatically download your data files so you don't have to keep them in your GitHub repository.
- Make sure everyone running the code has the same version of the data files (enforced through the SHA256 hashes).
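Pinning data versions through SHA256 hashes requires a list of expected digests to compare against. The sketch below shows one way such a list could be generated for a folder of data files using only the standard library. The function names `build_registry` and `write_registry`, and the one-pair-per-line text layout, are assumptions made for this illustration rather than a description of Pooch's own tooling.

```python
import hashlib
from pathlib import Path


def build_registry(folder):
    """Map each file under `folder` (relative POSIX path) to its SHA256 digest."""
    folder = Path(folder)
    registry = {}
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            registry[path.relative_to(folder).as_posix()] = digest
    return registry


def write_registry(registry, output):
    """Write one 'name digest' pair per line (assumed plain-text layout)."""
    with open(output, "w") as handle:
        for name, digest in registry.items():
            handle.write(f"{name} {digest}\n")
```

Committing the resulting text file alongside the code, instead of the data itself, lets collaborators confirm they are working with byte-identical files.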
Pooch v0.7.1 was reviewed at the Journal of Open Source Software: https://github.com/openjournals/joss-reviews/issues/1943
Documentation: https://www.fatiando.org/pooch
Source code: https://github.com/fatiando/pooch
Part of the Fatiando a Terra project.
Files
| Name | Size | Checksum |
|---|---|---|
| pooch-1.0.0.zip | 215.1 kB | md5:d15aae78da5b84eeef8fc0f670ef51af |