There is a newer version of the record available.

Published April 13, 2020 | Version 1.1.0
Software Open

Pooch v1.1.0: A friend to fetch your data files

  • 1. Department of Earth, Ocean and Ecological Sciences, School of Environmental Sciences, University of Liverpool, UK
  • 2. CONICET, Argentina; Instituto Geofísico Sismológico Volponi, Universidad Nacional de San Juan, Argentina
  • 3. New York University, USA
  • 4. Independent (Non-affiliated)
  • 5. Polar Science Center, University of Washington Applied Physics Lab, USA
  • 6. The US National Center for Atmospheric Research, USA
  • 7. University of Illinois at Urbana-Champaign, USA
  • 8. National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, USA

Description

Does your Python package include sample datasets? Are you shipping them with the code? Are they getting too big?

Pooch is here to help! It will manage a data registry by downloading your data files from a server only when needed and storing them locally in a data cache (a folder on your computer).

Here are Pooch's main features:

  • Pure Python and minimal dependencies.
  • Download a file only if necessary (it's not in the data cache or needs to be updated).
  • Verify download integrity through SHA256 hashes (also used to check if a file needs to be updated).
  • Designed to be extended: plug in custom download (FTP, scp, etc.) and post-processing (unzip, decompress, rename) functions.
  • Includes utilities to unzip/decompress the data upon download to save loading time.
  • Can handle basic HTTP authentication (for servers that require a login) and printing download progress bars.
  • Easily set up an environment variable to override the data cache location.
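
The download-only-when-needed behaviour in the list above can be sketched with the standard library alone. This is an illustration of the idea, not Pooch's implementation or API: `sha256_of`, `cache_dir`, `fetch`, the `download` callable, and the `DEMO_DATA_DIR` environment variable are all hypothetical names chosen for this sketch.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path):
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_dir(default, env="DEMO_DATA_DIR"):
    """Use the folder named by the environment variable if it is set."""
    # DEMO_DATA_DIR is a made-up variable name for this sketch.
    return Path(os.environ.get(env, default))


def fetch(name, known_hash, download, cache):
    """Return the local path of `name`, downloading only when necessary."""
    cache = Path(cache)
    target = cache / name
    # Skip the download when a cached copy with the expected hash exists.
    if target.exists() and sha256_of(target) == known_hash:
        return target
    cache.mkdir(parents=True, exist_ok=True)
    # `download` is any callable taking (name, target): HTTP, FTP, scp, ...
    download(name, target)
    # Verify integrity of what was just downloaded.
    if sha256_of(target) != known_hash:
        raise ValueError(f"SHA256 mismatch for {name}")
    return target
```

Passing the downloader in as a plain callable keeps the cache and hash logic independent of the transport, which is one way to get the kind of extensibility the feature list describes.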

Are you a scientist or researcher? Pooch can help you too!

  • Automatically download your data files so you don't have to keep them in your GitHub repository.
  • Make sure everyone running the code has the same version of the data files (enforced through the SHA256 hashes).
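
As a rough sketch of how SHA256 hashes can pin the exact version of a set of data files, the snippet below builds a mapping from file names to hashes and reports files that no longer match it. It uses only the standard library; `build_registry` and `find_mismatches` are hypothetical helper names for this illustration, not functions from Pooch.

```python
import hashlib
from pathlib import Path


def build_registry(data_dir):
    """Map each file's relative path to the SHA256 hex digest of its contents."""
    data_dir = Path(data_dir)
    return {
        path.relative_to(data_dir).as_posix(): hashlib.sha256(
            path.read_bytes()
        ).hexdigest()
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def find_mismatches(data_dir, registry):
    """Return names that are missing or whose contents differ from the registry."""
    current = build_registry(data_dir)
    return sorted(
        name for name, digest in registry.items() if current.get(name) != digest
    )
```

If the registry is published alongside the code, anyone running it can check that their local copies match the files the authors used.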

Pooch v0.7.1 was reviewed at the Journal of Open Source Software: https://github.com/openjournals/joss-reviews/issues/1943

Documentation: https://www.fatiando.org/pooch

Source code: https://github.com/fatiando/pooch

Part of the Fatiando a Terra project.

Files

  • pooch-1.1.0.zip (225.8 kB), md5:a56366075113241b8f09e73b34cfe900