There is a newer version of the record available.

Published June 8, 2021 | Version 1.4.0
Software Open

Pooch v1.4.0: A friend to fetch your data files

  • 1. Department of Earth, Ocean and Ecological Sciences, School of Environmental Sciences, University of Liverpool, UK
  • 2. CONICET, Argentina; Instituto Geofísico Sismológico Volponi, Universidad Nacional de San Juan, Argentina
  • 3. New York University, USA
  • 4. Independent (Non-affiliated)
  • 5. Polar Science Center, University of Washington Applied Physics Lab, USA
  • 6. The US National Center for Atmospheric Research, USA
  • 7. University of Illinois at Urbana-Champaign, USA
  • 8. National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, USA
  • 9. Institute for Atmospheric and Climate Science, ETH Zurich, Zurich, Switzerland
  • 10. Environmental Physics, ETH Zurich, Zurich, Switzerland
  • 11. EMBL-EBI, UK
  • 12. University of Washington, USA
  • 13. Institut de Planétologie et d'Astrophysique de Grenoble, France

Description

Does your Python package include sample datasets? Are you shipping them with the code? Are they getting too big?

Pooch is here to help! It will manage a data registry by downloading your data files from a server only when needed and storing them locally in a data cache (a folder on your computer).

Here are Pooch's main features:

  • Pure Python and minimal dependencies.
  • Download a file only if necessary (it's not in the data cache or needs to be updated).
  • Verify download integrity through SHA256 hashes (also used to check if a file needs to be updated).
  • Designed to be extended: plug in custom download (FTP, scp, etc.) and post-processing (unzip, decompress, rename) functions.
  • Includes utilities to unzip/decompress the data upon download to save loading time.
  • Can handle basic HTTP authentication (for servers that require a login) and printing download progress bars.
  • Easily set up an environment variable to override the data cache location.
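
The download-only-when-needed behaviour listed above can be illustrated with a minimal standard-library sketch. This is not Pooch's implementation: the names `sha256_of`, `cache_dir`, `fetch`, and `fake_downloader`, the `example.org` URL, and the tiny CSV payload are all hypothetical, and the downloader is a stand-in that writes fixed bytes so the example runs without network access.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_dir(default, env=None):
    """Pick the cache folder; a non-empty environment variable wins."""
    if env and os.environ.get(env):
        return Path(os.environ[env])
    return Path(default)


def fetch(name, registry, base_url, cache, downloader):
    """Return the local path to `name`, downloading only when needed."""
    expected = registry[name]
    target = Path(cache) / name
    # Download when the file is missing or its hash differs from the registry.
    if not target.exists() or sha256_of(target) != expected:
        target.parent.mkdir(parents=True, exist_ok=True)
        downloader(base_url + name, target)
        # Verify the integrity of what was just downloaded.
        actual = sha256_of(target)
        if actual != expected:
            target.unlink()
            raise ValueError(f"SHA256 mismatch for {name}: got {actual}")
    return target


# Hypothetical demo data: the registry maps file names to expected hashes.
PAYLOAD = b"x,y\n1,2\n"
REGISTRY = {"tiny.csv": hashlib.sha256(PAYLOAD).hexdigest()}
calls = []


def fake_downloader(url, target):
    """Stand-in for a real HTTP/FTP download: records the URL, writes bytes."""
    calls.append(url)
    Path(target).write_bytes(PAYLOAD)


with tempfile.TemporaryDirectory() as tmp:
    base = "https://example.org/data/"
    first = fetch("tiny.csv", REGISTRY, base, tmp, fake_downloader)
    second = fetch("tiny.csv", REGISTRY, base, tmp, fake_downloader)
    downloads = len(calls)  # the second call reuses the cached file
```

Passing the downloader as an argument is what lets a different transport (FTP, scp) be swapped in without touching the caching logic. In Pooch itself, the corresponding entry points are `pooch.create`, which takes the cache path, base URL, registry, and an optional `env` variable name, and the `fetch` method of the object it returns; the documentation linked below has the exact signatures.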

Are you a scientist or researcher? Pooch can help you too!

  • Automatically download your data files so you don't have to keep them in your GitHub repository.
  • Make sure everyone running the code has the same version of the data files (enforced through the SHA256 hashes).

Pooch v0.7.1 was reviewed at the Journal of Open Source Software: https://github.com/openjournals/joss-reviews/issues/1943

Documentation: https://www.fatiando.org/pooch

Source code: https://github.com/fatiando/pooch

Part of the Fatiando a Terra project.

Files

pooch-1.4.0.zip (230.4 kB)

md5:29a3eaeb3916a2a3d546339cada9b49c