Published July 9, 2020 | Version v1
Presentation Open

Sharing Reproducible Python Environments with Binder

Authors/Creators

  • 1. The Alan Turing Institute

Description

30-minute talk on Project Binder and mybinder.org at the EuroPython 2020 conference.

Abstract (long)

As reproducibility gains traction in the data science and research communities, the need to package code, data and the computational environment is growing.

There are many tools that address different aspects of this type of packaging, such as Jupyter Notebooks for literate programming and Docker for containerising and porting computational environments. However, each tool takes time and effort to learn, which is itself a barrier to reproducibility.

Project Binder integrates Notebooks and Docker for generating reproducible computational analyses and combines them with a web-based interface and cloud orchestration engines. This means that analysts do not have to worry about all the moving parts so long as they have followed basic software best practices: their code is version controlled and they've captured the dependencies the analysis needs to run. Binder then hosts the compute in the cloud and makes it easily shareable by providing a unique URL to the code repository, without imposing additional overheads on the analyst.
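As a rough illustration of the "unique URL" idea, the sketch below builds a launch link for a public GitHub repository on mybinder.org. It assumes the commonly used `/v2/gh/<org>/<repo>/<ref>` URL pattern; the function name `binder_launch_url` and the repository `example-org/example-analysis` are hypothetical and not taken from the talk.

```python
from urllib.parse import quote


def binder_launch_url(org: str, repo: str, ref: str = "HEAD",
                      host: str = "https://mybinder.org") -> str:
    """Build a Binder launch URL for a public GitHub repository.

    Assumes the /v2/gh/<org>/<repo>/<ref> pattern used by mybinder.org.
    """
    # Percent-encode each part so a ref such as "feature/plots"
    # stays a single path segment in the final URL.
    parts = [quote(part, safe="") for part in (org, repo, ref)]
    return f"{host}/v2/gh/" + "/".join(parts)


# Hypothetical repository, used only for illustration.
print(binder_launch_url("example-org", "example-analysis"))
# → https://mybinder.org/v2/gh/example-org/example-analysis/HEAD
```

Anyone opening such a link should get a running copy of the repository, provided the repository is public and declares its dependencies in a file Binder recognises, such as `requirements.txt` or `environment.yml`.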

During this talk, Sarah will introduce Binder (the service), BinderHub (the technological infrastructure) and mybinder.org (a public instance of a Binder service, free for anyone to use) and demonstrate how they can be used to share Python environments and analyses.

Abstract (short)

Packaging code, data and computational environments into reproducible analyses takes extra time and effort, because each of the tools involved has to be learned. Project Binder provides open-source tooling that executes reproducible analyses in the cloud and enables them to be shared via a single, unique URL.

In this talk, Sarah will introduce Binder, BinderHub and mybinder.org and demonstrate how they can be used to share Python environments and analyses.

Prerequisites for attending the session

Familiarity with version control and Jupyter Notebooks, and an interest in reproducibility.

Python Skill Level

Beginner

Domain Expertise

Beginner

Tags

  • Best Practice
  • Data Science
  • Jupyter
  • Open-Source
  • Public Cloud (AWS/Google/...)

Files

SGibson_EuroPython2020_Binder.pdf

Files (72.7 MB)

Checksum Size
md5:8923f8e798bd6622ff9c93037ace1ddd 48.5 MB
md5:def98893a5e4a59133c873ad139a52c1 24.2 MB