JupyDo: an automated server manager and JupyterHub infrastructure for reproducible bioinformatics

Micocci, Francesco Maria Antonio; Nuvolari, Beatrice; Margaglione, Salvatore; Lentini, Antonio; Borgna, Maurizio; Arigoni, Maddalena; Tao, Jianli; Alessandrì, Luca

doi:10.5281/zenodo.20763335

Published June 19, 2026 | Version v1

Poster Open

1. University of Turin
2. Harvard Medical School
3. Boston Children's Hospital

Motivation

Reproducibility in bioinformatics is often compromised by conflicts between software installations, incompatible dependencies, and the lack of standardized computational environments [1].

Methods

JupyDo addresses this challenge with an accessible, multi-user infrastructure based on JupyterHub [2]. A guided, fully automated installation process simplifies deployment, so researchers can work in independent computational spaces without complex manual configuration.

A core feature of JupyDo is its flexibility: users can start from any custom Docker [3] image. When a custom image is selected, JupyDo automatically builds and adapts it to run within the JupyterLab environment. It does so by deploying a separate, isolated Python installation dedicated exclusively to running the Jupyter infrastructure, so the original Python environment and dependencies of the base image remain untouched. During this automated setup, JupyDo also scans the image for existing Python or R virtual environments (e.g., Conda) and registers them as ready-to-use kernels in JupyterLab.

For broad hardware compatibility, JupyDo maintains a dual-mode strategy that supports deployments on both AMD64/x86-64 and ARM nodes. JupyDo also integrates a dedicated service for creating and sharing genomic indices among all users, which prevents data duplication and saves both storage and computational resources.
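The kernel registration step described above can be illustrated with a minimal sketch. It is not JupyDo's actual implementation, which the abstract does not detail. It assumes Conda environments live as subdirectories of a single root, that each one exposes `bin/python` and a `conda-meta` folder, and that `ipykernel` is already installed inside each environment. The function names (`find_conda_envs`, `kernel_spec_for`, `write_kernel_specs`) and the `env-` prefix are hypothetical; only the `kernel.json` layout follows the standard Jupyter kernel spec format.

```python
import json
from pathlib import Path


def find_conda_envs(envs_root: Path) -> list:
    """Return subdirectories of envs_root that look like Conda envs with Python."""
    found = []
    if not envs_root.is_dir():
        return found
    for env_dir in sorted(envs_root.iterdir()):
        # Heuristic (assumption): a Conda env has a conda-meta folder and bin/python.
        has_meta = (env_dir / "conda-meta").is_dir()
        has_python = (env_dir / "bin" / "python").exists()
        if has_meta and has_python:
            found.append(env_dir)
    return found


def kernel_spec_for(env_dir: Path) -> dict:
    """Build a Jupyter kernel spec that launches ipykernel with the env's own Python."""
    return {
        "argv": [
            str(env_dir / "bin" / "python"),
            "-m",
            "ipykernel_launcher",
            "-f",
            "{connection_file}",
        ],
        "display_name": f"Python ({env_dir.name})",
        "language": "python",
    }


def write_kernel_specs(envs_root: Path, kernels_dir: Path) -> list:
    """Write one kernel.json per discovered env and return the kernel names."""
    names = []
    for env_dir in find_conda_envs(envs_root):
        # Hypothetical naming scheme; kernel names should stay alphanumeric plus - . _
        spec_dir = kernels_dir / f"env-{env_dir.name}"
        spec_dir.mkdir(parents=True, exist_ok=True)
        spec = kernel_spec_for(env_dir)
        (spec_dir / "kernel.json").write_text(json.dumps(spec, indent=2))
        names.append(spec_dir.name)
    return names
```

Because each kernel spec points at the environment's own interpreter, the Jupyter server itself can run from a different Python installation, which matches the separation between infrastructure and user environments described in the text.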

Results

JupyDo allows customized environments to be safely preserved, reviewed, and shared. To support dynamic workflows, we implemented a supervised docker commit feature: users can install new tools within their container and submit a commit request, which an administrator reviews and approves, balancing flexibility with security. In addition, an integrated "Export Environment" tool streamlines the publication process. With a single action, researchers can export their entire workspace as a tar.gz archive, retrieve the exact Dockerfile, and automatically generate a preliminary draft of the "Materials and Methods" section detailing the software environment. With these adaptation and export mechanisms, JupyDo offers a scalable, platform-agnostic way to keep computational results verifiable, transparent, and reusable in modern life science research.
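The supervised commit workflow can be pictured as a small request object that only yields a commit command once an administrator has approved it. The sketch below is an illustration under assumptions, not JupyDo's code: the `CommitRequest` class, its fields, and the `Status` values are hypothetical, and the command is only built as an argument list rather than executed. The `docker commit` options used (`-a` for author, `-m` for message) are standard Docker CLI flags.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class CommitRequest:
    """A user's request to snapshot a running container (hypothetical model)."""
    user: str
    container: str
    target_image: str
    message: str
    status: Status = Status.PENDING
    reviewer: Optional[str] = None

    def review(self, reviewer: str, approve: bool) -> None:
        """Record an administrator's decision; each request is reviewed once."""
        if self.status is not Status.PENDING:
            raise ValueError("request has already been reviewed")
        self.reviewer = reviewer
        self.status = Status.APPROVED if approve else Status.REJECTED

    def docker_argv(self) -> List[str]:
        """Return the docker commit command, but only for approved requests."""
        if self.status is not Status.APPROVED:
            raise PermissionError("commit requires administrator approval")
        return [
            "docker", "commit",
            "-a", self.user,
            "-m", self.message,
            self.container,
            self.target_image,
        ]
```

A request created by a user stays pending until `review` is called; calling `docker_argv` before approval raises `PermissionError`, so the snapshot cannot be created without the administrator's decision being recorded first.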

Notes

Financial support for event participation was provided by the Open Bioinformatics Foundation (OBF) under the OBF Event Fellowships program (Round 1 2026). Official program details: https://www.open-bio.org/event-awards/

Files

Poster_BITS.pdf (660.9 kB, md5:abb5735b8ed74666830053f44b1af31e)

Additional details

Repository URL: https://github.com/Vehx35/JupyDo
Programming language: Python, Dockerfile, JavaScript
Development Status: Active

References

Errington TM, Denis A, Perfito N, Iorns E, Nosek BA. Challenges for assessing replicability in preclinical cancer biology. eLife. 2021;10.
D'Onofrio A, et al. FairFlow: A Transparency-First Framework for Verifiable and Reproducible Bioinformatics. SSRN Electronic Journal. 2026. doi:10.2139/ssrn.6339959.
Merkel D. Docker: lightweight Linux containers for consistent development and deployment. Linux Journal. 2014;2014(239):2.
Di Tommaso P, et al. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017;35:316–319.
Mölder F, et al. Sustainable data analysis with Snakemake. F1000Res. 2021;10:33.
Alessandrì L, et al. rCASC: reproducible classification analysis of single-cell sequencing data. GigaScience. 2019;8.
Beccuti M, et al. SeqBox: RNAseq/ChIPseq reproducible analysis on a consumer game computer. Bioinformatics. 2018;34:871–872.
Jupyter Development Team. JupyterHub: A multi-user server for Jupyter notebooks. https://jupyterhub.readthedocs.io.
Kulkarni N, et al. Reproducible bioinformatics project: a community for reproducible bioinformatics analysis pipelines. BMC Bioinformatics. 2018;19:349.
