Published April 16, 2026 | Version 1.0.0
Computational notebook Open

Data, scripts, code, and supplementary information for "A new iterative framework for simulation-based population genetic inference with improved coverage properties of confidence intervals"

  • 1. Institute of Evolutionary Sciences Montpellier
  • 2. Centre de Biologie et de Gestion des Populations
  • 3. Université de Montpellier
  • 4. Institut Montpelliérain Alexander Grothendieck

Description

Data, scripts, code, and supplementary information for "A new iterative framework for simulation-based population genetic inference with improved coverage properties of confidence intervals".

The Supplnfo.as_on_biorXiv.pdf file includes the Supplementary Information for this paper. The following describes the scripts, code and data included in the zip file.

The Rmarkdown file Infusion.Rmd can be used to generate the Figures and Tables
describing the results of the paper, using information saved in subdirectories. The Rmarkdown file and its HTML output can be consulted to match a given subdirectory to a given Table or Figure of the manuscript.

Raw real data used in the study are provided as
diy2inf_simuls/Harmonia/data_HA_INFUSION.txt and 
diy2inf_simuls/admixtOutOfA/human_snp_all22chr_maf1__10inds_per_sample.snp.
The same directories contain additional files that may be needed to run the simulations, such as
statobsRF.txt, which contains the summary statistics for the data.

Scripts that can be used to reproduce the simulations are provided in directories
toyTests/ and diy2inf_simuls/. These directories themselves contain nested 
subdirectories. Each terminal subdirectory, except the ones named SNLE/, corresponds to a simulation scenario.
Files from parent directories should be copied into a terminal subdirectory in order
to reproduce simulations for the corresponding simulation scenario. In particular,
each of the two generic_workflow.R files is a master script whose execution depends
on the name of the terminal subdirectory into which it is copied.
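The copying step described above can be sketched in Python. This is only an illustration and not part of the record: the function name stage_scenario is hypothetical, and the rule that files already present in the terminal subdirectory are never overwritten (so the nearest parent wins when names clash) is an assumption, not something the record specifies.

```python
import shutil
from pathlib import Path


def stage_scenario(terminal_dir, root_dir):
    """Copy files from every parent directory (up to root_dir) into terminal_dir.

    Assumption: files already present in terminal_dir are never overwritten,
    so a file from a nearer parent takes precedence over a same-named file
    from a more distant parent. Returns the names of the files copied.
    """
    terminal = Path(terminal_dir).resolve()
    root = Path(root_dir).resolve()
    if root != terminal and root not in terminal.parents:
        raise ValueError(f"{terminal} is not inside {root}")

    copied = []
    parent = terminal.parent
    # Walk upwards from the nearest parent to root_dir (inclusive).
    while parent == root or root in parent.parents:
        for item in sorted(parent.iterdir()):
            target = terminal / item.name
            # Only regular files are copied; sibling directories are skipped.
            if item.is_file() and not target.exists():
                shutil.copy2(item, target)
                copied.append(item.name)
        if parent == root:
            break
        parent = parent.parent
    return copied
```

For instance, calling stage_scenario on a scenario directory with toyTests/ as root_dir would place the generic_workflow.R file from toyTests/ next to the saved summaries of that scenario, after which the master script can be run from there.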

The terminal subdirectories also contain saved summaries of the inferences 
for all simulation replicates for a given simulation scenario, used by Infusion.Rmd.
The diy2inf subdirectory contains definitions of R functions used either by the simulation
code or by the Rmarkdown script.

To run the Infusion.Rmd script, two directories
must be specified in the first R chunks of the Rmarkdown file:

* correctdir : the base directory where the files from this repository are copied. correctdir can be changed freely, but the subdirectory structure of the files from this record should not be modified.
* MSdir : the directory where figures and tables will be written if the Rmarkdown script is run.

Further, the latex2rtf and the ggplot2 R packages should be installed.

Files related to sbi simulations are found in SNLE/ subdirectories for three simulation scenarios. They include the Python scripts used to run the simulations, e.g., diy2inf_simuls/admixtOutOfA/N_7from17/SNLE/allfilesforeachjob/snle_N_7from17.py, which can be run in an environment defined by

conda create -n sbi_env python=3.12 && conda activate sbi_env
pip install torch
pip install numpy
pip install sbi==0.25.0
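Once the environment is activated, a quick check that the three packages are visible to the interpreter may help before launching a long run. The sketch below is not part of the record; the function name installed_versions is hypothetical, and it relies only on the standard-library importlib.metadata module.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "numpy", "sbi")):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    versions = installed_versions()
    for name, ver in versions.items():
        print(f"{name}: {ver or 'not installed'}")
    # The record pins sbi to 0.25.0; flag any other version.
    if versions["sbi"] != "0.25.0":
        print("Note: sbi 0.25.0 is the version pinned in this record.")
```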

Files

SI_InfusionMS-main.zip

Files (152.2 MB)

Checksum and size of each file:
md5:2996ac8f0fdac219ad2f7eea85826083 (151.5 MB)
md5:c8e730d5281d26d7b041e3dd43cb1bac (658.7 kB)
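The md5 checksums listed above can be compared against a downloaded copy to confirm that the transfer was complete. The following is a minimal sketch using the standard-library hashlib module; the function names md5_of_file and matches_record are hypothetical, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed_checksum.lower().split(":")[-1]
    return md5_of_file(path) == expected
```

Assuming the 151.5 MB entry is SI_InfusionMS-main.zip (the listing does not state this explicitly), matches_record("SI_InfusionMS-main.zip", "md5:2996ac8f0fdac219ad2f7eea85826083") should return True for an intact download.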

Additional details

Related works

Is supplement to
Publication: 10.24072/pci.mcb.100426 (DOI)

Dates

Issued
2026-04-16
Original Zenodo release