Published February 25, 2026 | Version v1
Dataset Open

Supplementary Data for the paper "The distribution of fitness effects of new mutations in regulatory regions of D. melanogaster"

  • 1. University of North Carolina at Chapel Hill
  • 2. University of North Carolina

Description

This repository contains the data and code associated with the manuscript:

The distribution of fitness effects of new mutations in regulatory regions of the D. melanogaster genome
Austin Daigle, Jacob Marsh, Andrew Kay, Parul Johri
bioRxiv 2026.03.01.708907; doi: https://doi.org/10.64898/2026.03.01.708907

Repository contents

1. BED files (nonoverlapping_annotations.zip) giving the genomic coordinates of the annotation classes used in inference:
   - exonic regions
   - phastCons conserved noncoding regions
   - low-confidence regulatory regions
   - high-confidence regulatory regions
   - putatively neutral intergenic regions

2. The final filtered VCF (vcf_for_dfereg_simulansfilter.noRepeats.vcf.gz) used for empirical analyses of the three study populations, together with the final accession list (sample_lists.csv).

3. All code required to reproduce the simulations reported in the manuscript (simulations.zip), including the mutation and recombination rate maps, together with the simulation outputs analyzed there.
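
As a quick sanity check after download, the BED and VCF files described above can be inspected with the Python standard library alone. The sketch below is not part of the deposited code. It assumes only the generic format conventions: BED intervals are 0-based with exclusive ends, and sample names follow the nine fixed columns of the `#CHROM` header line in a VCF. The function names are hypothetical, and the names of the individual BED files inside nonoverlapping_annotations.zip are not listed here, so any path passed in is a placeholder.

```python
import gzip
from collections import defaultdict


def bed_bp_per_chrom(bed_path):
    """Sum interval lengths (end - start) per chromosome in a BED file."""
    totals = defaultdict(int)
    with open(bed_path) as handle:
        for line in handle:
            # Skip blank lines and optional BED header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split()[:3]
            # BED starts are 0-based and ends are exclusive,
            # so the interval length is simply end - start.
            totals[chrom] += int(end) - int(start)
    return dict(totals)


def vcf_sample_names(vcf_gz_path):
    """Return sample names from the #CHROM header line of a gzipped VCF."""
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # The first nine columns are fixed VCF fields; samples follow.
                return line.rstrip("\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # reached data records without finding a header line
    return []
```

Calling `vcf_sample_names("vcf_for_dfereg_simulansfilter.noRepeats.vcf.gz")` should return the sample identifiers, which can then be compared by hand against sample_lists.csv. The column layout of that CSV is not described here, so it is left out of the sketch.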

---

Contact
For questions about these data and scripts, please contact:

Austin Daigle
University of North Carolina at Chapel Hill
adaigle@unc.edu

Files

Files (4.1 GB)

Checksum                              Size
md5:feee4bb70b1dc45f4dc0676e384d5ef4  11.0 MB
md5:633900c5a15d4329e03a11ed41ee0418  2.7 kB
md5:81348092a9d99cc242961b07934e103d  1.3 GB
md5:cd06120c4ccdb6d21ab5c90280888c53  2.7 GB
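
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's `hashlib`. The function names are hypothetical, and because this listing does not pair each checksum with a file name, the correspondence between a given file and its checksum should be confirmed on the record page before comparing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so multi-gigabyte files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5(local_path, "md5:<hex digest from the listing>")` returns `True` only when the local file's digest equals the listed value.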