Genome-wide gene expression noise in Escherichia coli is condition-dependent and determined by propagation of noise through the regulatory network
Authors/Creators
- 1. Biozentrum, University of Basel and Swiss Institute of Bioinformatics
Description
In this repository we provide raw and processed datasets for the article: “Genome-wide gene expression noise in Escherichia coli is condition-dependent and determined by propagation of noise through the regulatory network” by Arantxa Urchueguía, Luca Galbusera, Dany Chauvin, Gwendoline Bellement, Thomas Julou and Erik van Nimwegen.
A preprint is available under the following DOI: https://doi.org/10.1101/795369.
The repository consists of the following datasets:
1. preprocessed_datasets.zip (~22 GB)
- This dataset contains the data from the flow cytometry experiments (FACS Canto II, BD Biosciences) for all measured conditions, in RData format. Raw fcs files were processed with the tools described in the publication “Using fluorescence flow cytometry data for single-cell gene expression analysis in bacteria”, available at https://doi.org/10.1371/journal.pone.0240233. The tools themselves are available at https://github.com/vanNimwegenLab/E-Flow. The files include the outputs of these processing tools together with all raw values that came directly from the flow cytometer. The file directory_structure_in_preprocessed describes how the files are organized.
2. info_files: This is a set of csv files containing detailed information about the experiments done to acquire the preprocessed_datasets as well as annotation files that we used to retrieve promoter information.
3. processed_datasets: These files were derived from the RData files described under item 1 above. They provide estimates of the mean and variance of the fluorescence of E. coli promoters across the different growth conditions. Note that we discarded flow cytometry measurements from promoter/growth-condition combinations that showed abnormal fluorescence distributions (due to contamination), as well as measurements from reporters with annotation mismatches. The folder contains the following clean dataset files that were used in the paper:
- FULL_dataset_mean_var_wreplicates: This dataset contains the processed means and variances (in both logarithmic and linear scale) of all promoters in each condition, including replicate measurements for some conditions. We also include the name and Blattner number of the gene immediately downstream of each promoter, the DNA sequence of each promoter, and regulatory information (the number of unique transcription factors with annotated binding sites, and their names), which we obtained from RegulonDB v10.5 (https://doi.org/10.1093/nar/gky1077).
- dataset_with_noise_estimates: This dataset provides noise estimates for all promoters expressed above an expression threshold (mean GFP fluorescence at least as large as autofluorescence). Note that the noise estimate corresponds to the difference between the promoter’s variance in log-expression and the minimal variance as a function of its mean expression (i.e. the so-called noise floor was subtracted). Apart from the mean, variance, noise and promoter features (sequence, name of the downstream gene, number of unique regulatory inputs and names of the binding TFs), we also include the parameters used for fitting the minimal noise, i.e. the noise floor, in each of the conditions.
- time_course_data_SI: This dataset contains mean and variance measurements for one of the plates of the library, measured at different time points during growth in minimal media with 0.4 M NaCl: 0h (just after dilution), 1h, 2h, 3h, 5h, 6.5h, 8.5h, 10h and 11h.
- growth_curves_SI: Growth data (OD600 as a function of time) for a subset of the promoters from the library across different growth conditions.
- singlecell_areas_SI: Single-cell areas estimated using agar patches of cells growing in each condition. Each row of the table contains data for a single cell.
- synthetic_promoters_dataset: This dataset contains mean, variance and noise measurements of a set of constitutive promoters from https://doi.org/10.7554/eLife.05856.001 across different conditions.
- MARA_results: Inferred transcription factor activities that explain the measured noise levels in each condition. These results were obtained by performing Motif Activity Response Analysis on the noise levels of all measured promoters in each condition.
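The noise definition given for dataset_with_noise_estimates can be illustrated with a short Python sketch. It is not the authors' code: the class `PromoterStats`, the function `excess_noise`, and the idea of passing the noise floor as a callable are all assumptions made for illustration. The functional form of the noise floor is not stated in this description (only that its fitted parameters are provided per condition), so the caller must supply it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PromoterStats:
    """Summary statistics for one promoter in one condition (hypothetical layout)."""
    mean_gfp: float  # mean GFP fluorescence, linear scale
    mean_log: float  # mean of log-fluorescence
    var_log: float   # variance of log-fluorescence


def excess_noise(
    stats: PromoterStats,
    autofluorescence: float,
    noise_floor: Callable[[float], float],
) -> Optional[float]:
    """Return variance in log-expression minus the noise floor at this mean.

    Returns None for promoters below the expression threshold, i.e. with
    mean GFP fluorescence smaller than the autofluorescence.
    `noise_floor` maps mean log-expression to the minimal variance; its
    form is an assumption of the caller, not something this sketch defines.
    """
    if stats.mean_gfp < autofluorescence:
        return None
    return stats.var_log - noise_floor(stats.mean_log)
```

To apply this to the actual data, `noise_floor` would have to be built from the per-condition fit parameters included in the dataset, using the functional form described in the article.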
Files (22.7 GB)
| MD5 checksum | Size |
|---|---|
| c2c994d4f20ce3bfc05ad690966e7136 | 1.4 kB |
| 3d0fa289d0f1bd741ef10be3708239ce | 296.9 kB |
| d416f9139e485abab0b9a258ae1ebfcd | 677.1 MB |
| ac5434223b69b2cc5220f9796134d1ce | 22.0 GB |
| e1e4f604526e1aff01ab3ddb9c119dcb | 3.2 MB |
Additional details
Funding
- Swiss National Science Foundation: The role of gene expression noise in the evolution of gene regulation (31003A_159673)