Massively multiplex single-molecule oligonucleosome footprinting
Creators
- 1. Department of Biochemistry & Biophysics, University of California San Francisco, San Francisco CA
- 2. Department of Pediatrics, Stanford University, Palo Alto, CA
- 3. Vector Institute, University of Toronto, Toronto, Canada
- 4. Pacific Biosciences of California Inc, Menlo Park, CA
Description
These are the intermediate data used in "Massively multiplex single-molecule oligonucleosome footprinting", where the nonspecific adenine methyltransferase EcoGII was used to footprint accessible chromatin, and the methylation was then read using the Pacific Biosciences sequencing platform. The files here are intermediate outputs that capture metrics about the inter-pulse distance (IPD) values as well as predictions of methylation status.
The .npy, .feather, and .pickle files are the output of extractIPD.py and callNucPeaks.py. The .csv is the output of a cell in SAMOSA_analyses.ipynb. All of this code can be found at https://github.com/RamaniLab/SAMOSA, including SAMOSA_analyses.ipynb, which contains all downstream analyses that were performed on these data.
meanIPDinfoChrControls.csv: This contains the data used to generate Supplementary Figures 2 and 3. It contains various summary measurements of the IPD values in each molecule of the in vitro samples.
Files ending in _bingmm.npy: These contain the posterior probability of adenines being methylated for the in vitro data. The files beginning with pbrun3 were sequenced on the Sequel I, and the files beginning with pbrun4 or pbrun5 were sequenced on the Sequel II. Other than Supplementary Figures 2 and 3, the analysis in the paper was based on the Sequel II data. naked_neg and DNA_minusM are both negative controls. naked_methyl and DNA_plusM are positive controls. The chromatin samples are in vitro assembled chromatin.
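As a rough illustration of working with these posterior probabilities, the sketch below summarises the fraction of entries that exceed a cutoff. The helper name `fraction_methylated`, the 0.5 cutoff, and the use of NaN to mark positions without a call are assumptions made for this example and are not stated in this record; inspect the shape and dtype of each array before relying on them.

```python
import numpy as np


def fraction_methylated(posteriors, cutoff=0.5):
    """Return the fraction of non-NaN entries whose posterior exceeds `cutoff`.

    NaN entries are ignored (assumed here to mark positions without a call).
    Returns NaN when no entry is usable.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    valid = ~np.isnan(posteriors)
    if not valid.any():
        return float("nan")
    return float((posteriors[valid] > cutoff).mean())


# Synthetic stand-in for an array loaded from a _bingmm.npy file.
demo = np.array([[0.9, np.nan, 0.2],
                 [0.7, 0.1, np.nan]])

# 2 of the 4 non-NaN entries exceed the 0.5 cutoff.
print(fraction_methylated(demo))
```

With a downloaded file, the same helper could be applied to the result of `np.load(path)`, where `path` points at one of the `_bingmm.npy` files.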
pbrun4_gold_nuc47_chromatin_peaks.feather: The estimated nucleosome centers from the in vitro assembled chromatin, in a data frame. Each row is an individual nucleosome dyad prediction.
Files ending in _onlyT_zmwinfo.pickle: These files each contain a pandas data frame with information about each molecule sequenced in the in vivo samples. Read them in Python with the pandas read_pickle function. They require the same namespace that was used when saving them, so pandas must be imported as pd and numpy as np. The neg samples are deproteinated unmethylated molecules, the pos samples are deproteinated methylated molecules, and the chromatin samples are methylated chromatin.
Files ending in _bingmm.pickle: These files contain the posterior probability of being methylated for each adenine in each DNA molecule in the sample. Each has a corresponding zmwinfo file, described above, and must likewise be read in with numpy imported as np. Each file is a dictionary with the zero-mode waveguide (ZMW) hole number as the key and a numpy array as the value. The array has the same length as the unaligned circular consensus sequence (CCS) of that molecule and holds methylation posterior probabilities at each A/T base. The zmwinfo data frame has a 'zmw' column that can be used to match the information in that file with the methylation information in this one.
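The pairing between a zmwinfo data frame and its methylation dictionary can be sketched as follows. This is a minimal example on synthetic data, not the authors' analysis code: the helper names `load_sample` and `attach_array_length`, the `n_bases` column, the demo file names, and the choice of `pickle.load` for the dictionary are all assumptions made for illustration. Only the 'zmw' column and the dictionary layout come from the description above.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np   # imported as np, as the description requires
import pandas as pd  # imported as pd, as the description requires


def load_sample(zmwinfo_path, bingmm_path):
    """Load a zmwinfo data frame and its matching methylation dictionary."""
    zmwinfo = pd.read_pickle(zmwinfo_path)
    with open(bingmm_path, "rb") as handle:
        posteriors = pickle.load(handle)  # dict: ZMW hole number -> array
    return zmwinfo, posteriors


def attach_array_length(zmwinfo, posteriors):
    """Return a copy of `zmwinfo` with the posterior array length per molecule.

    Molecules without an entry in `posteriors` get NaN in `n_bases`
    (a hypothetical column name used only in this sketch).
    """
    lengths = {zmw: len(values) for zmw, values in posteriors.items()}
    out = zmwinfo.copy()
    out["n_bases"] = out["zmw"].map(lengths)
    return out


# Synthetic stand-ins for a pair of downloaded files.
with tempfile.TemporaryDirectory() as tmp:
    info_path = Path(tmp) / "demo_onlyT_zmwinfo.pickle"
    meth_path = Path(tmp) / "demo_bingmm.pickle"
    pd.DataFrame({"zmw": [101, 202, 303]}).to_pickle(info_path)
    with open(meth_path, "wb") as handle:
        pickle.dump({101: np.array([0.9, 0.1, 0.8]),
                     202: np.array([0.2, 0.7])}, handle)
    zmwinfo, posteriors = load_sample(info_path, meth_path)

annotated = attach_array_length(zmwinfo, posteriors)
print(annotated)
```

Pickle files can execute arbitrary code when loaded, so it is safest to load only files whose checksums match the listing for this record.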
Files
(13.5 GB)
MD5 checksum | Size |
---|---|
md5:1747a4e48bad7570b4ca2294af77d4da | 26.4 MB |
md5:7ede279d8cb9d3f353683872386a8ab4 | 116.8 MB |
md5:704a1df683d86faf8a0f407b1215e6bf | 108.8 MB |
md5:e8220dd1080def92c03c55bea6d1c2af | 48.9 MB |
md5:45cd3b8612db50ca8607f19f9d177c0c | 86.9 MB |
md5:bc027385cb507f2f1c604d2e29aaba43 | 1.7 MB |
md5:1392d54fe3dd8a154b073114268db8fb | 668.6 MB |
md5:fe8d2885afacd8446c2d0ec5b47b2ff7 | 39.2 MB |
md5:f323f97373c8ac8d0ba30b48c8ebd10c | 656.6 MB |
md5:c8ea8eeb555ff5249cdd64755c0d92ab | 36.1 MB |
md5:14c07d067b39fde09af8db4ad69000ab | 845.7 MB |
md5:e22936e29bdcc774053ecba0693c5b3f | 51.3 MB |
md5:4decb46286427cf0c1400359287f335a | 618.7 MB |
md5:b0fb4afd5ae2e0d6fbc8ad00b693e0df | 35.4 MB |
md5:88922b07b0c2fca88665b06199a2c5f8 | 639.8 MB |
md5:45f89895bcbb2705713ef942e942d3cb | 37.5 MB |
md5:7ab41844a64030aa21f305e9058d2421 | 1.1 GB |
md5:5b4d2fe3dce5e2816886f970642a944c | 70.0 MB |
md5:49cfc92ae9379ce2d6fc3c141c7ff8b7 | 105.7 MB |
md5:437957e215a8911c3297c9296aa9188b | 142.5 MB |
md5:b3a76466045c613c6f01b86f063254ae | 3.3 GB |
md5:7b24a4d0a166addb78c3e1e33cab633c | 110.5 MB |
md5:41322268b992bd5b4f826f3fbb87dba8 | 3.8 GB |
md5:2630fb0ed7e4bcee7d740e1838268785 | 115.4 MB |
md5:ba85cf54c360894c51ba6e777736818b | 24.7 MB |
md5:2f7dea95613f26206bf872e738044acd | 545.0 kB |
md5:cae1944caf857e388240f7b9765d97b5 | 18.2 MB |
md5:a14c10ce88b9f723b8bf6303b8cf621b | 208.0 kB |
md5:28c17d1101d29e0d674f10973cf87eca | 22.2 MB |
md5:220ad6025e5ec518c4ab2a84143b5887 | 227.2 kB |
md5:3ee528bf090d960e134db3d25596b88c | 34.2 MB |
md5:751d9e59cceaf673cb4251280ce7faa8 | 927.4 kB |
md5:c8a665eb9c2557b6e41dfd5fc642c1fe | 134.9 MB |
md5:06cbc090874059bfd7f9434426415052 | 3.2 MB |
md5:fb4ccfb01d0e215893b5e121f8f6bd37 | 88.3 MB |
md5:4d9dea8d1d478db12a8be5dfb9c52642 | 1.0 MB |
md5:9573dfb6b0069102eda6ed6e0ac18c0e | 112.1 MB |
md5:bd2c95f649cdb4627f60884ad1862249 | 1.2 MB |
md5:6bf5a51fff5f562fae09033910ea872e | 231.6 MB |
md5:db3429da387daf045044e5ec4465b951 | 6.8 MB |
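The file listing gives an MD5 checksum for each file, which can be used to confirm that a download completed intact. The following is a minimal sketch using only the Python standard library; the helper names `file_md5` and `md5_matches` are hypothetical and were chosen for this example.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for the multi-gigabyte files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:1747a4...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return file_md5(path) == expected
```

To check a download, pass the local file path together with the checksum listed for that file; a result of `False` suggests the file should be downloaded again.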