Published March 13, 2021 | Version v1
Dataset Open

Conserved long-range base pairings are associated with pre-mRNA processing of human genes

  • 1. Skolkovo Institute of Science and Technology, Moscow 143026, Russia
  • 2. Faculty of Chemistry, Moscow State University, Moscow 119234, Russia
  • 3. Center for Genomic Regulation and UPF, Barcelona 08003, Spain

Description

1. SupplementaryDataFile1.bed
The full list of PCCRs for the GRCh37 human genome assembly. The list is provided in BED12+ format: columns 1-12 correspond to the track hub, and columns 13-28 contain the following extra information.

        13: PCCR id
        14: CCR1
        15: CCR2
        16: structure in dot-bracket notation
        17: icSHAPE_delta_reactivity score
        18: ENSEMBL gene id
        19: NCBI gene name
        20: Presence of A-to-I editing sites (TRUE/FALSE)
        21: Presence of forked eCLIP peaks in both CCRs
        22: phastCons.score1
        23: phastCons.score2
        24: phastCons.score3
        25: Evidence from RIC-seq data
        26: Raw E-value (the product of R-scape E-values of all base pairs in the structure)
        27: E-value (adjusted with Benjamini-Hochberg correction)
        28: Free energy
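
The column layout above can be turned into a small reader. This is a minimal sketch, not the authors' own tooling. It assumes the file is tab-delimited, that columns 1-12 follow the standard BED12 field order, and that every record has exactly 28 columns. The dictionary keys for columns 13-28 are hypothetical names chosen for illustration, and all values are kept as strings because the record does not state the data types or how missing values are encoded.

```python
import csv

# Standard BED12 field names (columns 1-12).
BED12_FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "itemRgb", "blockCount", "blockSizes",
    "blockStarts",
]

# Hypothetical keys for columns 13-28, following the description above.
EXTRA_FIELDS = [
    "pccr_id", "ccr1", "ccr2", "structure", "icshape_delta_reactivity",
    "ensembl_gene_id", "ncbi_gene_name", "a_to_i_editing",
    "forked_eclip_peaks", "phastcons_score1", "phastcons_score2",
    "phastcons_score3", "ric_seq_evidence", "raw_e_value", "e_value",
    "free_energy",
]

ALL_FIELDS = BED12_FIELDS + EXTRA_FIELDS  # 28 columns in total


def read_pccrs(lines):
    """Yield one dict per PCCR record; all values are kept as strings."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and possible comment or track/browser lines.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        if len(row) != len(ALL_FIELDS):
            raise ValueError(
                f"expected {len(ALL_FIELDS)} columns, got {len(row)}"
            )
        yield dict(zip(ALL_FIELDS, row))
```

With a file opened as `with open("SupplementaryDataFile1.bed") as fh:`, iterating over `read_pccrs(fh)` gives records whose fields can be looked up by name, for example `rec["e_value"]`.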

2. SupplementaryDataFile2.bed
The full list of PCCRs, GRCh38 Human Genome assembly. The columns are as in SupplementaryDataFile1.bed.

3. SupplementaryDataFile3.tsv
RNA bridges, GRCh37 Human Genome assembly. The columns are as follows:
        1: PCCR coordinates
        2: eCLIP peak coordinates
        3: exon coordinates
        4: RBP name
        5: NCBI gene name
        6: Change in the exon inclusion rate upon RBP knockdown (delta PSI)
        7: PCCR id
        8: PCCR energy
        9: PCCR spread
        10: Difference of icSHAPE reactivity of CCR
        11: E-value
        12: number of PCCRs in the cluster
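
The twelve columns above can be read into named records and grouped by RNA-binding protein. The following is a sketch under stated assumptions: the file is tab-delimited, the field names are hypothetical labels for the listed columns, and values are kept as strings. The record does not say whether the file starts with a header row, so a 12-column header, if present, would be read as an ordinary record and would need to be dropped by the caller.

```python
import csv
from collections import defaultdict, namedtuple

# Hypothetical field names for columns 1-12 of the RNA bridges table.
RnaBridge = namedtuple("RnaBridge", [
    "pccr_coords", "eclip_peak_coords", "exon_coords", "rbp_name",
    "ncbi_gene_name", "delta_psi", "pccr_id", "pccr_energy",
    "pccr_spread", "icshape_difference", "e_value", "cluster_size",
])


def read_bridges(lines):
    """Yield one RnaBridge per line that has exactly 12 tab-separated fields."""
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != len(RnaBridge._fields):
            continue  # skip blank or malformed lines
        yield RnaBridge(*row)


def group_by_rbp(records):
    """Return a dict mapping each RBP name to the list of its records."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.rbp_name].append(rec)
    return dict(groups)
```

Since SupplementaryDataFile4.tsv is described as having the same columns, the same reader should apply to the exon loop-outs table under the same assumptions.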

4. SupplementaryDataFile4.tsv
Exon loop-outs, GRCh37 Human Genome assembly. The columns are as in SupplementaryDataFile3.tsv.

5. SupplementaryDataFile5.bed
A stringent set of intramolecular RIC-seq RNA contacts (provided courtesy of Prof. Xue, PMID:32499643). Each contact is between part A and part B. The columns are as follows:

        1: chrA, chromosome of part A
        2: startA, start of part A
        3: endA, end of part A
        4: chrB, chromosome of part B
        5: startB, start of part B
        6: endB, end of part B
        7: Cluster ID
        8: Number Of Chimeric Reads
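
The eight-column contact layout above can be parsed into typed records, for instance to keep only contacts supported by a minimum number of chimeric reads. This is a sketch only. It assumes tab-delimited lines without a header, integer start and end coordinates, and an integer read count. The class, field, and function names are hypothetical, and the threshold is an illustrative parameter, not a value taken from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """One RIC-seq contact between part A and part B (hypothetical names)."""
    chr_a: str
    start_a: int
    end_a: int
    chr_b: str
    start_b: int
    end_b: int
    cluster_id: str
    n_chimeric_reads: int


def parse_contact(line):
    """Parse one tab-delimited line with the 8 columns described above."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 8:
        raise ValueError(f"expected 8 columns, got {len(f)}")
    return Contact(f[0], int(f[1]), int(f[2]),
                   f[3], int(f[4]), int(f[5]),
                   f[6], int(f[7]))


def filter_by_reads(lines, min_reads):
    """Return contacts supported by at least min_reads chimeric reads."""
    contacts = (parse_contact(line) for line in lines if line.strip())
    return [c for c in contacts if c.n_chimeric_reads >= min_reads]
```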

Notes

Any questions or comments should be addressed to Dmitri Pervouchine (d.pervouchine@skoltech.ru).

Files

Total size: 457.8 MB

Checksum                               Size
md5:2ee60edbc7c4ebee1ff5debbaf810db5   220.4 MB
md5:95007c3733c87f9152122c989fb60a50   202.1 MB
md5:29b418f7f670da3d95c05fac039587a7   37.1 kB
md5:aea6b00320f3050ce3dae1f849c4ba05   145.5 kB
md5:b90a3809c1c3fc7eb448a53efa991c44   35.2 MB
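
The MD5 checksums listed above can be used to check that a download is intact. The sketch below uses only Python's standard `hashlib` module. The function names are hypothetical, and because the listing does not show which checksum belongs to which file, the caller has to supply that pairing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5(path, "md5:2ee60edbc7c4ebee1ff5debbaf810db5")` returns `True` only if the file at `path` has that digest.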

Additional details

Related works

Is supplement to
Preprint: 10.1101/2020.05.05.076927 (DOI)