Published March 27, 2022
| Version v1
Dataset
Open
Reconstruction of full-length LINE-1 progenitors from ancestral genomes (Supplementary Data)
Creators
- 1. University of Toronto
- 2. University of British Columbia
- 3. McGill University
Description
Web Supplementary Files
- Web Supplementary File 1 - FASTA files containing full-length reconstruction input sequences: full_length_reconstruction_input_sequence_fastas.zip
- Web Supplementary File 2 - FASTA files containing Muscle alignments of the full-length reconstruction input sequences: full_length_reconstruction_input_sequence_alns.zip
- Web Supplementary File 3 - FASTA file of full-length reconstructed sequences: full_length_reconstructions.fa
- Web Supplementary File 4 - Table of full-length reconstruction statistics: full_length_reconstruction_stats.csv
- Web Supplementary File 5 - FASTA files containing ORF reconstruction input sequences: orf_fastas.zip
- Web Supplementary File 6 - FASTA files containing Macse alignments of the ORF reconstruction input sequences: ORF_reconstruction_input_sequence_alns.zip
- Web Supplementary File 7 - FASTA file of reconstructed ORF sequences: ORF_reconstructions.fa
- Web Supplementary File 8 - Table of ORF reconstruction statistics: ORF_reconstruction_stats.csv
- Web Supplementary File 9 - Table of Composite Sequences: bestfl_selection_fixed_CS_seqs.csv
- Web Supplementary File 10 - Database of gold standards: L1_goldstandards.csv
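Several of the supplementary files above are FASTA files. The following is a minimal sketch of reading such a file without third-party dependencies, assuming plain FASTA layout (a `>` header line followed by one or more sequence lines). The function name `parse_fasta` and the example records are hypothetical and are not taken from the dataset.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record name (first word of the header) to its sequence."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            # Sequence lines (gap characters from alignments are kept as-is).
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Hypothetical records for illustration only.
example_lines = [
    ">seqA made-up description",
    "ACGT",
    "ACG-",
    ">seqB",
    "TTTT",
]
sequences = parse_fasta(example_lines)
for seq_name, seq in sequences.items():
    print(seq_name, len(seq))
```

To read a real file, pass an open file handle instead of the list, for example `parse_fasta(open("full_length_reconstructions.fa"))`, assuming the file has been downloaded to the working directory.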
Data Underlying Figures
- RepeatMasker scans of hg38 and ancestral genomes: anc_gen_RM_out_files.zip
- Figure 4
- 4A
- Source alignment of 54 composite sequences: 220121_dropped12+L1ME3A_muscle.nt.afa
- Tree produced using the alignment and FastTree: 220121_dropped12+L1ME3A.tree
- 4B
- Source alignment of 67 Dfam L1 subfamily 3’ end models: 200123_dfam_3ends.fa.muscle.aln
- Tree produced using the alignment: 200123_dfam_3ends.fa.muscle.aln.tree
- Figure 5
- KZFP-TE enrichment p-values (from Barazandeh et al. 2018): TE_KZFP_enrichment_pvals.xlsx
- KZFP-TE top 500 peak overlap (from Barazandeh et al. 2018): top500_peak_overlap.xlsx
- Figure 6
- RepeatMasker .out file for the Composite Sequence custom library queried against hg38: CS_RM_hg38.fa.out.gz
- Figure S2
- RepeatMasker scan .out file of hg38 (CG corrected Kimura Divergence values are in last column): hg38+KimDiv_RM.out
- RepeatMasker scan .out file of the Progressive Cactus eutherian ancestral genome (CG corrected Kimura Divergence values are in last column): Progressive_Cactus_Euth+KimDiv_RM.out
- RepeatMasker scan .out file of the Ancestors 1.1 eutherian ancestral genome (CG corrected Kimura Divergence values are in last column): Ancestors_Euth+KimDiv_RM.out
- Figure S5
- RepeatMasker scan .out files for Progressive Cactus simian and primate reconstructed ancestral genomes: progCactus_RM_outfiles.zip
- S5A
- FASTA files containing Cactus genome-derived reconstructed sequences equivalent to the L1MA2, L1MA4, and L1MD1-3 best full-length sequences: progCactus_reconstruction_bestFL_equivalents.zip
- S5B
- FASTA files containing Muscle alignments of Cactus genome-derived full-length reconstruction input sequences: progCactus_reconstruction_input_sequence_alns.zip
- Figure S6
- S6A
- Results of Conserved Domain scans of Cactus genome-derived full-length reconstructed sequences: CD_search_results_short_nms.txt
- S6B-D
- Character posterior probabilities of “best” full-length reconstructed sequences: best_fl_post_probs.zip
- Figure S7
- S7B-C
- Results of Conserved Domain scans of translated initial full-length reconstructed sequences: initial_recons_all_3frametrans_CD-search.txt
- Results of Conserved Domain scans of translated reconstructed ORFs: recons_ORF1-2_all_3frametrans_CD-search.csv
- Figure S15
- S15A
- Source alignment of 67 composite sequences: bestfl_selection_fixed_CS_seqs_muscle.nt.afa
- Tree produced using the alignment: bestfl_selection_fixed_CS_seqs_muscle.nt.afa.tree
- S15B-E
- Source Muscle alignments for phylogenetic trees of reconstructed sequence components:
- ORF2: ORF2_keep54_muscle.nt.afa
- 5’ UTR: 5utr_keep54_muscle.nt.afa
- ORF1: ORF1_keep54_muscle.nt.afa
- 3’ UTR: 3utr_keep54_muscle.nt.afa
- Trees produced using above alignments:
- ORF2: ORF2_keep54_muscle.nt.afa.tree
- 5’ UTR: 5utr_keep54_muscle.nt.afa.tree
- ORF1: ORF1_keep54_muscle.nt.afa.tree
- 3’ UTR: 3utr_keep54_muscle.nt.afa.tree
- Figure S17
- Unfiltered BLAST results of Composite Sequences queried against hg38: CS_hg38_blastn.csv.zip
- BED file of L1 instances annotated using BLAST pipeline: BLAST_L1_hits.bed
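The Figure S2 entries above note that the CG corrected Kimura Divergence values are in the last column of the RepeatMasker .out files. A minimal sketch of pulling that value out of each annotation line follows. It assumes the standard whitespace-separated RepeatMasker .out column order, and that the divergence is the final token on every data line with no trailing flag after it; neither assumption has been checked against the files themselves. The function name `parse_rm_line` and the example line are hypothetical.

```python
from typing import Optional


def parse_rm_line(line: str) -> Optional[dict]:
    """Parse one RepeatMasker .out data line; return None for headers/blanks."""
    fields = line.split()
    # Data lines start with an integer Smith-Waterman score; header lines do not.
    if len(fields) < 15 or not fields[0].isdigit():
        return None
    try:
        kimura = float(fields[-1])  # assumed: divergence is the last token
    except ValueError:
        return None
    return {
        "query": fields[4],
        "begin": int(fields[5]),
        "end": int(fields[6]),
        "strand": fields[8],        # "+" or "C" (complement)
        "repeat": fields[9],
        "class_family": fields[10],
        "kimura": kimura,
    }


# Made-up line for illustration; it is not taken from the dataset.
example = "2345 12.0 1.5 0.8 chr1 50000 56000 (1000) C L1MA4 LINE/L1 (10) 6100 100 7 13.2"
hit = parse_rm_line(example)
print(hit["repeat"], hit["kimura"])
```

Returning `None` for non-data lines lets a caller stream a whole file and keep only the parsed records, which matters for scans of this size.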
Files
(4.2 GB)
Checksum | Size |
---|---|
md5:b7eafc3b0bf562960f6cd487e1eadcb2 | 235.1 kB |
md5:54566671a3688f25df92d9acecc34d68 | 1.9 kB |
md5:fb439c8deaca93cab852018a672fa31d | 6.9 kB |
md5:f913dad06dbc30b87e797035f8e32808 | 959.2 kB |
md5:9ca11321617b513902c6ee0c79a4385c | 111.9 kB |
md5:9e0d7ee7c0e860550fe6682ad2b6516f | 5.9 kB |
md5:8b4365244150de4682b311db5d258fbc | 212.4 kB |
md5:eb3de0fa1cb41a7438d45254e39e3db0 | 1.3 kB |
md5:a6cb5eaee56eedf818221c385183353d | 1.9 GB |
md5:3906ceede6d3ef965fa4ae1cb3eafdce | 286.6 MB |
md5:ddc026915ade726f2437a95bf7697060 | 5.3 MB |
md5:1ff3419b4247360aaa02fd6afaa02501 | 455.8 kB |
md5:657c821fbf7315b2f9255020d58154a8 | 1.3 MB |
md5:366ed5aaf3ba54ad756db18e6b29c8a2 | 1.9 kB |
md5:bd736d9fd3b35aed189af653b7caec15 | 33.4 MB |
md5:93e5eb2d00a52ca8b96efd30dc3a15b4 | 2.0 MB |
md5:251342c950c637a4115979dd104f7c87 | 421.9 MB |
md5:9b8df606d3a73cf44fe6d26678247c65 | 138.5 MB |
md5:540d2bd346cf828ec6b209fec06933bc | 60.6 MB |
md5:9b30cc9937f0973c519ea8279c74c56a | 41.9 MB |
md5:540ad802342f851dcc90c5090da78898 | 511.0 kB |
md5:44d60e1591affde0abafe6d8eb9b86b0 | 10.1 MB |
md5:881c90838c31c8b8752ba19a130bf1fa | 735.4 MB |
md5:cb2b155e12130e3735efac4e815f7acd | 8.4 MB |
md5:e7d0e5f1b642bccf3739dafd217becd6 | 647.3 kB |
md5:370866528740d33b210943c170bdb450 | 72.0 kB |
md5:ef80f80db13bb5c284bede75fc5f4e51 | 1.5 kB |
md5:c3b46ac531b7e12a585e4b11e598c3d9 | 426.6 kB |
md5:86166de0c44df29be7aaba7c40ea7fed | 1.5 kB |
md5:dba75683771828e2fe1d30be1e3bd37d | 25.5 MB |
md5:3aeec7fab606b026cd0027907a06425a | 37.7 MB |
md5:785694e92c30f7023e748409c36d7a6b | 597.2 kB |
md5:3eb5ea80eae450d2f2337f6e84fdf281 | 6.6 MB |
md5:a94b2d007079e5192b56e451e64c1ad3 | 11.1 kB |
md5:717db0e76e09faaab91d446acff55751 | 546.2 kB |
md5:30320abc3b157d115fc0d20e7c4a0a5c | 262.4 MB |
md5:01c8f6de60618c249939c74cf91fcad2 | 247.0 MB |
md5:cac4abd5764d62875e01bde10d7bc7de | 41.7 MB |
md5:2afcf156f035a7e5c832f40c25293cb1 | 1.1 MB |
md5:4e998a967267cd57312b38244afe8dfe | 841.9 kB |
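Each file in the listing above carries an md5 checksum, which can be used to confirm that a download is intact. The following is a minimal sketch of that check using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listing` are hypothetical, and which checksum belongs to which file has to be taken from the record page, since that pairing is not shown here.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call would look like `matches_listing("downloaded_file.zip", "md5:<value from the listing>")`, where both arguments are placeholders for a local path and the corresponding listed checksum.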
Additional details
Related works
- Is supplement to
- Journal article: 10.1093/genetics/iyac074 (DOI)