Dataset Open Access

California condor genome data and resources

Jacqueline A. Robinson; Rauri C. K. Bowie; Olga Dudchenko; Erez Lieberman Aiden; Sher L. Hendrickson; Cynthia C. Steiner; Oliver A. Ryder; David P. Mindell; Jeffrey D. Wall

Genome assembly for the California condor, genotype data for two California condors and two related species (Andean condor and turkey vulture), and supporting files.

See "Genome-wide diversity in the California condor tracks its prehistoric abundance and decline," by Robinson et al. (2021) for full details. Also see https://doi.org/10.5281/zenodo.4680034 for processing and analysis code.

Samples:
CRW1112 California condor (Studbook #593)
CYW1141 California condor (Studbook #309)
VulGry1 Andean condor (ISIS 417)
BGI_N323 Turkey vulture (SAMN02319050, from https://doi.org/10.1126/science.1251385)

Files:
gc_PacBio_HiC.fasta California condor genome sequence.

gc_PacBio_HiC_scaffold_chr_key.txt Key giving the chromosomal identity of scaffolds in the California condor genome assembly (where known).

gc_PacBio_HiC_repeats*.bed Coordinates of repeats in the California condor genome identified with Tandem Repeats Finder (TRF, https://tandem.bu.edu/trf/trf.html) and WindowMasker (WM, https://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/app/winmasker).

*.vcf.gz, *.vcf.gz.tbi Raw VCF files plus indexes for each sample aligned to reference gc_PacBio_HiC.fasta. 
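As a rough illustration of working with these files, the sketch below streams a compressed VCF with Python's standard library and counts heterozygous genotype calls. It assumes a single-sample VCF with a GT entry in the FORMAT column; the function names are hypothetical and are not part of the published analysis code. Because the VCFs are described as raw, any count obtained this way is unfiltered and should not be read as a diversity estimate.

```python
import gzip
import re


def genotype_of(line):
    """Return the GT string of the first sample in a VCF record, or None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None  # no FORMAT/sample columns present
    format_keys = fields[8].split(":")
    sample_values = fields[9].split(":")
    return dict(zip(format_keys, sample_values)).get("GT")


def is_heterozygous(gt):
    """True for a diploid call with two different, non-missing alleles."""
    if gt is None:
        return False
    alleles = re.split(r"[/|]", gt)
    return len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]


def count_het_calls(path):
    """Count heterozygous calls in a gzip/bgzip-compressed VCF (unfiltered)."""
    count = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header lines
            if is_heterozygous(genotype_of(line)):
                count += 1
    return count


# Example call (assumes the file has been downloaded to the working directory):
# count_het_calls("CRW1112.vcf.gz")
```

For region queries, the accompanying .tbi indexes are intended for tabix-aware tools rather than the linear scan shown here.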

*cpgIslands*.bed Coordinates of CpG islands in gc_PacBio_HiC.fasta. Coordinates including and excluding CpG islands in repeats are provided.

*.over.chain.gz, *.rbest.chain.gz Chain files for liftOver (https://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64). Named as FROM.TO.TYPE.chain.gz. The "rbest" chains represent the reciprocal best alignments between both genomes. ASM69994v1 is the turkey vulture genome assembly, galGal6 is the chicken genome assembly.
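The FROM.TO.TYPE.chain.gz naming convention can be decoded programmatically when choosing the right chain file for a conversion. The following is a minimal sketch; the `ChainName` type and `parse_chain_name` function are hypothetical helpers introduced here for illustration.

```python
from typing import NamedTuple


class ChainName(NamedTuple):
    source: str  # assembly the coordinates are converted FROM
    target: str  # assembly the coordinates are converted TO
    kind: str    # "over" or "rbest"


def parse_chain_name(filename: str) -> ChainName:
    """Split a FROM.TO.TYPE.chain.gz file name into its three parts."""
    suffix = ".chain.gz"
    if not filename.endswith(suffix):
        raise ValueError(f"not a chain file name: {filename}")
    parts = filename[: -len(suffix)].split(".")
    if len(parts) != 3:
        raise ValueError(f"expected FROM.TO.TYPE before {suffix}: {filename}")
    return ChainName(*parts)


# Example: a chain for lifting California condor coordinates to chicken.
info = parse_chain_name("gc_PacBio_HiC.galGal6.over.chain.gz")
print(info.source, info.target, info.kind)
```

This works for the files in this record because none of the assembly names contain a period.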

ismc_CYW1141.rho.*.bed Bed files containing coordinate ranges and rho/bp inferred with iSMC (https://github.com/gvbarroso/iSMC) using California condor #309. Intervals of 1 kb and 1 Mb are provided.

*.psmc, *.msmc, *.msmc2 Output files from PSMC (https://github.com/lh3/psmc), MSMC (https://github.com/stschiff/msmc), and MSMC2 (https://github.com/stschiff/msmc2).

ROH*.bed Coordinates of runs of homozygosity (ROH) >=1 Mb in each California condor sample, identified with Plink (v1.9, https://www.cog-genomics.org/plink/).
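A common first step with the ROH files is to sum the total length of the genome covered by runs of homozygosity. The sketch below assumes the files follow the standard BED layout (scaffold, 0-based start, end in the first three whitespace-separated columns); that layout is an assumption, not something stated in this record, and `total_bed_length` is a hypothetical helper.

```python
def total_bed_length(lines):
    """Sum (end - start) over BED records, skipping blanks and header lines."""
    total = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track ", "browser ")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        total += end - start
    return total


# Example call (assumes the file has been downloaded to the working directory):
# with open("ROH_CRW1112.bed") as handle:
#     print(total_bed_length(handle) / 1e6, "Mb in ROH")
```

Dividing the result by the callable genome length would give the fraction of the genome in ROH; that denominator is not provided in this record and must be taken from the paper or computed separately.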

Files (34.4 GB)

Name                                      Size       Checksum
ASM69994v1.gc_PacBio_HiC.rbest.chain.gz   9.5 MB     md5:e8731bac7020fd71f06f89c5c26212fa
BGI_N323.msmc                             2.2 kB     md5:8471e8141d0d8138b0b4bd8854fab335
BGI_N323.msmc2                            2.2 kB     md5:9a1e88b339f7c6d1f8744b5ae5a4b50a
BGI_N323.psmc                             103.7 kB   md5:e146095405aa6f3ed6da4df0a87135cb
BGI_N323.vcf.gz                           8.6 GB     md5:25e47e1935b89229215dd6516089e7ab
BGI_N323.vcf.gz.tbi                       1.1 MB     md5:851bf99c9fa61970d68eb9c6c03a8056
CRW1112.msmc                              2.2 kB     md5:8016a5da4d4db3acd930c9dc18255c39
CRW1112.msmc2                             2.2 kB     md5:ccaaf5264c37c064e2bb9d8f23a15d45
CRW1112.psmc                              104.4 kB   md5:b36bba751d06a321c12df5763bc3ccd1
CRW1112.vcf.gz                            7.4 GB     md5:c2691844f540197ec7bd46d448d32659
CRW1112.vcf.gz.tbi                        1.1 MB     md5:2a13415d67228aaeffdacae1d849ccaa
CYW1141.msmc                              2.2 kB     md5:dd063bee2c2f48e78e860bbed336bac2
CYW1141.msmc2                             2.2 kB     md5:2074f3eb836556fd7b2bdfcb8db25f4e
CYW1141.psmc                              104.4 kB   md5:c32502a711215a3ab4f7458066f27460
CYW1141.vcf.gz                            7.4 GB     md5:da25ef33f00c5fb21982eff322e413b3
CYW1141.vcf.gz.tbi                        1.1 MB     md5:f8b943ce4c840a0e86cce308535842e5
galGal6.gc_PacBio_HiC.rbest.chain.gz      41.5 MB    md5:acff982500eb263650cc245f4ac44968
gc_PacBio_HiC.ASM69994v1.over.chain.gz    14.7 MB    md5:041608efebba9ffa7592293edf008392
gc_PacBio_HiC.ASM69994v1.rbest.chain.gz   9.5 MB     md5:91e7c43d3333b5027e118833be70a93b
gc_PacBio_HiC.fasta                       1.3 GB     md5:97f1e53686021b19ca03f03ebe48a55e
gc_PacBio_HiC.galGal6.over.chain.gz       48.6 MB    md5:8e3afc956811175b2de25b8783b2d74b
gc_PacBio_HiC.galGal6.rbest.chain.gz      41.7 MB    md5:5379212170c03e0e96232a85fca3a367
gc_PacBio_HiC_cpgIslands.bed              1.1 MB     md5:87d35f5324e31f6729417bcf76e1b317
gc_PacBio_HiC_cpgIslands_nonrepeats.bed   1.0 MB     md5:f2dd85c3069d8c99cf93228d34f4d135
gc_PacBio_HiC_repeats_TRF.bed             1.6 MB     md5:ef270dde29023715ff81e34e7cd8941d
gc_PacBio_HiC_repeats_WMdust.bed          236.4 MB   md5:b45cae1b348939f6f5bf6a5071c64c46
gc_PacBio_HiC_scaffold_chr_key.txt        641 Bytes  md5:8ebc387af95e3c6efdeef518762f7f3d
ismc_CYW1141.rho.1kb.bed                  58.8 MB    md5:a35b187202af58ca04ba5c25d7b3b65e
ismc_CYW1141.rho.1Mb.bed                  61.6 kB    md5:05f2da21e05cf2d382e1df3b06584f8f
ROH_CRW1112.bed                           3.9 kB     md5:20521fc21c30ae8d99e8fe8cb0b38418
ROH_CYW1141.bed                           4.1 kB     md5:ff625b3f2ee8222e5e4cf9ee4c2cd617
VulGry1.msmc                              1.8 kB     md5:e850c75675998ce20018fdf398e613b2
VulGry1.msmc2                             1.7 kB     md5:566298dd456ec673a8e19bc1b09b4fad
VulGry1.psmc                              82.1 kB    md5:d5fab0efc9363b6a19519c3c27cc285c
VulGry1.vcf.gz                            9.4 GB     md5:3fc93a8e5a3ee43b053e88aa655307f0
VulGry1.vcf.gz.tbi                        1.1 MB     md5:fc423758e46cf865b39a3ba7b2ad6197
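
Each file in the listing carries an md5 checksum, which can be used to confirm that a download completed intact. The sketch below uses Python's standard `hashlib`; the helper names are hypothetical, and files are read in chunks so that the multi-gigabyte VCFs do not need to fit in memory.

```python
import hashlib


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex md5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Return the hex md5 digest of the file at `path`."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


def matches_listing(actual_hex, listed):
    """Compare a computed digest with a listing entry such as 'md5:abc...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return actual_hex.lower() == expected


# Example call (assumes the file has been downloaded to the working directory):
# ok = matches_listing(md5_of_file("ROH_CRW1112.bed"),
#                      "md5:20521fc21c30ae8d99e8fe8cb0b38418")
```

A mismatch most often indicates a truncated or interrupted download, in which case the file should be fetched again.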