Dataset Open Access

Supplemental dataset for Northern Spotted Owl (<i>Strix occidentalis caurina</i>) genome assembly version 1.0

Hanna, Zachary R.; Henderson, James B.; Wall, Jeffrey D.; Emerling, Christopher A.; Fuchs, Jérôme; Runckel, Charles; Mindell, David P.; Bowie, Rauri C. K.; DeRisi, Joseph L.; Dumbacher, John P.

StrOccCau_1.0_nuc_masked.fa.bz2 : This FASTA format file compressed with bzip2 is the file that you will most likely want to download if you would like to align sequences to this genome assembly. This file is the repeat-masked assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012), excluding contigs and scaffolds shorter than 1,000 nt and excluding the contigs and scaffolds that we identified either as the mitochondrial genome sequence or as contaminant sequences.

StrOccCau_1.0_nuc.fa.bz2 : This FASTA format file compressed with bzip2 is the file that we deposited at DDBJ/ENA/GenBank as a Whole Genome Shotgun (WGS) project under accession NIFN00000000. This file is the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012), excluding contigs and scaffolds shorter than 1,000 nt and excluding the contigs and scaffolds that we identified either as the mitochondrial genome sequence or as contaminant sequences.

StrOccCau_1.0_mito.fa : This FASTA format file is the mitochondrial-genome-derived scaffold from the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012).

StrOccCau_1.0.gff.bz2 : This GFF format file compressed with bzip2 contains the gene annotations of StrOccCau_1.0_nuc.fa.
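
As a quick sanity check on the annotation file, the following minimal sketch tallies feature types without decompressing the file to disk. It assumes the file follows the usual tab-delimited GFF layout with the feature type in the third column, which this record does not state explicitly; the function name `count_feature_types` is hypothetical.

```python
import bz2
from collections import Counter


def count_feature_types(gff_bz2_path: str) -> Counter:
    """Count feature types (column 3) in a bzip2-compressed GFF file.

    Assumption: standard tab-delimited GFF columns; comment lines start with '#'.
    """
    counts = Counter()
    with bz2.open(gff_bz2_path, "rt") as handle:
        for line in handle:
            # An embedded sequence section, if present, ends the annotations.
            if line.startswith("##FASTA"):
                break
            # Skip comments, directives, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts
```

Calling `count_feature_types("StrOccCau_1.0.gff.bz2")` would return a mapping from feature type to number of records, assuming the file has been downloaded to the working directory.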

StrOccCau_1.0_transcripts.fa.bz2 : This FASTA format file compressed with bzip2 contains the transcript sequences of the genes annotated in StrOccCau_1.0.gff.

StrOccCau_1.0_proteins.fa.bz2 : This FASTA format file compressed with bzip2 contains the protein sequences of the genes annotated in StrOccCau_1.0.gff.

StrOccCau_1.0_RM_homology_includes_LowComplexity.out.bz2 : This file provides the repeat annotations produced by the homology-based masking of StrOccCau_1.0_nuc.fa that included masking of low complexity regions and simple repeats. We compressed this file with bzip2.

StrOccCau_1.0_RM_DeNovo_includes_LowComplexity.out : This file provides the repeat annotations produced by the de novo masking (performed after the homology-based masking) of StrOccCau_1.0_nuc.fa that included masking of low complexity regions and simple repeats.

StrOccCau_1.0_RM_homology_no_LowComplexity.out.bz2 : This file provides the repeat annotations produced by the homology-based masking of StrOccCau_1.0_nuc.fa that did not include masking of low complexity regions and simple repeats. We compressed this file with bzip2.

StrOccCau_1.0_RM_DeNovo_no_LowComplexity.out : This file provides the repeat annotations produced by the de novo masking (performed after the homology-based masking) of StrOccCau_1.0_nuc.fa that did not include masking of low complexity regions and simple repeats.

StrOccCau_1.0_alignments_of_light_associated_genes.txt : This file provides alignments of light-associated gene orthologs as well as assemblies of transcriptome sequences in NEXUS format.

StrOccCau_1.0_nuc_masked_SpottedBarredOwl_variant_file.vcf.bz2 : This is a raw, unfiltered variant call format (VCF) file compressed with bzip2 that we generated after aligning both spotted owl and barred owl short-read data to StrOccCau_1.0_nuc_masked.fa.
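
Because the variant file is large (5.2 GB compressed), it is usually more practical to stream it than to decompress it first. The sketch below counts variant records per contig or scaffold, relying only on the general VCF convention that header lines start with `#` and that the first tab-delimited column names the sequence. The function name `count_records_per_contig` is hypothetical, and no filtering is applied, so the counts reflect the raw calls.

```python
import bz2
from collections import Counter


def count_records_per_contig(vcf_bz2_path: str) -> Counter:
    """Stream a bzip2-compressed VCF and count data records per CHROM value."""
    counts = Counter()
    with bz2.open(vcf_bz2_path, "rt") as handle:
        for line in handle:
            # Meta-information and the column header line all start with '#'.
            if line.startswith("#") or not line.strip():
                continue
            # Only the first column is needed, so split once.
            chrom = line.split("\t", 1)[0]
            counts[chrom] += 1
    return counts
```

Reading through pure-Python bzip2 decompression is slow on a file of this size; this is a sketch of the approach rather than a tuned pipeline.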

StrOccCau_0.1.fa.bz2 : This FASTA format file compressed with bzip2 is the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012).

StrOccCau_0.1_masked.fa.bz2 : This FASTA format file compressed with bzip2 is the repeat-masked assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012).

StrOccCau_0.2.fa.bz2 : This FASTA format file compressed with bzip2 is the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012), excluding contigs and scaffolds shorter than 1,000 nt.
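
To illustrate the kind of length filter that separates the 0.1 and 0.2 files, here is a minimal sketch that keeps only records of at least 1,000 nt from a bzip2-compressed FASTA file. It is not the procedure the authors used; the function names are hypothetical, and it assumes that sequence length counts every character in the record, including any N gap characters.

```python
import bz2
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_min_length(in_path: str, out_path: str, min_len: int = 1000) -> int:
    """Write records with at least min_len nt to out_path; return the number kept."""
    kept = 0
    with bz2.open(in_path, "rt") as src, bz2.open(out_path, "wt") as dst:
        for header, seq in read_fasta(src):
            if len(seq) >= min_len:
                # Each kept sequence is written on a single line (no wrapping).
                dst.write(f">{header}\n{seq}\n")
                kept += 1
    return kept
```

Each record is held in memory in full before it is written, which is simple but may be costly for very long scaffolds.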

StrOccCau_0.2_masked.fa.bz2 : This FASTA format file compressed with bzip2 is the repeat-masked assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012), excluding contigs and scaffolds shorter than 1,000 nt.

StrOccCau_GapCloser_output_NoContamNoMito.fa.bz2 : This FASTA format file compressed with bzip2 is the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012) without the contigs and scaffolds that we later identified either as the mitochondrial genome sequence or as contaminant sequences.

References

Luo R., Liu B., Xie Y., Li Z., Huang W., Yuan J., He G., Chen Y., Pan Q., Liu Y., Tang J., Wu G., Zhang H., Shi Y., Liu Y., Yu C., Wang B., Lu Y., Han C., Cheung DW., Yiu S-M., Peng S., Xiaoqian Z., Liu G., Liao X., Li Y., Yang H., Wang J., Lam T-W., Wang J. 2012. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1:18. DOI: 10.1186/2047-217X-1-18.

Files (7.9 GB)
| Name | MD5 checksum | Size |
| --- | --- | --- |
| StrOccCau_0.1.fa.bz2 | 7dde081c927bd49276eea07199974670 | 467.6 MB |
| StrOccCau_0.1_masked.fa.bz2 | 24aded5ffcce3f87cb996c2c65b603f5 | 417.4 MB |
| StrOccCau_0.2.fa.bz2 | abc604375d5e81c596e7198736570118 | 350.3 MB |
| StrOccCau_0.2_masked.fa.bz2 | 52ee5125cf8f3091393ac4c4ce158ed6 | 334.2 MB |
| StrOccCau_1.0.gff.bz2 | 73457fe1101093d918ba2759d432e8b4 | 3.9 MB |
| StrOccCau_1.0_alignments_of_light_associated_genes.txt | 7e35483bc09b271fcb1755e5ec507976 | 3.3 MB |
| StrOccCau_1.0_mito.fa | daa5ac22907d92b874ca3a6bb5af79f6 | 21.6 kB |
| StrOccCau_1.0_nuc.fa.bz2 | 2577ae43a263c3363b7b1b1fe45fb160 | 333.4 MB |
| StrOccCau_1.0_nuc_masked.fa.bz2 | 8f54bce1bc95db9c9f82f92fa2ee0fa9 | 309.2 MB |
| StrOccCau_1.0_nuc_masked_SpottedBarredOwl_variant_file.vcf.bz2 | 5e30c5e9e4b0479bc3e2392852f6096a | 5.2 GB |
| StrOccCau_1.0_proteins.fa.bz2 | 095c5929144926be748088d8baaf663f | 5.2 MB |
| StrOccCau_1.0_RM_DeNovo_includes_LowComplexity.out | caa558f262329259d00208b7693a26be | 10.3 MB |
| StrOccCau_1.0_RM_DeNovo_no_LowComplexity.out | 94a1b7e5882d45c6f656b2144f288ccb | 10.0 MB |
| StrOccCau_1.0_RM_homology_includes_LowComplexity.out.bz2 | ed998647e08b881a11f6f8261712a6f0 | 39.4 MB |
| StrOccCau_1.0_RM_homology_no_LowComplexity.out.bz2 | 060133d073983270ec4a01ab4d6ac145 | 30.6 MB |
| StrOccCau_1.0_transcripts.fa.bz2 | e368d0840abb5519a7c7d1d2e5bdba87 | 7.9 MB |
| StrOccCau_GapCloser_output_NoContamNoMito.fa.bz2 | 48901c3f81f238e63642df26290912c8 | 443.0 MB |
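
The MD5 checksums in the file listing can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the file is read in chunks so that the multi-gigabyte files do not need to fit in memory.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file's MD5 with a listed value, with or without an 'md5:' prefix."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("StrOccCau_1.0_mito.fa", "md5:daa5ac22907d92b874ca3a6bb5af79f6")` should return `True` for an intact copy of that file, assuming it sits in the working directory.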
