There is a newer version of the record available.

Published June 9, 2017 | Version v1
Dataset Restricted

Supplemental dataset for Northern Spotted Owl (Strix occidentalis caurina) genome assembly version 1.0

  • 1. Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, California, United States of America; Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America; Department of Ornithology & Mammalogy, California Academy of Sciences, San Francisco, California, United States of America; Center for Comparative Genomics, California Academy of Sciences, San Francisco, California, United States of America
  • 2. Department of Ornithology & Mammalogy, California Academy of Sciences, San Francisco, California, United States of America; Center for Comparative Genomics, California Academy of Sciences, San Francisco, California, United States of America
  • 3. Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America; Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, California, United States of America; Department of Ornithology & Mammalogy, California Academy of Sciences, San Francisco, California, United States of America; Center for Comparative Genomics, California Academy of Sciences, San Francisco, California, United States of America
  • 4. Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, California, United States of America; Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
  • 5. UMR 7205 Institut de Systématique, Evolution, Biodiversité, CNRS, MNHN, UPMC, EPHE, Sorbonne Universités, Département Systématique et Evolution, Muséum National d'Histoire Naturelle, Paris, France; Department of Ornithology & Mammalogy, California Academy of Sciences, San Francisco, California, United States of America
  • 6. Runckel & Associates, Portland, Oregon, United States of America; Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California, United States of America; Howard Hughes Medical Institute, Bethesda, Maryland, United States of America
  • 7. Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, California, United States of America
  • 8. Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California, United States of America; Howard Hughes Medical Institute, Bethesda, Maryland, United States of America

Description

StrOccCau_1.0_nuc_masked.fa.bz2 : This FASTA format file compressed with bzip2 is the file that you will most likely want to download if you would like to perform an alignment to this genome assembly. This file is the repeat-masked assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012) without any contigs and scaffolds less than 1,000 nt and also without the contigs and scaffolds that we identified either as the mitochondrial genome sequence or as contaminant sequences.
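As a minimal sketch of how a bzip2-compressed FASTA file such as this one could be read without decompressing it to disk, the Python below streams records and reports how much of each sequence appears masked. The description does not say whether masking is soft (lowercase) or hard (N), so the code counts both; the function names are illustrative and not part of the dataset.

```python
import bz2
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name = None
    chunks = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the name.
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def masked_fraction(seq: str) -> float:
    """Fraction of positions that are lowercase or N.

    Assumption: lowercase marks soft-masking and N marks hard-masking;
    N also marks scaffold gaps, so this is an upper bound on masked repeats.
    """
    if not seq:
        return 0.0
    masked = sum(1 for base in seq if base.islower() or base == "N")
    return masked / len(seq)


def report(path: str) -> None:
    """Print name, length, and masked fraction for every record in a .fa.bz2 file."""
    with bz2.open(path, "rt") as handle:
        for name, seq in read_fasta(handle):
            print(name, len(seq), round(masked_fraction(seq), 4))
```

Calling `report("StrOccCau_1.0_nuc_masked.fa.bz2")` would print one line per contig or scaffold, assuming the file has been downloaded to the working directory.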

StrOccCau_1.0_nuc.fa.bz2 : This FASTA format file compressed with bzip2 is the file that we uploaded to the NCBI Whole Genome Shotgun (WGS) project database. This file is the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012) without any contigs and scaffolds less than 1,000 nt and also without the contigs and scaffolds that we identified either as the mitochondrial genome sequence or as contaminant sequences.

StrOccCau_1.0_mito.fa : This FASTA format file is the mitochondrial-genome-derived scaffold from the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012).

StrOccCau_1.0.gff.bz2 : This GFF format file compressed with bzip2 contains the gene annotations of StrOccCau_1.0_nuc.fa.
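A quick way to get oriented in an annotation file like this is to tally its feature types. The sketch below assumes the conventional tab-delimited, nine-column GFF layout with the feature type in the third column; the record does not state the GFF version, and the function names are hypothetical.

```python
import bz2
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count values of the third (feature type) column in GFF-style lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment or directive lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a standard nine-column feature line
        counts[fields[2]] += 1
    return counts


def count_feature_types_in_file(path: str) -> Counter:
    """Tally feature types directly from a bzip2-compressed GFF file."""
    with bz2.open(path, "rt") as handle:
        return count_feature_types(handle)
```

For example, `count_feature_types_in_file("StrOccCau_1.0.gff.bz2")` would return a `Counter` keyed by whatever feature types the file actually uses.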

StrOccCau_1.0_transcripts.fa.bz2 : This FASTA format file compressed with bzip2 contains the transcript sequences of the genes annotated in StrOccCau_1.0.gff.

StrOccCau_1.0_proteins.fa.bz2 : This FASTA format file compressed with bzip2 contains the protein sequences of the genes annotated in StrOccCau_1.0.gff.

StrOccCau_1.0_RM_homology_includes_LowComplexity.out.bz2 : This file provides the repeat annotations produced by the homology-based masking of StrOccCau_1.0_nuc.fa that included masking of low complexity regions and simple repeats. We compressed this file with bzip2.

StrOccCau_1.0_RM_DeNovo_includes_LowComplexity.out : This file provides the repeat annotations produced by the de novo masking (performed after homology-based masking) of StrOccCau_1.0_nuc.fa that included masking of low complexity regions and simple repeats.

StrOccCau_1.0_RM_homology_no_LowComplexity.out.bz2 : This file provides the repeat annotations produced by the homology-based masking of StrOccCau_1.0_nuc.fa that did not include masking of low complexity regions and simple repeats. We compressed this file with bzip2.

StrOccCau_1.0_RM_DeNovo_no_LowComplexity.out : This file provides the repeat annotations produced by the de novo masking (performed after homology-based masking) of StrOccCau_1.0_nuc.fa that did not include masking of low complexity regions and simple repeats.
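The four repeat annotation files above could be summarized with something like the following sketch, which totals annotated bases per repeat class. It assumes the files follow the whitespace-delimited RepeatMasker .out layout (score in column 1, query begin and end in columns 6 and 7, repeat class/family in column 11); the record does not name the masking program explicitly, so treat that layout as an assumption to check against the actual files.

```python
from collections import defaultdict
from typing import Dict, Iterable


def bases_by_repeat_class(lines: Iterable[str]) -> Dict[str, int]:
    """Sum annotated query bases per repeat class/family.

    Overlapping annotations are counted once per annotation, so totals
    can exceed the number of distinct masked bases.
    """
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Header and blank lines do not begin with an integer score.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        begin, end = int(fields[5]), int(fields[6])
        # Coordinates are treated as 1-based and inclusive.
        totals[fields[10]] += end - begin + 1
    return dict(totals)
```

The function accepts any iterable of lines, so an uncompressed .out file can be passed as an open file handle and a .out.bz2 file as a handle from `bz2.open(path, "rt")`.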

NSO-wgs-v1-alignments-of-light-associated-genes.txt : This file provides alignments of light-associated gene orthologs in NEXUS format.

StrOccCau_1.0_nuc_masked_SpottedBarredOwl_variant_file.vcf.bz2 : This is a raw, unfiltered variant call format (VCF) file compressed with bzip2 that was generated by aligning both spotted owl and barred owl short read data to StrOccCau_1.0_nuc_masked.fa.
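Because this VCF is raw and unfiltered, most uses will start with some filtering. The sketch below streams the compressed file and keeps records at or above a QUAL threshold, relying only on the standard VCF column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The default threshold of 30 is an arbitrary placeholder, not a value recommended by the dataset authors, and the function names are illustrative.

```python
import bz2
from typing import Iterable, Iterator, List


def records_passing_qual(lines: Iterable[str], min_qual: float) -> Iterator[List[str]]:
    """Yield tab-split VCF records whose QUAL column is at least min_qual."""
    for line in lines:
        if line.startswith("#"):
            continue  # meta-information and column header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # blank or malformed line
        qual = fields[5]
        if qual == ".":
            continue  # missing quality; treated here as not passing
        if float(qual) >= min_qual:
            yield fields


def count_passing(path: str, min_qual: float = 30.0) -> int:
    """Count passing records in a bzip2-compressed VCF (threshold is a placeholder)."""
    with bz2.open(path, "rt") as handle:
        return sum(1 for _ in records_passing_qual(handle, min_qual))
```

Streaming line by line keeps memory use flat, which matters for a whole-genome variant file.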

StrOccCau_0.1.fa.bz2 : This FASTA format file compressed with bzip2 is the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012).

StrOccCau_0.1_masked.fa.bz2 : This FASTA format file compressed with bzip2 is the repeat-masked assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012).

StrOccCau_0.2.fa.bz2 : This FASTA format file compressed with bzip2 is the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012) without any contigs and scaffolds less than 1,000 nt.
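The 1,000 nt cutoff described here amounts to a simple length filter. The following sketch illustrates the criterion on (name, sequence) pairs; it is not the tool the authors used, and the names are hypothetical. Sequences of exactly 1,000 nt are kept, since only those less than 1,000 nt are excluded.

```python
from typing import Iterable, Iterator, Tuple

MIN_LENGTH_NT = 1000  # cutoff stated in the file description


def drop_short_sequences(
    records: Iterable[Tuple[str, str]], min_length: int = MIN_LENGTH_NT
) -> Iterator[Tuple[str, str]]:
    """Yield only records whose sequence is at least min_length nt long."""
    for name, seq in records:
        if len(seq) >= min_length:
            yield name, seq
```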

StrOccCau_0.2_masked.fa.bz2 : This FASTA format file compressed with bzip2 is the repeat-masked assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012) without any contigs and scaffolds less than 1,000 nt.

StrOccCau_GapCloser_output_NoContamNoMito.fa.bz2 : This FASTA format file compressed with bzip2 is the assembly output from SOAPdenovo2 toolkit GapCloser version 1.12-r6 (Luo et al. 2012) without the contigs and scaffolds that we later identified either as the mitochondrial genome sequence or as contaminant sequences.

References

Luo R., Liu B., Xie Y., Li Z., Huang W., Yuan J., He G., Chen Y., Pan Q., Liu Y., Tang J., Wu G., Zhang H., Shi Y., Liu Y., Yu C., Wang B., Lu Y., Han C., Cheung DW., Yiu S-M., Peng S., Xiaoqian Z., Liu G., Liao X., Li Y., Yang H., Wang J., Lam T-W., Wang J. 2012. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1:18. DOI: 10.1186/2047-217X-1-18.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.