Genomic sequences and annotations for Solanum lycopersicum, Solanum pennellii and Solanum habrochaites
Description
=== Genome sequences ===
Genome reference assemblies (FASTA format) are available for:
- Solanum lycopersicum:
- Solanum pennellii (single version, from Bolger et al., 2014):
- Solanum habrochaites LA1777 (2018 Technology Hotel project):
- Solanum habrochaites PI127826:
  - 2018 Hotel Project: PI127826.final.fasta
  - 2021 Dovetail Genomics assembly: PI127826_hirise_assembly.fasta.gz
- Solanum habrochaites LYC4 (from Aflitos et al., 2014; third assembly version):
- Solanum arcanum LA2172 (from Aflitos et al., 2014; third assembly version):
- Solanum chilense LA3111 (from Stam et al., 2019; NCBI assembly ASM601370v1):
- Solanum lycopersicoides LA2951 (from the Boyce Thompson Institute and RWTH Aachen University):
The 2018 genome assemblies of S. habrochaites LA1777 and PI127826 were obtained by combining 10x Genomics Linked-Reads with Bionano optical mapping. The sequencing was funded by the DTL Technology Hotel 2018 funding scheme.
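The FASTA references listed above can be checked after download with a few lines of Python. The following is a minimal sketch, not part of the data record: the function names `read_fasta` and `assembly_summary` are hypothetical, and it assumes only that the files are standard FASTA, optionally gzip-compressed (as with PI127826_hirise_assembly.fasta.gz).

```python
import gzip
from pathlib import Path
from typing import Dict, Iterator, List, Optional, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from a plain or gzipped FASTA file."""
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if Path(path).suffix == ".gz" else open
    seq_id: Optional[str] = None
    chunks: List[str] = []
    with opener(path, "rt") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            if line.startswith(">"):
                if seq_id is not None:
                    yield seq_id, "".join(chunks)
                # Keep only the first whitespace-delimited token of the header.
                header = line[1:].split()
                seq_id = header[0] if header else ""
                chunks = []
            else:
                chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def assembly_summary(path: str) -> Dict[str, int]:
    """Return the number of sequences, their total length and the longest one."""
    count = 0
    total = 0
    longest = 0
    for _, sequence in read_fasta(path):
        count += 1
        total += len(sequence)
        longest = max(longest, len(sequence))
    return {"sequences": count, "total_length": total, "longest": longest}
```

Assuming the archive has been downloaded to the working directory, `assembly_summary("PI127826_hirise_assembly.fasta.gz")` would report the scaffold count and total assembly length. Sequences are streamed one record at a time, so only a single scaffold is held in memory at once.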
=== Transcriptomes and proteomes ===
- Solanum lycopersicum (assembly 4.0):
  - Transcriptome: ITAG4.0_cDNA.fasta
  - Proteome: ITAG4.0_proteins.fasta
- Solanum pennellii (single version, from Bolger et al., 2014):
  - Transcriptome: Spenn-v2-cds-annot.fa
  - Proteome: Spenn-v2-aa-annot.fa
- Solanum lycopersicoides (version 1.0):
  - Transcriptome: S_lycopersicoides_LA2951_v1.0_cds.fasta
  - Proteome: S_lycopersicoides_LA2951_v1.0_proteins.fasta
- Solanum habrochaites PI127826:
  - Transcriptome (2018 Hotel Project assembly): Solanum_habrochaites_PI12826_mRNAs.fasta
  - Transcriptome (2021 Dovetail Genomics assembly): Solanum_habrochaites_PI127826_CDS_Dovetails_2021.fasta
  - Proteome (2021 Dovetail Genomics assembly): Solanum_habrochaites_PI127826_protein_Dovetails_2021.fasta
=== Genome annotation files ===
Solanum lycopersicum Heinz1706
- ITAG2.4
  - General Feature Format (GFF): ITAG2.4_gene_models.gff
  - Gene Transfer Format (GTF): ITAG2.4_gene_models.gtf
- ITAG4.0
  - General Feature Format (GFF): ITAG4.0_gene_models.gff
  - Gene Transfer Format (GTF): ITAG4.0_gene_models.gtf
  - MapMan annotation: S_lycopersicum_ITAG4.0_mapping_Mercator_v.3.6.tsv, obtained with Mercator 3.6 from the ITAG4.0_proteins.fasta file.
Solanum lycopersicoides LA2951
- General Feature Format (GFF3): S_lycopersicoides_LA2951_v1.0_gene_models_all.gff3
Solanum habrochaites PI127826
- 2018 Hotel Project assembly: Solanum_habrochaites_PI127826.gff3, produced with RepeatMasker and funannotate. The companion script listing the steps performed, S_habrochaites_PI127826_funannotate_steps.sh, is also available in this data record.
- 2021 Dovetail Genomics assembly: Solanum_habrochaites_PI127826_gene_models.gff
Additional information:
- 2021 Dovetail Genomics complete assembly project report: dovetails_genomics_2021.tar.gz
- 2021 Dovetail Genomics complete annotation project report: dovetails_genomics_annotation_report_2021.tar.gz
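The GFF annotation files listed in this section can be summarised without extra dependencies. The sketch below is illustrative only: the names `Feature`, `parse_attributes`, `read_gff3`, `count_feature_types` and `gene_ids` are hypothetical, and it assumes the files follow the standard nine-column, tab-separated GFF3 layout with `key=value` attributes. It is not written for the GTF files, whose attribute syntax differs, and the feature type names actually present in each file (for example `gene`) have not been checked here.

```python
from collections import Counter
from typing import Dict, Iterator, List, NamedTuple
from urllib.parse import unquote


class Feature(NamedTuple):
    seqid: str
    source: str
    feature_type: str
    start: int
    end: int
    strand: str
    attributes: Dict[str, str]


def parse_attributes(field: str) -> Dict[str, str]:
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attributes: Dict[str, str] = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attributes[key.strip()] = unquote(value.strip())
    return attributes


def read_gff3(path: str) -> Iterator[Feature]:
    """Yield one Feature per annotation line, skipping comments and directives."""
    with open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section may follow the features
            if line.startswith("#") or not line.strip():
                continue
            cols = line.rstrip("\n").split("\t")
            if len(cols) != 9:
                continue  # ignore malformed lines rather than fail
            yield Feature(
                seqid=cols[0],
                source=cols[1],
                feature_type=cols[2],
                start=int(cols[3]),
                end=int(cols[4]),
                strand=cols[6],
                attributes=parse_attributes(cols[8]),
            )


def count_feature_types(path: str) -> Counter:
    """Count how many features of each type the file contains."""
    return Counter(feature.feature_type for feature in read_gff3(path))


def gene_ids(path: str) -> List[str]:
    """Return the ID attribute of every feature whose type is 'gene'."""
    return [
        feature.attributes["ID"]
        for feature in read_gff3(path)
        if feature.feature_type == "gene" and "ID" in feature.attributes
    ]
```

For instance, `count_feature_types("Solanum_habrochaites_PI127826_gene_models.gff")` would give a quick overview of which feature types the 2021 annotation contains, which is a useful sanity check before loading the file into a downstream tool.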
References:
Tomato Genome Sequencing Consortium. 2012. The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485: 635–641.
Bolger et al. 2014. The genome of the stress-tolerant wild tomato species Solanum pennellii. http://www.nature.com/ng/journal/v46/n9/full/ng.3046.html
Hosmani et al. 2019. An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps. https://www.biorxiv.org/content/10.1101/767764v1
Aflitos et al. 2014. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing. https://onlinelibrary.wiley.com/doi/full/10.1111/tpj.12616
Stam et al. 2019. The de novo reference genome and transcriptome assemblies of the wild tomato species Solanum chilense highlights birth and death of NLR genes between tomato species. G3: Genes, Genomes, Genetics 9(12): 3933–3941. https://doi.org/10.1534/g3.119.400529
Additional details
Funding
- Dutch Research Council
- Defence in the wild; from trichome transcriptomes and metabolomes to breeding tools for defence markers in tomato 2300178970