Published September 30, 2024 | Version v27

TAIR functional annotation data

  • 1. Phoenix Bioinformatics

Contributors

Data curator:

Hosting institution:

  • 1. Phoenix Bioinformatics

Description

Quarterly release of curated gene function data for Arabidopsis thaliana from The Arabidopsis Information Resource (www.arabidopsis.org).

The compressed archive contains the following files, which are described in detail in the included README file.


1. ATH_GO_GOSLIM.txt.gz
This is a tab-delimited file containing GO annotations for Arabidopsis genes, annotated by TAIR and TIGR with terms from the Gene Ontology Consortium controlled vocabularies (see www.geneontology.org). It includes an updated set of literature-based annotations and >40,000 electronic annotations based on matches to INTERPRO domains supplied by Nicola Mulder from SWISS PROT/INTERPRO.

Please cite this paper when using TAIR's GO annotations in your research:  Berardini, TZ, Mundodi, S, Reiser, L, Huala, E, Garcia-Hernandez, M, Zhang, P, Mueller, LM, Yoon, J, Doyle, A, Lander, G, Moseyko, N, Yoo, D, Xu, I, Zoeckler, B, Montoya, M, Miller, N, Weems, D, and Rhee, SY (2004) Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135(2):1-11.  
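
As a minimal sketch of how a tab-delimited, gzip-compressed file such as ATH_GO_GOSLIM.txt.gz could be read, the Python function below yields one list of column values per row. The function name and the assumption that header or comment lines start with "!" are illustrative only; the actual column layout is documented in the README shipped with the archive and is not reproduced here.

```python
import csv
import gzip


def read_tab_delimited_gz(path, comment_prefix="!"):
    """Yield each data row of a gzipped tab-delimited file as a list of strings."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quote characters inside fields untouched.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Skip blank lines and lines assumed to be headers or comments.
            if not row or row[0].startswith(comment_prefix):
                continue
            yield row
```

A caller would iterate over `read_tab_delimited_gz("ATH_GO_GOSLIM.txt.gz")` and pick out columns by position after checking the README for their meaning.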


2. gene_aliases_yyyymmdd.txt(.gz)
This file lists alternative names for each gene.


3. Locus_Germplasm_Phenotype_yyyymmdd.txt.gz
This file contains links between loci, germplasms, and phenotypes.

4. Locus_Published_yyyymmdd.txt.gz
This file contains links between loci and publications.

5. po_temporal_gene_arabidopsis_tair.assoc.gz
6. po_anatomy_gene_arabidopsis_tair.assoc.gz
These two files are tab-delimited. Each contains the set of literature-based annotations of Arabidopsis genes and loci annotated at TAIR to terms from the Plant Ontology developed by the Plant Ontology Consortium (POC, www.plantontology.org).


7. TAIR10 or ARAPORT11_functional_descriptions_yyyymmdd.txt(.gz)
This file contains functional descriptions for gene models included in either the TAIR10 genome release or, as of 20170630, the Araport11 genome release. TAIR10/Araport11 refers to the version of the genome annotation.
 

8. Araport11_GFF3_genes_transposons.MMMYYYY.gff.gz
This document is a tab-delimited file in GFF format. It contains annotations from the Araport11 genome release, including information curated from recent scientific literature.
Note:  This file is available starting with the 20211231 Data Release.

Column header: explanation
1. Name of the chromosome
2. Source: Name of the data source that generated this feature (Araport11)
3. Annotation type: e.g. gene, mRNA, etc.
4. Start position of annotation.
5. Stop position of annotation.
6. Score - A floating point value.
7. Strand information. Defined as + (forward) or - (reverse).
8. Frame - One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
9. Detailed annotation information with a semicolon-separated list of tag-value pairs, providing additional information about each feature, including curator summary, computational description, etc.
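
The nine columns listed above can be split apart with a few lines of Python. This is a sketch under stated assumptions, not the format's reference parser: the field names (`seqid`, `phase`, and so on) follow the general GFF3 convention rather than this record, the attribute column is assumed to use the GFF3 `tag=value` form with URL-encoded values, and the example line is made up for illustration.

```python
from urllib.parse import unquote

# Field names are an assumption borrowed from the GFF3 convention.
GFF_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # A "." in the score column means no score was assigned.
    record["score"] = None if record["score"] == "." else float(record["score"])
    attributes = {}
    for pair in record["attributes"].split(";"):
        pair = pair.strip()
        if not pair:
            continue
        tag, _, value = pair.partition("=")
        attributes[tag] = unquote(value)
    record["attributes"] = attributes
    return record


# Illustrative line only; not copied from the released file.
example = "Chr1\tAraport11\tgene\t3631\t5899\t.\t+\t.\tID=AT1G01010;Note=example%20note"
gene = parse_gff3_line(example)
print(gene["type"], gene["start"], gene["end"], gene["attributes"]["ID"])
```

Lines starting with "#" in a GFF3 file are directives or comments and would need to be skipped before calling the function.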


9. Araport11_GTF_genes_transposons.MMMYYYY.gtf.gz
This document is a tab-delimited file in GTF format. It contains annotations from the Araport11 genome release, including information curated from recent scientific literature.
Note:  This file is available starting with the 20211231 Data Release.

Column header: explanation
1. Name of the chromosome
2. Source: Name of the data source that generated this feature (Araport11)
3. Annotation type: e.g. gene, mRNA, etc.
4. Start position of annotation.
5. Stop position of annotation.
6. Score - A floating point value.
7. Strand information. Defined as + (forward) or - (reverse).
8. Frame - One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
9. Detailed annotation information with a semicolon-separated list of tag-value pairs, providing additional information about each feature, including transcript_id, gene_id, Note, etc.
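
The ninth GTF column differs from GFF3 in how tag-value pairs are written. The Python sketch below assumes the common GTF convention of `tag "value";` pairs with double-quoted values; the function names and the example line are illustrative and not taken from the released file.

```python
import re

# Matches one pair such as: gene_id "AT1G01010"
# Unquoted values, if any occur, are ignored by this pattern.
_GTF_PAIR = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_gtf_attributes(text):
    """Return the quoted tag/value pairs of a GTF attribute column as a dict."""
    return dict(_GTF_PAIR.findall(text))


def parse_gtf_line(line):
    """Return (first eight columns, attribute dict) for one GTF feature line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    *fixed, attribute_text = fields
    return fixed, parse_gtf_attributes(attribute_text)


# Illustrative line only; not copied from the released file.
example = (
    "Chr1\tAraport11\tmRNA\t3631\t5899\t.\t+\t.\t"
    'transcript_id "AT1G01010.1"; gene_id "AT1G01010"; Note "illustrative; not real data";'
)
fixed_columns, attrs = parse_gtf_line(example)
print(fixed_columns[2], attrs["gene_id"], attrs["transcript_id"])
```

Matching on the quoted value rather than splitting on ";" keeps a Note that itself contains a semicolon in one piece.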

Files (58.7 MB)

md5:9f6cb8e5a739f4004ecb01897abe9719 (2.6 MB)
md5:c72c998d26ef28180409370d74229a92 (16.5 MB)
md5:800f591ad2c73622927b4fa611283ffa (6.6 MB)
md5:c7dc92fb332af78f946e0ed6d3afc2e5 (7.3 MB)
md5:184c4c8a8bfa81c33979f428d285ffb5 (372.8 kB)
md5:9e3c7abe748ba386b99f92a406fda7d4 (759.0 kB)
md5:9e1e5d4b4a951ca020ec758149aa1405 (2.2 MB)
md5:ad94ade91f0e466d9d41f1c18941797d (14.7 MB)
md5:c339aad6f4643826147b8f64f35233ea (7.5 MB)
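
The md5 values listed for each file can be used to check a download for corruption. The following Python sketch uses only the standard library; the function names are illustrative, and which checksum belongs to which file name has to be read from the record page, since the names do not appear in the listing here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(downloaded_path, "md5:9f6cb8e5a739f4004ecb01897abe9719")` would return True only if the downloaded file is byte-for-byte identical to the 2.6 MB file in the listing.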

Additional details

Funding

Phoenix Bioinformatics

Dates

Collected
2024-09-30 (data collected as of this date)

References

  • Berardini, TZ, Mundodi, S, Reiser, L, Huala, E, Garcia-Hernandez, M, Zhang, P, Mueller, LM, Yoon, J, Doyle, A, Lander, G, Moseyko, N, Yoo, D, Xu, I, Zoeckler, B, Montoya, M, Miller, N, Weems, D, and Rhee, SY (2004) Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135(2):1-11. DOI:10.1104/pp.104.040071
  • Huala E, Dickerman AW, Garcia-Hernandez M, Weems D, Reiser L, LaFond F, Hanley D, Kiphart D, Zhuang M, Huang W, Mueller LA, Bhattacharyya D, Bhaya D, Sobral BW, Beavis W, Meinke DW, Town CD, Somerville C, Rhee SY. The Arabidopsis Information Resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant. Nucleic Acids Res. 2001 Jan 1;29(1):102-5. PubMed PMID: 11125061; PubMed Central PMCID: PMC29827.