There is a newer version of the record available.

Published December 31, 2023 | Version TAIR_Data_20231231
Dataset Open

TAIR functional annotation data

  • 1. Phoenix Bioinformatics

Contributors

Data curator:

Hosting institution:

  • 1. Phoenix Bioinformatics

Description

Quarterly release of curated gene function data for Arabidopsis thaliana from The Arabidopsis Information Resource (www.arabidopsis.org).

The compressed archive contains the following files, which are described in detail in the included README file.


1. ATH_GO_GOSLIM.txt.gz
This tab-delimited file contains GO annotations for Arabidopsis genes, annotated by TAIR and TIGR with terms from the Gene Ontology Consortium controlled vocabularies (see www.geneontology.org). It includes an updated set of literature-based annotations and more than 40,000 electronic annotations based on matches to INTERPRO domains, supplied by Nicola Mulder from SWISS PROT/INTERPRO.

Please cite this paper when using TAIR's GO annotations in your research: Berardini, TZ, Mundodi, S, Reiser, L, Huala, E, Garcia-Hernandez, M, Zhang, P, Mueller, LM, Yoon, J, Doyle, A, Lander, G, Moseyko, N, Yoo, D, Xu, I, Zoeckler, B, Montoya, M, Miller, N, Weems, D, and Rhee, SY (2004) Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135(2):1-11.


2. gene_aliases_yyyymmdd.txt(.gz)
This file lists alternative names for each gene.


3. Locus_Germplasm_Phenotype_yyyymmdd.txt.gz
This file contains links between loci, germplasms, and phenotypes.

4. Locus_Published_yyyymmdd.txt.gz
This file contains links between loci and publications.

5. po_temporal_gene_arabidopsis_tair.assoc.gz
po_anatomy_gene_arabidopsis_tair.assoc.gz
These two files are tab-delimited. Each contains the set of literature-based annotations of Arabidopsis genes and loci, annotated at TAIR with terms from the Plant Ontology developed by the Plant Ontology Consortium (POC, www.plantontology.org).
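
The tab-delimited, gzip-compressed files listed above can be read without unpacking them first. The following is a minimal sketch, not an official TAIR reader: the function name `read_tab_delimited_gz` is hypothetical, and the choice to skip lines starting with `!` is an assumption borrowed from the common comment convention in ontology association files. The column layout of each file is not assumed here; consult the included README for it.

```python
import gzip
from typing import Iterator, List


def read_tab_delimited_gz(path: str, comment_prefix: str = "!") -> Iterator[List[str]]:
    """Yield each data row of a gzip-compressed, tab-delimited file as a list of strings.

    Blank lines and lines starting with `comment_prefix` are skipped.
    The "!" default is an assumption, not something stated in this record.
    """
    # "rt" opens the archive in text mode so lines arrive already decoded.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if not line or line.startswith(comment_prefix):
                continue
            yield line.split("\t")
```

Because the function is a generator, a large file such as the GO annotation table can be scanned row by row without loading it into memory, for example with `for row in read_tab_delimited_gz("ATH_GO_GOSLIM.txt.gz"): ...`.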


6. TAIR10 or ARAPORT11_functional_descriptions_yyyymmdd.txt(.gz)
This file contains functional descriptions for gene models included in either the TAIR10 or, as of 20170630, the Araport11 genome release. TAIR10/Araport11 refers to the version of the genome annotation.
 

7. Araport11_GFF3_genes_transposons.[DATE].gff.gz

8. Araport11_GFF3_genes_transposons.MMMYYYY.gff.gz
This tab-delimited file is in GFF format and contains annotations from the Araport11 genome release, including information curated from recent scientific literature.
Note:  This file is available starting with the 20211231 Data Release.

Column header: explanation
1. Name of the chromosome.
2. Source: name of the data source that generated this feature (Araport11).
3. Annotation type: e.g. gene, mRNA, etc.
4. Start position of the annotation.
5. Stop position of the annotation.
6. Score: a floating-point value.
7. Strand: + (forward) or - (reverse).
8. Frame: one of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
9. Detailed annotation information: a semicolon-separated list of tag-value pairs providing additional information about each feature, including curator summary, computational description, etc.
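
The nine columns described above map naturally onto a small record type. The sketch below is illustrative only: the names `GffFeature` and `parse_gff3_line` are hypothetical, and it assumes the usual GFF3 conventions that are not spelled out in this record, namely `tag=value` attribute pairs, URL-escaped attribute values, `.` as the placeholder for a missing score or frame, and `#` for comment lines.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import unquote


@dataclass
class GffFeature:
    chromosome: str
    source: str
    feature_type: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: Optional[int]
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Optional[GffFeature]:
    """Parse one GFF line into its nine columns; return None for comments and blanks."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, found {len(cols)}")
    # Column 9: semicolon-separated tag=value pairs (assumed GFF3 syntax).
    attributes: Dict[str, str] = {}
    for pair in cols[8].split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[key] = unquote(value)
    return GffFeature(
        chromosome=cols[0],
        source=cols[1],
        feature_type=cols[2],
        start=int(cols[3]),
        end=int(cols[4]),
        score=None if cols[5] == "." else float(cols[5]),
        strand=cols[6],
        frame=None if cols[7] == "." else int(cols[7]),
        attributes=attributes,
    )


# Illustrative line only; it is not copied from the released file.
example = "Chr1\tAraport11\tgene\t3631\t5899\t.\t+\t.\tID=AT1G01010;Name=AT1G01010"
feature = parse_gff3_line(example)
```

Returning `None` for comment lines lets a caller filter a whole file with a single loop and keep only real features.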


9. Araport11_GTF_genes_transposons.MMMYYYY.gtf.gz
This tab-delimited file is in GTF format and contains annotations from the Araport11 genome release, including information curated from recent scientific literature.
Note:  This file is available starting with the 20211231 Data Release.

Column header: explanation
1. Name of the chromosome.
2. Source: name of the data source that generated this feature (Araport11).
3. Annotation type: e.g. gene, mRNA, etc.
4. Start position of the annotation.
5. Stop position of the annotation.
6. Score: a floating-point value.
7. Strand: + (forward) or - (reverse).
8. Frame: one of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
9. Detailed annotation information: a semicolon-separated list of tag-value pairs providing additional information about each feature, including transcript_id, gene_id, Note, etc.
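
The ninth GTF column differs from its GFF counterpart mainly in how the tag-value pairs are written. The sketch below assumes the common GTF style of `tag "value";` pairs (with bare, unquoted values also tolerated); that syntax is an assumption, not something stated in this record, and the name `parse_gtf_attributes` is hypothetical. The first eight columns can be split on tabs exactly as for GFF.

```python
import re
from typing import Dict

# One attribute: a tag, whitespace, then either a double-quoted value
# (which may itself contain semicolons) or a bare value without spaces.
_GTF_ATTRIBUTE = re.compile(r'\s*(\S+)\s+(?:"([^"]*)"|([^;"\s]+))\s*;?')


def parse_gtf_attributes(field: str) -> Dict[str, str]:
    """Return the tag-value pairs of a GTF attribute column as a dict.

    If a tag occurs more than once, the last value wins.
    """
    attributes: Dict[str, str] = {}
    for match in _GTF_ATTRIBUTE.finditer(field):
        key, quoted, bare = match.groups()
        attributes[key] = quoted if quoted is not None else bare
    return attributes


# Illustrative attribute string only; it is not copied from the released file.
example = 'transcript_id "AT1G01010.1"; gene_id "AT1G01010"; Note "NAC domain; containing protein 1";'
parsed = parse_gtf_attributes(example)
```

A regular expression is used instead of a plain split on `;` so that a free-text value such as a Note containing a semicolon stays in one piece.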

Files

Files (58.5 MB)

Checksum Size
md5:fc8135a933202f875c94eef3f87a1102 2.5 MB
md5:d72af6f0dfbed7c9ad4fe0cf22ebd0d2 16.6 MB
md5:95f5ab10abba0793949e52c0d60006d9 6.6 MB
md5:fde4839e0120e667eb29ddeda30c6098 7.3 MB
md5:afedb5772cfee32633ed44424b08de90 369.9 kB
md5:ac3e15d90170980129714e56554e32f8 752.7 kB
md5:aa6003604b786cde27c3f73c02c983a5 2.2 MB
md5:1625f31d84653a01c2c6b439f0c08cd2 14.6 MB
md5:da9c36ecf60caca13af19b7a18b60159 7.5 MB
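
The MD5 checksums listed above can be used to confirm that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_checksum` are hypothetical, and since this listing does not pair checksums with file names, matching a local file to its checksum is left to the reader.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hexadecimal MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with a value written as 'md5:<hex>' or plain hex."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat even for the larger archives. MD5 is suitable here as an integrity check against corrupted downloads, not as protection against deliberate tampering.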

Additional details

Dates

Issued
2023-12-31
Data collected as of

References

  • Berardini, TZ, Mundodi, S, Reiser, L, Huala, E, Garcia-Hernandez, M, Zhang, P, Mueller, LM, Yoon, J, Doyle, A, Lander, G, Moseyko, N, Yoo, D, Xu, I, Zoeckler, B, Montoya, M, Miller, N, Weems, D, and Rhee, SY (2004) Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135(2):1-11. DOI:10.1104/pp.104.040071
  • Huala E, Dickerman AW, Garcia-Hernandez M, Weems D, Reiser L, LaFond F, Hanley D, Kiphart D, Zhuang M, Huang W, Mueller LA, Bhattacharyya D, Bhaya D, Sobral BW, Beavis W, Meinke DW, Town CD, Somerville C, Rhee SY. The Arabidopsis Information Resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant. Nucleic Acids Res. 2001 Jan 1;29(1):102-5. PubMed PMID: 11125061; PubMed Central PMCID: PMC29827.