TAIR functional annotation data
Contributors
Data curator:
Data managers:
Hosting institution:
- Phoenix Bioinformatics
Description
Quarterly release of curated gene function data for Arabidopsis thaliana from The Arabidopsis Information Resource (www.arabidopsis.org).
The compressed archive contains the following files, which are described in detail in the included README file.
1.ATH_GO_GOSLIM.txt.gz
This document is a tab-delimited file containing GO annotations for Arabidopsis genes annotated by TAIR and TIGR with terms from the Gene Ontology Consortium controlled vocabularies (see www.geneontology.org). This file includes an updated set of literature-based annotations and more than 40,000 electronic annotations based on matches to INTERPRO domains supplied by Nicola Mulder from SWISS PROT/INTERPRO.
Please cite this paper when using TAIR's GO annotations in your research: Berardini, TZ, Mundodi, S, Reiser, L, Huala, E, Garcia-Hernandez, M, Zhang, P, Mueller, LM, Yoon, J, Doyle, A, Lander, G, Moseyko, N, Yoo, D, Xu, I, Zoeckler, B, Montoya, M, Miller, N, Weems, D, and Rhee, SY (2004) Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135(2):1-11.
2.gene_aliases_yyyymmdd.txt(.gz)
This file lists alternative names for each gene.
3.Locus_Germplasm_Phenotype_yyyymmdd.txt.gz
This file contains links between loci, germplasms, and phenotypes.
4.Locus_Published_yyyymmdd.txt.gz
This file contains links between loci and publications.
5.po_temporal_gene_arabidopsis_tair.assoc.gz
6. po_anatomy_gene_arabidopsis_tair.assoc.gz
These two files are tab-delimited. Each contains the set of literature-based annotations of Arabidopsis genes and loci annotated at TAIR with terms from the Plant Ontology developed by the Plant Ontology Consortium (POC, www.plantontology.org).
7.TAIR10 or ARAPORT11_functional_descriptions_yyyymmdd.txt(.gz)
This file contains functional descriptions for gene models included in either the TAIR10 genome release or, as of the 20170630 release, the Araport11 genome release. TAIR10/Araport11 refers to the version of the genome annotation.
8. Araport11_GFF3_genes_transposons.MMMYYYY.gff.gz
This document is a tab-delimited file in GFF format. It contains annotations from the Araport11 genome release. Annotations in this file include information curated from recent scientific literature. Note: This file is available starting with the 20211231 Data Release.
Column header: explanation
   1. Name of the chromosome.
   2. Source: Name of the data source that generated this feature (Araport11).
   3. Annotation type: e.g. gene, mRNA, etc.
   4. Start position of annotation.
   5. Stop position of annotation.
   6. Score: A floating point value.
   7. Strand information: Defined as + (forward) or - (reverse).
   8. Frame: One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
   9. Detailed annotation information: A semicolon-separated list of tag-value pairs providing additional information about each feature, including curator summary, computational description, etc.
9. Araport11_GTF_genes_transposons.MMMYYYY.gtf.gz
This document is a tab-delimited file in GTF format. It contains annotations from the Araport11 genome release. Annotations in this file include information curated from recent scientific literature. Note: This file is available starting with the 20211231 Data Release.
Column header: explanation
   1. Name of the chromosome.
   2. Source: Name of the data source that generated this feature (Araport11).
   3. Annotation type: e.g. gene, mRNA, etc.
   4. Start position of annotation.
   5. Stop position of annotation.
   6. Score: A floating point value.
   7. Strand information: Defined as + (forward) or - (reverse).
   8. Frame: One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
   9. Detailed annotation information: A semicolon-separated list of tag-value pairs providing additional information about each feature, including transcript_id, gene_id, Note, etc.
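The nine-column layout described for the GFF file can be read with a few lines of Python. The following is a minimal sketch, not part of the TAIR release: the names `Feature`, `parse_gff3_attributes`, and `parse_gff3_line` are hypothetical, the example line is made up rather than copied from the data, and the sketch assumes the GFF3 convention that column 9 is written as `tag=value` pairs with URL-style escaping and that `.` marks an empty score or frame.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import unquote


class Feature(NamedTuple):
    """One annotation row, following the nine columns listed above."""
    chromosome: str
    source: str
    feature_type: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: Optional[int]
    attributes: Dict[str, str]


def parse_gff3_attributes(field: str) -> Dict[str, str]:
    """Split column 9 into a dict, assuming GFF3-style 'tag=value;tag=value'."""
    attributes = {}
    for pair in field.strip().split(";"):
        pair = pair.strip()
        if not pair:
            continue
        tag, _, value = pair.partition("=")
        attributes[tag] = unquote(value)  # undo %-escapes such as %20
    return attributes


def parse_gff3_line(line: str) -> Optional[Feature]:
    """Return a Feature for a data line, or None for comments and blank lines."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    return Feature(
        chromosome=cols[0],
        source=cols[1],
        feature_type=cols[2],
        start=int(cols[3]),
        end=int(cols[4]),
        score=None if cols[5] == "." else float(cols[5]),
        strand=cols[6],
        frame=None if cols[7] == "." else int(cols[7]),
        attributes=parse_gff3_attributes(cols[8]),
    )


# Made-up example line for illustration only (not taken from the release).
example = "Chr1\tAraport11\tgene\t100\t500\t.\t+\t.\tID=GENE0001;Note=hypothetical%20example"
feature = parse_gff3_line(example)
print(feature.feature_type, feature.start, feature.end, feature.attributes["Note"])
```

To read the compressed file directly, the lines can be obtained with the standard library's `gzip.open(path, "rt")` and passed to `parse_gff3_line` one at a time. The GTF file writes column 9 differently (typically `tag "value";` pairs), so the attribute parser here would need adapting for it.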
Files
(58.6 MB)
Checksum | Size |
---|---|
md5:b7e7cf51184de80006eb97a40e1fc1e6 | 2.5 MB |
md5:459b5436d325184c6e3fd01f70754e84 | 16.5 MB |
md5:5b27f0dbe1d74e80f8857c3c9d4f06ee | 6.6 MB |
md5:4d89618f81170f22a80a7c0fb021476d | 7.4 MB |
md5:8d025490884baf4c5c74ca90e98cbf1f | 371.8 kB |
md5:9ea2dda2904e119855ea6f2863df2b62 | 755.7 kB |
md5:fffb4c94a31529e871069e6f417f1ea3 | 2.2 MB |
md5:ecea19c4cef159d6a665c8cf006d04a0 | 14.7 MB |
md5:ddd0e9ae2299f662e671d7f300c41ced | 7.5 MB |
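The md5 values listed with the files can be used to check that a download arrived intact. The sketch below is one way to do this with Python's standard `hashlib` module; the function names `md5_of_file` and `matches_listed_checksum` are hypothetical and are not part of the release.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:0123...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_file.gz", "md5:b7e7cf51184de80006eb97a40e1fc1e6")` returns `True` only if the file's digest equals that value. Here `downloaded_file.gz` is a placeholder, since this listing does not state which file each checksum belongs to.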
Additional details
Related works
- Is documented by
- https://www.arabidopsis.org/ (URL)
- Is identical to
- https://arabidopsis.org/download/index-auto.jsp?dir=%2Fdownload_files%2FPublic_Data_Releases%2FTAIR_Data_20230331 (URL)
Dates
- Issued
- 2024-06-30: Data collected as of
References
- Berardini, TZ, Mundodi, S, Reiser, L, Huala, E, Garcia-Hernandez, M, Zhang, P, Mueller, LM, Yoon, J, Doyle, A, Lander, G, Moseyko, N, Yoo, D, Xu, I, Zoeckler, B, Montoya, M, Miller, N, Weems, D, and Rhee, SY (2004) Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135(2):1-11. DOI:10.1104/pp.104.040071
- Huala E, Dickerman AW, Garcia-Hernandez M, Weems D, Reiser L, LaFond F, Hanley D, Kiphart D, Zhuang M, Huang W, Mueller LA, Bhattacharyya D, Bhaya D, Sobral BW, Beavis W, Meinke DW, Town CD, Somerville C, Rhee SY. The Arabidopsis Information Resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant. Nucleic Acids Res. 2001 Jan 1;29(1):102-5. PubMed PMID: 11125061; PubMed Central PMCID: PMC29827.
Subjects
- Arabidopsis thaliana
- https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=3702