Published July 16, 2021 | Version 1.0
Dataset Open

A New Comprehensive Annotation of Leucine-Rich Repeat-Containing Receptors in Rice

Description

Datasets for the preprint (https://doi.org/10.1101/2021.01.29.428842) entitled "A New Comprehensive Annotation of Leucine-Rich Repeat-Containing Receptors in Rice".

This paper describes an in-depth manual curation of LRR-CR annotations. The curated set includes genes containing nonsense mutations, which are tagged as 'non-canonical', as opposed to 'canonical' genes, which have the expected gene models.

 

Contains 7 files for each rice cultivar:

- domain annotation for each LRR-CR protein (xxx_LRR_domains_filtered.csv)

- gff file containing LRR-CR gene model annotations

- fasta file (1): the complete genes 'gene' (nucleotide sequence, exons and introns)

- fasta file (2): the coding sequences 'CDS' (nucleotide sequence, exons, translatable). Note that for genes with a frameshift, the one or two bases that cause the frameshift are omitted so that the nucleotide sequence can be translated. Terminal and in-frame stop codons are encoded by a '*'.

- fasta file (3): protein sequences 'PEP' (amino acid sequence, exons, corresponding to the translation of CDS (2))

- fasta file (4): 'cDNA' (nucleotide sequence, exons). Note that for genes with a frameshift, the protein sequence translated from this file will not correspond to the one in file (3). For all other genes, the "CDS" and "cDNA" files contain the same nucleotide sequences.

- fasta file (5): 'cDNA_wFrameshit' (nucleotide sequence, exons): same sequences as in (4), except for genes with a frameshift, whose sequences are padded with one or two "!" characters at the position of the frameshift in order to preserve the correct reading frame (a convention also used by V. Ranwez et al. for the MACSE programs, https://dx.doi.org/10.1093/molbev/msy159).
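The "!" convention used in file (5) can be illustrated with a minimal Python sketch. It is not part of the dataset or of MACSE. The function names, the record identifier and the example sequence are hypothetical, and the sketch assumes that each "!" occupies one position in a codon, so that a padded sequence splits cleanly into triplets.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a {record_id: sequence} dict."""
    chunks = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


def frameshift_codons(seq):
    """Return (codon_index, codon) for every codon that contains a '!'."""
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return [(i, codon) for i, codon in enumerate(codons) if "!" in codon]


def strip_frameshift_marks(seq):
    """Remove the '!' padding, giving the sequence as it would appear in file (4)."""
    return seq.replace("!", "")


# Hypothetical record: one "!" pads the second codon.
example = read_fasta([">geneA example", "ATGGC!", "AAATAG"])
marked = frameshift_codons(example["geneA"])
```

With an open file handle in place of the example list, `read_fasta` could be applied to the 'cDNA_wFrameshit' file of one cultivar, and `frameshift_codons` would then list the padded codons of each gene.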

 

Files

LRR-CR-Oryza-sativa.zip

Files (10.1 MB)

Name Size
md5:02e44ba6e4334071f3bb7930d1fde234 10.1 MB
md5:9925695967d355637d71f62aeac17184 2.0 kB
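
The md5 values listed above can be used to check a downloaded file for corruption. The following is a minimal Python sketch using the standard `hashlib` module. The function names and the local file path are hypothetical, and the record does not state which checksum belongs to which file name, so the pairing must be confirmed on the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with an expected value such as 'md5:02e4...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum("LRR-CR-Oryza-sativa.zip", "md5:02e44ba6e4334071f3bb7930d1fde234")` would return `True` only if the local file (path assumed here) has exactly that digest.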

Additional details

Related works

Cites
Preprint: 10.1101/2021.01.29.428842 (DOI)