Published October 1, 2019 | Version v1
Dataset Open

Proteome of the Ceratopteris richardii fern (strain Hn-n)

  • 1. The University of Texas at Austin
  • 2. Ulsan National Institute of Science and Technology

Description

Proteome derived from a de novo transcriptome assembly of fronds, mature gametophytes, and spores of the Ceratopteris richardii Hn-n strain. The transcriptome is deposited at https://www.ebi.ac.uk/ena/data/view/PRJEB33372

After removing low-quality reads (reads lacking all four nucleotides or containing a no-call), we assembled transcripts with Velvet (version 1.2.06) and Oases (version 0.2.06), running one assembly for each of five k-mer values (k=45, 55, 65, 75, 85). We also converted the .fastq file to a non-redundant .fasta file and performed a separate set of de novo transcriptome assemblies with k=35, 45, 55, 65, and 75. The assembled transcripts were combined for each tissue, and redundant or fragmented sequences were then removed based on BLASTN analysis.

We determined the translational reading frame and the corresponding peptide sequence of each assembled transcript by translating it in silico in all six frames and mapping the translations with BLASTP to four plant reference proteome databases (Creinhardtii_169, Osativa_193_pep, Smoellendorffii_91_pep, TAIR10). Sequences without a significant BLASTP score against the reference proteomes were considered non-coding and omitted from the resulting fern proteome database.

The protein sequences derived from the three tissues were combined, and a non-redundant protein sequence set was computed by clustering with UCLUST (version 4.2.66), requiring >97% amino acid identity. The supporting code is available from the NuevoTx repository (https://github.com/taejoonlab/NuevoTx).
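The six-frame in silico translation step mentioned above can be illustrated with a minimal Python sketch. This is not the NuevoTx implementation; the function names (`reverse_complement`, `translate`, `six_frame_translate`) and the frame labels are hypothetical, and the sketch assumes the standard genetic code and translates each frame end to end without splitting at stop codons. In the workflow described above, the frame to keep would then be chosen from the BLASTP results against the reference proteomes.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA sequence from its first base; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to peptide strings."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    # Toy transcript, for illustration only.
    for frame, peptide in six_frame_translate("ATGGCCTAA").items():
        print(frame, peptide)
```

Each of the six peptide strings would be searched against the reference proteomes, and a transcript with no significant hit in any frame would be treated as non-coding.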

This proteome was assembled for "A pan-plant protein complex map reveals deep conservation and novel assemblies".

Files

Files (30.9 MB)

md5:4ea82327e7d64115e5dcfe47bb3a3d7c (30.9 MB)
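
A downloaded copy can be checked against the md5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names (`md5_of_file`, `matches_record`) and the example file path are hypothetical, since the file name is not given in this record.

```python
import hashlib

# Checksum as listed in this record.
EXPECTED_MD5 = "4ea82327e7d64115e5dcfe47bb3a3d7c"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """Return True if the file's MD5 digest equals the checksum in this record."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, `matches_record("downloaded_proteome_file")` (a placeholder path) returns `True` only if the local file is byte-identical to the deposited one.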