Project deliverable Open Access
Pelletier, Eric; Corre, Erwan; Niang, Guita; Meng, Arnaud; Hoebeke, Mark; Finn, Robert
A new resource, METdb, has been established to house the data produced by running the assembly and annotation workflows on 489 transcriptomes. This resource dramatically increases the representation of micro-eukaryotic organisms, both in METdb itself and, through propagation, in core data resources such as ENA and UniProtKB. METdb provides a unique collection of eukaryotic gene annotations and is expected to become an important reference collection for the interpretation of marine metagenomics datasets.
For this deliverable, we developed two new pipelines for the assembly and annotation of marine micro-eukaryotic transcriptomes. These have been converted to CWL (other work), leveraging many of the tool descriptions produced as part of Deliverable D6.3. Their application highlights the re-use of CWL tool descriptions and the outputs of the Compute (WP4) and Interoperability (WP5) platforms, and demonstrates how workflows can be used to create new data resources.
A web interface has been developed to give users access to the data contained within METdb. Importantly, it exposes data that had previously languished on undiscoverable laboratory websites, increasing the discoverability of these data.
While this represents an important new development, the marine micro-eukaryotic kingdom remains massively undersampled despite its sequence diversity. The annotations in METdb provide another resource for the discovery of novel enzymes for the biotechnology sector (as well as for the new Microbial Biotechnology Community).