Published December 17, 2020 | Version v1
Project deliverable | Open

CINECA Cohort Level metadata Representation D3.1

  • 1. Simon Fraser University

Description

To support discovery and analysis of human cohort genomic and other “omic” data across jurisdictions, basic data such as cohort participant age and sex need to be harmonised. Developing a key “minimal metadata model” of the basic attributes that should be recorded for all cohorts is critical to aid initial querying across jurisdictions for suitable dataset discovery. We describe here the creation of a minimal metadata model, the specific methods used to create it, and the model’s utility and impact.

 

A first version of the metadata model was built from a review of Maelstrom research data standards and a manual survey of cohort data dictionaries, which identified and incorporated the core variables that overlap across CINECA cohorts. The model was then converted to Genomics Cohorts Knowledge Ontology (GECKO) format and expanded with additional terms. The minimal metadata model is being made broadly available to aid any project, including those outside CINECA, interested in facilitating cross-jurisdictional data discovery and analysis.

Files

CINECA_D3.1_Cohort minimal metadata model_compiled.pdf

Files (1.0 MB)

Additional details

Funding

CINECA – Common Infrastructure for National Cohorts in Europe, Canada, and Africa 825775
European Commission