Published June 6, 2023 | Version v2
Lesson Open

Building lightweight FAIR data packages with Bioschemas and RO-Crate

  • 1. University of Manchester
  • 2. ZB MED Information Centre for Life Sciences
  • 3. Leiden University Medical Center

Description


Workshop at ELIXIR All Hands meeting 2023 (AHM2023), Dublin, Ireland

This document: DOI:10.5281/zenodo.10552615

 

Date: Tuesday 6 June 2023, 11:00 (local time, IST) 

Chairs:

  • Stian Soiland-Reyes (RO-Crate, ELIXIR-UK, BY-COVID, FAIR-IMPACT, EOSC-Life, EuroScienceGateway)

  • Leyla Jael Castro  (Bioschemas, ELIXIR ML FG, NFDI4DataScience, ZB MED Information Centre for Life Sciences)

  • Núria Queralt Rosinach (ELIXIR NL, ELIXIR ML FG, EJP RD)

Co-organizers:

  • José Mª Fernández (RO-Crate, ELIXIR-ES, EOSC-Life) 

  • Justin Clark Casey (ELIXIR-EMBL) 

  • Björn Grüning (ELIXIR-DE, Galaxy Europe, EuroScienceGateway) 

  • Nick Juty (Bioschemas, ELIXIR-UK, FAIRplus, BY-COVID, FAIR-IMPACT) 

Abstract: 

RO-Crate is a community effort to practically achieve FAIR packaging of research objects (digital objects like data, methods, software) with structured metadata. RO-Crate uses well-established Web standards and FAIR principles. For common metadata representations, RO-Crate builds on schema.org, a mature and general mark-up vocabulary used by search engines including Google Dataset Search. RO-Crate has been adopted by many EU/EOSC projects as a pragmatic implementation of the FAIR Digital Objects vision.

Bioschemas are a set of opinionated profiles of schema.org types, improving findability of Web resources and data across life sciences, as well as domain-independent scientific objects. Bioschemas are developed in a community process and are deployed by multiple providers.
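As a rough illustration of what such markup can look like, the Python sketch below builds a schema.org Dataset description as JSON-LD and wraps it in the script element typically used to embed it in a Web page. The profile IRI and version, the choice of properties, the function names and all example values are assumptions made for this sketch and are not taken from the workshop material; check bioschemas.org for the current Dataset profile.

```python
import json

# Full IRI of the Dublin Core "conformsTo" property, used to point at a profile.
CONFORMS_TO = "http://purl.org/dc/terms/conformsTo"

# Assumption: profile IRI and version; verify against bioschemas.org.
BIOSCHEMAS_DATASET_PROFILE = "https://bioschemas.org/profiles/Dataset/1.0-RELEASE"


def describe_dataset(url, name, description, keywords, license_url):
    """Return a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": url,
        CONFORMS_TO: {"@id": BIOSCHEMAS_DATASET_PROFILE},
        "name": name,
        "description": description,
        "identifier": url,
        "keywords": keywords,
        "license": license_url,
        "url": url,
    }


def to_script_tag(doc):
    """Wrap JSON-LD in the script element used to embed it in a Web page."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical example values for illustration only.
example = describe_dataset(
    url="https://example.org/datasets/42",
    name="Example protein measurements",
    description="Placeholder description of a small example dataset.",
    keywords=["proteomics", "example"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

print(to_script_tag(example))
```

The resulting snippet would be pasted into the HTML of the dataset's landing page, where harvesters that understand schema.org can pick it up.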

 

For life science researchers and data managers, these approaches are complementary: Bioschemas markup can be used within RO-Crate, and RO-Crate provides a long-term distribution/archive mechanism for datasets annotated with Bioschemas.
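A minimal sketch of this combination, assuming RO-Crate 1.1 conventions: the Python code below builds the content of an ro-crate-metadata.json file whose root dataset carries the kind of schema.org properties a Bioschemas-style description would use (name, description, keywords, license). The helper names, file paths and example values are hypothetical; only the context IRI, the metadata file name and the root dataset identifier "./" follow the RO-Crate 1.1 specification.

```python
import json
from pathlib import Path

RO_CRATE_CONTEXT = "https://w3id.org/ro/crate/1.1/context"
RO_CRATE_SPEC = "https://w3id.org/ro/crate/1.1"
METADATA_FILE = "ro-crate-metadata.json"


def build_crate(name, description, date_published, license_url, keywords, files):
    """Build RO-Crate metadata for a root dataset and its files.

    `files` maps a path relative to the crate root to its media type.
    """
    # The metadata descriptor states which entity is the root dataset.
    descriptor = {
        "@id": METADATA_FILE,
        "@type": "CreativeWork",
        "conformsTo": {"@id": RO_CRATE_SPEC},
        "about": {"@id": "./"},
    }
    # The root dataset describes the package as a whole.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "license": {"@id": license_url},
        "keywords": keywords,
        "hasPart": [{"@id": path} for path in files],
    }
    # One File entity per payload file listed in hasPart.
    file_entities = [
        {"@id": path, "@type": "File", "encodingFormat": media_type}
        for path, media_type in files.items()
    ]
    return {"@context": RO_CRATE_CONTEXT, "@graph": [descriptor, root, *file_entities]}


def write_crate(crate, directory):
    """Write the metadata file into the crate root directory."""
    path = Path(directory) / METADATA_FILE
    path.write_text(json.dumps(crate, indent=2), encoding="utf-8")
    return path


# Hypothetical example values for illustration only.
crate = build_crate(
    name="Example protein measurements",
    description="Placeholder description of a small example dataset.",
    date_published="2023-06-06",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["proteomics", "example"],
    files={"data/measurements.csv": "text/csv", "README.md": "text/markdown"},
)
```

Calling write_crate(crate, some_directory) on a folder that already holds the listed files would turn that folder into a crate that can be zipped and deposited in an archive.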

Both communities are focusing on further tooling on the consumption side: building knowledge graphs, validators and simplifying machine-actionable FAIR resources for researcher-driven workflows. Both approaches can combine general and domain-specific metadata, forming profiles of existing ontologies or ad-hoc vocabularies to further describe resources.

 

In this workshop, focused on data/metadata resources, we will provide an overview of RO-Crate and Bioschemas showing how these lightweight approaches to FAIR data publishing can be used by ELIXIR data providers and consumers, illustrated by how these methods are already used by ELIXIR members and European projects.

In the tutorial part we will work through a running example, showing the thought process of applying the Bioschemas/RO-Crate recommendations to FAIRify the data, using a “just enough” metadata mentality.

We will show how domain experts can expand descriptions. We will include generic and domain-specific use cases, such as the minimal metadata model for synthetic data, created by members of ELIXIR-ML and ELIXIR-HD Focus Groups.

In the second half of the workshop we open the floor for discussions and gather feedback from early adopters on what ELIXIR users need to publish their outputs as FAIR Research Objects, helping future work on Bioschemas and RO-Crate to address any shortcomings in terms of tooling, training and documentation.

The workshop organisers represent the ELIXIR Tools, Compute, Data, and Interoperability platforms. The workshop is relevant to many projects and community-driven efforts aligned with ELIXIR (e.g. EOSC-Life, BY-COVID, EOSC4Cancer, EJP RD). Bring-your-own-Data workshops with RO-Crate and Bioschemas are being planned for Q2/Q3 2023 to go into more depth.

Agenda

Time (IST)

Tue 2023-06-06

11:00

Overview of FAIR data publishing with Bioschemas & RO-Crate

Speakers: Stian Soiland-Reyes, Leyla Jael Castro

https://f1000research.com/slides/12-618

Source files at https://doi.org/10.5281/zenodo.10552449

11:15

A very brief introduction to making metadata with JSON-LD

Speaker: Stian Soiland-Reyes

https://f1000research.com/slides/12-619 

Source files at https://doi.org/10.5281/zenodo.10552449

11:20

Tutorial: FAIRify a dataset using just enough metadata

Speaker: Leyla Jael Castro
https://f1000research.com/slides/12-623 

Source files at https://doi.org/10.5281/zenodo.10552449

11:30

Tutorial: Packaging a dataset with its metadata as a RO-Crate

Speaker: Stian Soiland-Reyes

Link to material

11:40

Use case: Building a minimal metadata model for synthetic data

Speaker: Núria Queralt Rosinach
https://f1000research.com/slides/12-622 

11:50

Open discussion, feedback and requirements from early adopters 

Moderator: Leyla Jael Castro

12:20

Wrap-up and next steps

Lead: Stian Soiland-Reyes

 

Files

Building lightweight FAIR data packages with Bioschemas and RO-Crate.pdf

Files (327.0 kB)

Additional details

Funding

Deutsche Forschungsgemeinschaft
NFDI4DS - NFDI for Data Science and Artificial Intelligence 460234259
European Commission
BY-COVID – Beyond COVID 101046203