There is a newer version of the record available.

Published August 29, 2022 | Version 1.0.0
Dataset Open

Data associated with "Developing a standardized but extendable framework to increase the findability of infectious disease datasets"

  • 1. Scripps Research
  • 2. Boston College
  • 3. Sanford Burnham Prebys Medical Discovery Institute
  • 4. Baylor College of Medicine
  • 5. Northwestern University Feinberg School of Medicine
  • 6. National Institute of Allergy and Infectious Diseases

Description

Data associated with "Developing a standardized but extendable framework to increase the findability of infectious disease datasets"

Includes:

  • NIAID Dataset schema
  • NIAID ComputationalTool schema
  • Crosswalk between NIAID schemas and common schemas
  • Survey of Schema.org-compliant repositories


The open access movement and scientific reproducibility concerns have led the biomedical research community to embrace efforts to make scientific datasets openly accessible. While many datasets are now available, there are still challenges in ensuring that they are Findable, Accessible, Interoperable, and Reusable (FAIR). To improve the FAIRness of datasets, we evaluated dataset repositories for compliance with Schema.org standards – a collection of standards developed to increase metadata searchability across the internet. Adoption of the Schema.org Dataset standard was highly variable in biomedical research datasets, and the standard omitted many desirable metadata fields. We customized the Schema.org Dataset standard to catalog datasets collected across a Systems Biology research consortium consisting of 15 Centers. We developed a reusable process for creating a schema which is interoperable with other standards, but still extendable and customizable to a particular context. Here, we describe our process along with the associated gains in FAIRness, and discuss ongoing challenges with dataset discoverability – the first step to ensure that the vast amount of open data published by the research community is reused to its maximum value.
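As a rough illustration of what checking a record against Schema.org-style Dataset metadata could look like, the sketch below builds a small JSON-LD record from values shown on this page and reports which fields are empty. The property names are real Schema.org properties, but the list of fields to check (`ILLUSTRATIVE_FIELDS`) and the helper `missing_fields` are hypothetical and are not taken from the NIAID schemas. Mapping the "Open" label to `isAccessibleForFree` is likewise an assumption.

```python
import json

# Hypothetical field list for illustration only; it does NOT reproduce
# the required or recommended fields of the NIAID Dataset schema.
ILLUSTRATIVE_FIELDS = [
    "name", "description", "datePublished", "version", "license", "creator",
]


def missing_fields(record, fields=ILLUSTRATIVE_FIELDS):
    """Return the listed fields that are absent or empty in a metadata record."""
    return [field for field in fields if not record.get(field)]


# A minimal Schema.org Dataset record using values visible on this page.
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": (
        'Data associated with "Developing a standardized but extendable '
        'framework to increase the findability of infectious disease datasets"'
    ),
    "datePublished": "2022-08-29",
    "version": "1.0.0",
    "isAccessibleForFree": True,  # assumed mapping of the "Open" access label
}

print(json.dumps(record, indent=2))
print(missing_fields(record))  # ['description', 'license', 'creator']
```

A real compliance survey would presumably validate against the published schema files rather than a hand-written field list like this one.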

Notes

This work was supported in part by the National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH) grants U01 AI124290 (Baylor: TS, QW), U01 AI124302 (Boston College: JB), U19 AI135995 (Scripps Research: MC, LDH, GT, AIS, CW, JX, XZ), U19 AI135964 (Northwestern: MK, LVR, JS), and U19 AI135972 (Sanford Burnham Prebys: LP); National Center for Advancing Translational Sciences NIH grant U24 TR002306 (Scripps Research: MC, LDH, GT, AIS, CW, JX, XZ); and National Institute of General Medical Sciences grant R01 GM083924 (Scripps Research: MC, GT, AIS, CW, JX, XZ). We acknowledge the NIAID/DMID Systems Biology Consortium for Infectious Diseases Data Dissemination Working Group for developing the NIAID schemas, registering center-created datasets and computational tools, and providing critical feedback on the manuscript. We thank Reed Shabman for his leadership within the Data Dissemination Working Group, for coordinating with centers to register datasets and tools, and for his helpful comments and careful revisions of the paper. We additionally thank Liliana Brown for supporting the Program from which this paper originated.

Files

NIAID_schema.json

Files (384.3 kB)

Checksum and size of each file:

  • md5:30588dd9b5976d5fc619d636b31cad92 (32.0 kB)
  • md5:933e16a9b709545cc08a2dc2fd9eece2 (53.9 kB)
  • md5:1816268d17ce26970eab33305220dc79 (181.1 kB)
  • md5:e65c24db4c874c14825d289991f24b58 (60.5 kB)
  • md5:61f43967a862497fea90e46b25dc8684 (56.8 kB)
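
The md5 values listed for the files can be used to confirm that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because this listing does not pair checksums with file names, the path passed in is whatever local file you want to compare against one of the listed values.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:30588dd9b5976d5fc619d636b31cad92")` returns `True` only if the file at `local_path` has that digest. MD5 is adequate here for detecting accidental corruption, not for security guarantees.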