Dataset Open Access

Data associated with "Developing a standardized but extendable framework to increase the findability of infectious disease datasets"

Tsueng, Ginger; Alvarado Cano, Marco A.; Bento, José; Czech, Candice; Pache, Lars; Savidge, Tor C.; Starren, Justin; Rasmussen, Luke V.; Kang, Mengjia (Marjorie); Wu, Qinglong; Xin, Jiwen; Zhou, Xinghua; Su, Andrew I.; Wu, Chunlei; Brown, Liliana; Shabman, Reed S.; Hughes, Laura D.


Includes:

  • NIAID Dataset schema
  • NIAID ComputationalTool schema
  • Crosswalk between NIAID schemas and common schemas
  • Survey of Schema.org-compliant repositories


The open access movement and scientific reproducibility concerns have led the biomedical research community to embrace efforts to make scientific datasets openly accessible. While many datasets are now available, there are still challenges in ensuring that they are Findable, Accessible, Interoperable, and Reusable (FAIR). To improve the FAIRness of datasets, we evaluated dataset repositories for compliance with Schema.org standards – a collection of standards developed to increase metadata searchability across the internet. Adoption of the Schema.org Dataset standard was highly variable in biomedical research datasets, and the standard omitted many desirable metadata fields. We customized the Schema.org Dataset standard to catalog datasets collected across a Systems Biology research consortium consisting of 15 Centers. We developed a reusable process for creating a schema which is interoperable with other standards, but still extendable and customizable to a particular context. Here, we describe our process along with the associated gains in FAIRness, and discuss ongoing challenges with dataset discoverability – the first step to ensure that the vast amount of open data published by the research community is reused to its maximum value.
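As a rough illustration of what Schema.org Dataset metadata looks like, the sketch below builds a minimal JSON-LD record from this record's own title, two of its authors, and one of its files. It uses only generic Schema.org properties (`name`, `creator`, `isAccessibleForFree`, `distribution`); it is not the NIAID Dataset schema, whose fields are defined in the files listed on this page. The helper name `build_dataset_record` is hypothetical.

```python
import json


def build_dataset_record(name, creators, file_names):
    """Assemble a minimal Schema.org Dataset record as a JSON-LD dictionary.

    Hypothetical helper for illustration; the NIAID schema adds and
    constrains fields beyond the generic ones used here.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "creator": [{"@type": "Person", "name": c} for c in creators],
        "isAccessibleForFree": True,
        "distribution": [
            {"@type": "DataDownload", "name": f} for f in file_names
        ],
    }


record = build_dataset_record(
    name=(
        'Data associated with "Developing a standardized but extendable '
        'framework to increase the findability of infectious disease datasets"'
    ),
    # Author list shortened to first and last author for brevity.
    creators=["Tsueng, Ginger", "Hughes, Laura D."],
    file_names=["NIAID_schema.json"],
)

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
```

Embedding a record like this in a landing page is what allows search engines that understand Schema.org to index the dataset.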

This work was supported in part by the National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH) grants U01 AI124290 (Baylor: TS, QW), U01 AI124302 (Boston College: JB), U19 AI135995 (Scripps Research: MC, LDH, GT, AIS, CW, JX, XZ), U19 AI135964 (Northwestern: MK, LVR, JS), U19 AI135972 (Sanford Burnham Prebys: LP); National Center for Advancing Translational Sciences NIH grant U24 TR002306 (Scripps Research: MC, LDH, GT, AIS, CW, JX, XZ); and National Institute of General Medical Sciences grant R01 GM083924 (Scripps Research: MC, GT, AIS, CW, JX, XZ). We acknowledge the NIAID/DMID Systems Biology Consortium for Infectious Diseases Data Dissemination Working Group for developing the NIAID schemas, registering center-created datasets and computational tools, and providing critical feedback on the manuscript. We thank Reed Shabman for his leadership within the Data Dissemination Working Group, for coordinating with centers to register datasets and tools, and for his helpful comments and careful revisions of the paper. We additionally thank Liliana Brown for supporting the Program from which this paper originated.
Files (642.4 kB)
Name (Size)
File 1 - Schema.org-compliant Open Data repositories.xlsx (181.1 kB)
md5:e1d2a475b2f40290183aea30721e4c3f
File 2 - NIAID Systems Biology Dataset Schema and Crosswalk.pdf (83.3 kB)
md5:4668dfb9de81d576f6474cf9778ab24e
File 2 - NIAID Systems Biology Dataset Schema and Crosswalk.xlsx (81.4 kB)
md5:cf57aa3e6f16b857b39ffedf24cf7055
File 3 - Extended Dataset schema crosswalk to common schemas.xlsx (53.9 kB)
md5:cff4cd8be270cf3bda0c5607b42a85fa
File 4 - NIAID Systems Biology ComputationalTool Schema and Crosswalk.pdf (67.3 kB)
md5:45f1760368f9f2bb08b83114c4b58e00
File 4 - NIAID Systems Biology ComputationalTool Schema and Crosswalk.xlsx (56.8 kB)
md5:61f43967a862497fea90e46b25dc8684
File 5 - Overview of datasets available in the DDE.pdf (75.5 kB)
md5:63140548827df9730fc3ba3a24224298
File 5 - Overview of datasets available in the DDE.xlsx (11.1 kB)
md5:708851146df2be0cafaef70b2bca7674
NIAID_schema.json (32.0 kB)
md5:30588dd9b5976d5fc619d636b31cad92
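
The md5 checksums in the file listing above can be used to confirm that downloads are intact. The following is a minimal sketch, assuming the files were saved under their listed names in a local directory; the directory name `downloads` and the helper names `md5_of` and `verify_downloads` are hypothetical.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "File 1 - Schema.org-compliant Open Data repositories.xlsx": "e1d2a475b2f40290183aea30721e4c3f",
    "File 2 - NIAID Systems Biology Dataset Schema and Crosswalk.pdf": "4668dfb9de81d576f6474cf9778ab24e",
    "File 2 - NIAID Systems Biology Dataset Schema and Crosswalk.xlsx": "cf57aa3e6f16b857b39ffedf24cf7055",
    "File 3 - Extended Dataset schema crosswalk to common schemas.xlsx": "cff4cd8be270cf3bda0c5607b42a85fa",
    "File 4 - NIAID Systems Biology ComputationalTool Schema and Crosswalk.pdf": "45f1760368f9f2bb08b83114c4b58e00",
    "File 4 - NIAID Systems Biology ComputationalTool Schema and Crosswalk.xlsx": "61f43967a862497fea90e46b25dc8684",
    "File 5 - Overview of datasets available in the DDE.pdf": "63140548827df9730fc3ba3a24224298",
    "File 5 - Overview of datasets available in the DDE.xlsx": "708851146df2be0cafaef70b2bca7674",
    "NIAID_schema.json": "30588dd9b5976d5fc619d636b31cad92",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == checksum:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # "downloads" is an assumed location; change it to wherever the files live.
    for name, status in verify_downloads("downloads").items():
        print(f"{status:9} {name}")
```

A status of `mismatch` suggests a corrupted or altered download, and the file should be fetched again.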