Published December 19, 2023 | Version v1
Dataset Open

Towards metadata for machine learning - Crosswalk tables

  • 1. ZB MED Information Centre for Life Sciences
  • 2. NFDI4DataScience
  • 3. European Molecular Biology Laboratory (EMBL) Heidelberg
  • 4. Fraunhofer Institute for Open Communication Systems
  • 5. Technical University of Berlin
  • 6. Free University of Berlin
  • 7. Universidad Politécnica de Madrid
  • 8. Institute of Applied Biosciences, Centre for Research and Technology Hellas
  • 9. Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg
  • 10. AI4Life
  • 11. Department of Medical Informatics, Institute for Community Medicine, University Medicine Greifswald

Description

Crosswalks for Machine Learning models and datasets used for training

Here we present a collection of crosswalks for ML models and for the datasets used to train them, each provided in TSV and XLSX formats. These crosswalks were created during an NFDI4DataScience hackathon organized by the Semantic Technologies team (SemTec) at ZB MED Information Centre for Life Sciences (ZB MED), with the aim of providing a starting point for a common proposal towards a metadata schema for ML models based on schema.org.

Files

  • 2023.11.23 Metadata for ML - ML dataset Union.tsv: Crosswalks for datasets in TSV format
  • 2023.11.23 Metadata for ML - ML dataset Union.xlsx: Crosswalks for datasets in XLSX format
  • 2023.11.23 Metadata for ML - ML model union.tsv: Crosswalks for ML models in TSV format
  • 2023.11.23 Metadata for ML - ML model union.xlsx: Crosswalks for ML models in XLSX format
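As a minimal sketch of how one of the TSV crosswalks could be inspected, the snippet below reads a tab-separated file with Python's standard library. The function name `load_crosswalk`, the UTF-8 encoding, and the idea that the first row holds the column headers are assumptions for illustration; the record does not describe the internal layout of the files.

```python
import csv
from pathlib import Path


def load_crosswalk(path):
    """Read a tab-separated file and return (header, rows).

    Assumption: the first row contains the column headers.
    Rows are returned as plain lists of strings, so blank or
    repeated header cells do not cause any values to be lost.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])  # empty list if the file is empty
        rows = list(reader)
    return header, rows


if __name__ == "__main__":
    # File name taken from the list above; it must be downloaded first.
    tsv_path = Path("2023.11.23 Metadata for ML - ML model union.tsv")
    if tsv_path.exists():
        header, rows = load_crosswalk(tsv_path)
        print(f"{len(header)} columns, {len(rows)} data rows")
        print(header)
    else:
        print(f"{tsv_path} not found; download it from the record first.")
```

Returning rows as lists rather than dictionaries is a deliberate choice here, since the column names of the crosswalks are not known in advance.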

Files (395.5 kB)

  • 2023.11.23 Metadata for ML - ML dataset Union.tsv, 6.7 kB, md5:aaada349c2b0d43cf78893b90e257409
  • 2023.11.23 Metadata for ML - ML dataset Union.xlsx, 135.3 kB, md5:6e1ce29bbe711618800c0614df0eaad3
  • 2023.11.23 Metadata for ML - ML model union.tsv, 8.0 kB, md5:dad768f554836c899bb3bdd9a954964b
  • 2023.11.23 Metadata for ML - ML model union.xlsx, 133.4 kB, md5:798533b2c955a6096c4467a04ee26c92
  • 2023.12 Machine Learning metadata.pdf, 112.1 kB, md5:1f4c7aaed33aa86bf7eff507bfa5bc75
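The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch, assuming the files have been saved to a local folder; the helper names `md5_of_file` and `check_downloads` and the folder name `downloads` are hypothetical. It compares each local file against the set of listed checksums rather than assuming which checksum belongs to which file.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
LISTED_MD5 = {
    "aaada349c2b0d43cf78893b90e257409",
    "6e1ce29bbe711618800c0614df0eaad3",
    "dad768f554836c899bb3bdd9a954964b",
    "798533b2c955a6096c4467a04ee26c92",
    "1f4c7aaed33aa86bf7eff507bfa5bc75",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(folder, listed=LISTED_MD5):
    """Map each file name in `folder` to True if its MD5 is listed."""
    return {
        path.name: md5_of_file(path) in listed
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
    }


if __name__ == "__main__":
    download_dir = Path("downloads")  # hypothetical local folder
    if download_dir.is_dir():
        for name, ok in check_downloads(download_dir).items():
            print(f"{'OK      ' if ok else 'MISMATCH'} {name}")
```

A file reported as a mismatch is either corrupted, modified, or not part of this record.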

Additional details

Funding

NFDI4DS - NFDI for Data Science and Artificial Intelligence 460234259
Deutsche Forschungsgemeinschaft

References

  • Mitchell M, Wu S, Zaldivar A, Barnes P, Vasserman L, Hutchinson B, et al. Model Cards for Model Reporting. Proceedings of the Conference on Fairness, Accountability, and Transparency. 2019. pp. 220–229. doi:10.1145/3287560.3287596
  • Gray A, Castro LJ, Juty N, Goble C. Schema.org for Scientific Data. Artificial Intelligence for Science. World Scientific; 2022. pp. 495–514. doi:10.1142/9789811265679_0027
  • Guha RV, Brickley D, Macbeth S. Schema.org: evolution of structured data on the web. Commun ACM. 2016;59: 44–51. doi:10.1145/2844544
  • Walsh I, Fishman D, Garcia-Gasulla D, Titma T, Pollastri G, Capriotti E, et al. DOME: recommendations for supervised machine learning validation in biology. Nature Methods. 2021. doi:10.1038/s41592-021-01205-4