Published February 24, 2025 | Version v1
Report Open

Overview of X-Ray Absorption Spectroscopy standards, vocabularies (and ontologies), data formats and practices

  • 1. Cardiff University
  • 2. Rutherford Appleton Laboratory
  • 3. Science and Technology Facilities Council
  • 4. Helmholtz-Zentrum Berlin für Materialien und Energie
  • 5. CODATA (Committee on Data of the International Science Council)
  • 6. Committee on Data of the International Science Council

Description

X-ray Absorption Spectroscopy (XAS) research has expanded to become a set of widely used scientific methods with applications across Physics, Chemistry, Surface Science, Nanoscale Science, Biology, and Environmental and Earth Sciences. Over time, various scientific communities, research facilities, device providers, and software developers have created different formats and applications to store XAS data and describe it with metadata. These custom data formats serve their specific purposes, but they are not easily integrated or interoperable. Consequently, reusing XAS data from diverse sources—whether for further research, AI training, or the reproduction and replication of results—can be challenging. This is due to the diverse ways in which data is presented across domains and the limited availability of XAS data and metadata in public repositories. Using or combining these datasets often requires expert intervention and manual steps to map and process the data. Establishing commonly accepted standards for publishing XAS data is the first step in addressing these challenges.

This document provides a landscape analysis of current practices, standards, vocabularies, ontologies, schemas, and data formats used in the generation, curation, and publishing of XAS data, with a particular focus on efforts to create community standards that facilitate interoperability. We begin with an overview of current XAS techniques and their areas of application. Next, we discuss the development of custom formats for storing data and efforts to produce analysis techniques that are applicable regardless of data origin. We then describe current efforts to generate standards that facilitate data interchange and integration, including various initiatives from research consortia aimed at creating interoperable formats, ontologies, and vocabularies. We conclude by observing the emerging consensus around using NXxas for multi-spectrum raw and processed data and XDI for single-spectrum data.

This landscape analysis is part of the CDIF-4-XAS project, which aims to pilot an improved model for XAS data interoperability and reusability across scientific disciplines through the Cross Domain Interoperability Framework (CDIF), developed by CODATA as part of the WorldFAIR project. The CDIF-4-XAS project seeks to enable seamless integration of XAS data into data catalogues and analysis frameworks in a universally interoperable manner, making the data easier to reuse, compare, incorporate into larger studies, or use for training AI applications. We therefore close by introducing the next phase of work, which will use CDIF to develop interoperability profiles that enable the integration and mapping of NXxas and XDI data: the CDIF-4-XAS interoperability model.

Files

CDIF-4-XAS-D1-Landscape_Analysis_Report.pdf (5.2 MB)
md5:915df782bd00d0ef0cd9bb6963c54b8f

Additional details

Funding

European Commission
OSCARS - O.S.C.A.R.S. - Open Science Clusters’ Action for Research and Society (grant 101129751)

Dates

Issued
2025-03-03