Organizing Open Data for DESY, HIFIS, NFDI and EOSC
Authors/Creators
Description
DESY is expanding its Open Data infrastructure with a bundle of tools that enables FAIR publication of data for a wide range of scientific communities. For reference, DESY offers over 200 experimental techniques at 21 synchrotron beamlines, which were used by over 3000 scientists in 2019. The data produced are generally under the control of the respective experiment's principal investigators and are kept in the facility's storage systems for later re-use once the so-called embargo period has passed. During the embargo period, the scientists who conducted the experiment hold exclusive exploitation rights to the data. Funding agencies and scientific journals have policies that call for publication of scientific data under the FAIR principles, but such publication is not yet as common or as extensive as would be necessary to make use of the obtained results.

To underline the importance we attach to open and FAIR data publication, we deployed a metadata catalogue that delivers metadata in both human- and machine-readable formats, including a structured description of the experimental parameters for searchability, as well as concrete data locations. The latter makes the data easy to access, either by downloading it or by viewing it through additionally provided web services for data exploration, so that scientists can check the data's usefulness before having to download potentially large datasets.

The metadata in the catalogue is organized by providing schema building blocks to the scientific communities. These allow the communities to seamlessly integrate their community-specific data descriptions into the catalogue schema, while additional curation by the communities themselves ensures that the catalogue contents are up to standard.
Delivering these building blocks and supporting the construction of fully functional metadata schemata, which moreover enable a substantial increase in metadata quality through automated validation against the schemata, is one of the main aspects described in the talk.

Finally, the organizational and functional dependencies of all other deployed services, including DOI minting, are described in an architectural overview. It is offered as a blueprint to all institutions interested in establishing this suite of services themselves in order to create added value for their scientists.
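To make the idea of schema building blocks and automated validation more concrete, the following is a minimal sketch in plain Python. It is not the catalogue's actual implementation: the block names (`CORE_BLOCK`, `BEAMLINE_BLOCK`), the field names, and the helper functions `compose_schema` and `validate` are all hypothetical and chosen for illustration only. A real deployment would more likely rely on an established schema language than on this simplified field-to-type mapping.

```python
from typing import Any

# Hypothetical building blocks: each maps a metadata field to its expected type.
# Field names are illustrative assumptions, not taken from the DESY catalogue.
CORE_BLOCK = {"title": str, "doi": str, "data_location": str}
BEAMLINE_BLOCK = {"beamline": str, "photon_energy_keV": float}


def compose_schema(*blocks: dict[str, type]) -> dict[str, type]:
    """Merge building blocks into one schema; reject conflicting field definitions."""
    schema: dict[str, type] = {}
    for block in blocks:
        for field, expected in block.items():
            if field in schema and schema[field] is not expected:
                raise ValueError(f"conflicting definitions for field {field!r}")
            schema[field] = expected
    return schema


def validate(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    for field in record:
        if field not in schema:
            problems.append(f"unknown field: {field}")
    return problems


# A community combines the shared core block with its own block.
schema = compose_schema(CORE_BLOCK, BEAMLINE_BLOCK)
```

In this sketch, a community-specific block is merged with a shared core block, and every incoming record is checked against the combined schema before it would be accepted. Returning a list of problems rather than raising on the first error lets a curator see all issues in a record at once.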
Files
T28_Wetzel-OrganizingOpenDataAtDESY-HMC-14052025.pdf
Total size: 21.4 MB

| Checksum | Size |
|---|---|
| md5:e9b5f2a6d26e9f81620b7bce9ea7fea7 | 5.9 MB |
| md5:299b792f5ba57f354b54f4834024999c | 15.5 MB |
Additional details
Dates
- Available: 2025-05-14 (day of the presentation)