Publishing Large Datasets: Experiences with Data Scaling and Hybrid Model Publication
Authors/Creators
Description
The publication of large datasets poses significant challenges for research data management, particularly regarding storage, access, citability, and adherence to the FAIR principles. We present our experience publishing a 25 TB dataset from the Infrared Atmospheric Sounding Interferometer (IASI), covering October 2014 to June 2019 (MUSICA IASI v3.2.1), using a hybrid approach. A representative one-day subset was deposited in the institutional repository RADAR4KIT with a DOI, while the complete dataset is archived at the Large Scale Data Facility (LSDF), provided by the Scientific Computing Center (SCC) at KIT, and is accessible via a THREDDS Data Server (TDS) hosted by IMK-ASF at KIT. We describe the workflow, metadata harmonisation, persistent linking strategy, and lessons learned from this publication process.
Files
| Name | Size | Checksum |
|---|---|---|
| kerzenmacher_ditrare_20251202_new.pdf | 968.8 kB | md5:12791593563f2a26d1f62271dc3cdb2b |
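The MD5 checksum listed for the file can be used to confirm that a downloaded copy is intact. The following is a minimal sketch in Python, assuming the PDF has been saved locally under its original file name; the helper names `md5_of_file` and `matches_checksum` are illustrative and are not part of any repository tooling.

```python
import hashlib

# Checksum as listed in the file table of this record.
EXPECTED_MD5 = "12791593563f2a26d1f62271dc3cdb2b"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with an expected value.

    Accepts the expected value with or without the 'md5:' prefix
    used in the record's file listing.
    """
    expected_hex = expected.lower().removeprefix("md5:")
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("kerzenmacher_ditrare_20251202_new.pdf", EXPECTED_MD5)` should return `True` for an unmodified download. Note that `str.removeprefix` requires Python 3.9 or later. The same chunked-read pattern would apply to much larger files, although MD5 is suitable only for detecting accidental corruption, not deliberate tampering.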
Additional details
Related works
- Cites
  - Poster: 10.5281/zenodo.17222395 (DOI)
  - Journal article: 10.5194/essd-14-709-2022 (DOI)
  - Dataset: 10.35097/408 (DOI)