Published December 4, 2025 | Version v1

Publishing Large Datasets: Experiences with Data Scaling and Hybrid Model Publication

  • 1. Karlsruhe Institute of Technology
  • 2. FIZ Karlsruhe – Leibniz Institute for Information Infrastructure

Description

Publishing large datasets poses significant challenges for research data management, particularly with respect to storage, access, citability, and adherence to the FAIR principles. We report on our experience publishing a 25 TB dataset from the Infrared Atmospheric Sounding Interferometer (IASI), MUSICA IASI v3.2.1, covering October 2014 to June 2019, using a hybrid approach. A representative one-day subset was deposited with a DOI in the institutional repository RADAR4KIT, while the complete dataset is archived at the Large Scale Data Facility (LSDF), provided by the Scientific Computing Center (SCC) at KIT, and is accessible via a THREDDS Data Server (TDS) hosted by IMK-ASF at KIT. We describe the workflow, the metadata harmonisation, the persistent linking strategy, and the lessons learned from this publication process.

Files

kerzenmacher_ditrare_20251202_new.pdf (968.8 kB)
md5:12791593563f2a26d1f62271dc3cdb2b
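
The MD5 checksum listed above can be used to confirm that a downloaded copy of the PDF is intact. The following is a minimal sketch, assuming the file has been saved locally under its original name. The helper names `file_md5` and `matches_record` are illustrative and not part of any repository tooling.

```python
import hashlib

# Checksum and file name as listed in the Files section of this record.
EXPECTED_MD5 = "12791593563f2a26d1f62271dc3cdb2b"
FILE_NAME = "kerzenmacher_ditrare_20251202_new.pdf"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """True if the local file's MD5 equals the checksum from the record."""
    return file_md5(path) == expected.lower()
```

Calling `matches_record(FILE_NAME)` from the directory that holds the download should return `True` for an unmodified copy. MD5 is suitable here only for detecting accidental corruption during transfer, not for detecting deliberate tampering.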

Additional details

Related works

Cites
Poster: 10.5281/zenodo.17222395 (DOI)
Journal article: 10.5194/essd-14-709-2022 (DOI)
Dataset: 10.35097/408 (DOI)

Funding

Leibniz Association
Leibniz Science Campus "Digital Transformation of Research" (DiTraRe) W74/2022