AllShowers Dataset
Description
This dataset consists of simulated calorimeter shower data for training and evaluating fast generative surrogate models for detector simulation in high-energy physics. It covers a range of incident particle types, energies, and angles. All showers were simulated on the Maxwell HPC cluster at DESY using the Geant4 toolkit and the DD4hep framework with the ILD detector description in k4geo. The International Large Detector (ILD), a proposed detector for the International Linear Collider (ILC), serves as an example of a modern particle physics detector with high-granularity calorimeters.
The dataset consists of 2,000,000 simulated showers originating from 12 different particle types (electrons, positrons, photons, charged pions, charged kaons, K0L, (anti)neutrons, and (anti)protons) with energies uniformly distributed between 5 and 126 GeV. Data are provided as point clouds, where each point represents an energy deposition (Geant4 step) in an active calorimeter layer, characterized by its 2D spatial coordinates within the layer, its layer index, and its deposited energy. To keep the number of points manageable, energy depositions are clustered into a grid that is nine times finer than the actual calorimeter readout cell size. Points have been shifted to reduce the effect of different incident angles. For more details on the dataset generation and structure, please refer to the AllShowers paper.
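The clustering step described above can be illustrated with a minimal sketch. It is not the procedure used to produce the dataset: the function name `cluster_steps`, the reading of "nine times finer" as 3 x 3 sub-cells per readout cell, and the placement of each point at the sub-cell centre are all assumptions made for illustration. The AllShowers paper documents the actual procedure.

```python
import math
from collections import defaultdict


def cluster_steps(steps, cell_size, subdivisions=3):
    """Sum step energies that fall into the same sub-cell of the same layer.

    steps: iterable of (x, y, layer, energy) tuples.
    cell_size: readout cell edge length, in the same units as x and y.
    subdivisions: sub-cells per cell edge; 3 gives 3 x 3 = 9 sub-cells
        per readout cell (an assumption, see the text above).
    Returns a list of (x_center, y_center, layer, energy) points.
    """
    sub = cell_size / subdivisions
    summed = defaultdict(float)
    for x, y, layer, energy in steps:
        # Integer sub-cell indices identify the grid bin of this step.
        key = (layer, math.floor(x / sub), math.floor(y / sub))
        summed[key] += energy
    # Place each clustered point at the centre of its sub-cell (assumed).
    return [
        ((ix + 0.5) * sub, (iy + 0.5) * sub, layer, energy)
        for (layer, ix, iy), energy in sorted(summed.items())
    ]


# Four toy steps: the first two share a sub-cell and are merged.
steps = [
    (0.1, 0.2, 0, 1.0),
    (0.4, 0.3, 0, 2.0),
    (2.5, 0.2, 0, 0.5),
    (0.1, 0.2, 1, 4.0),
]
points = cluster_steps(steps, cell_size=3.0)
```

Clustering of this kind reduces the number of points while conserving the total deposited energy, which is why the summed energy of `points` equals that of `steps`.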
The data are stored in HDF5 format without padding. A collection of API functions and command-line tools for handling the dataset can be found in the ShowerData repository. The layer_level.h5 file can be used when only layer-level information is required, such as the number of points and the deposited energy per layer.
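To make "without padding" and "layer-level information" concrete, here is a small sketch in plain Python. It assumes a hypothetical layout in which all points are stored back to back together with a per-shower point count; the actual layout of the HDF5 files may differ and is defined by the ShowerData repository. The helper names `split_showers` and `layer_summary` are illustrative and are not part of that repository.

```python
def split_showers(flat_points, points_per_shower):
    """Recover per-shower point lists from a flat, unpadded sequence."""
    showers, start = [], 0
    for n in points_per_shower:
        showers.append(flat_points[start:start + n])
        start += n
    if start != len(flat_points):
        raise ValueError("counts do not add up to the number of stored points")
    return showers


def layer_summary(shower, num_layers):
    """Return (points per layer, deposited energy per layer) for one shower."""
    counts = [0] * num_layers
    energies = [0.0] * num_layers
    for _x, _y, layer, energy in shower:
        counts[layer] += 1
        energies[layer] += energy
    return counts, energies


# Two toy showers with 3 and 1 points, stored without padding.
# Each point is (x, y, layer, energy).
flat = [
    (0.5, 0.5, 0, 3.0),
    (2.5, 0.5, 0, 0.5),
    (0.5, 0.5, 1, 4.0),
    (1.5, 1.5, 2, 1.0),
]
showers = split_showers(flat, [3, 1])
counts, energies = layer_summary(showers[0], num_layers=3)
```

Storing only the points that exist, rather than padding every shower to the length of the largest one, avoids wasting space when shower sizes vary widely. The per-layer counts and energies computed here correspond to the kind of quantities the text attributes to layer_level.h5.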
Files
(78.6 GB)
| MD5 checksum | Size |
|---|---|
| 9ab2374369d5058e74654cf03f9f1644 | 77.3 GB |
| 438c9ae0731cc79cf377fa3a49d72437 | 1.3 GB |
Additional details
Related works
- Is supplement to: Dataset: arXiv:2601.11716 (arXiv)
Software
- Repository URL: https://github.com/FLC-QU-hep/ShowerData
- Programming language: Python
- Development status: Active
References
- M. Frank, F. Gaede, C. Grefe and P. Mato: DD4hep: A detector description toolkit for high energy physics experiments, Journal of Physics: Conference Series 513(2), 022010 (2014) https://doi.org/10.1088/1742-6596/513/2/022010
- The Geant4 Collaboration: Geant4 – a simulation toolkit, Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 506(3), 250 (2003) https://doi.org/10.1016/S0168-9002(03)01368-8
- Shaojun Lu, Frank Gaede, Andre Sailer, Dan Protopopescu et al.: key4hep/k4geo (v00-24), Zenodo (2025) https://doi.org/10.5281/zenodo.17726103