EXPLORE: A Scalable Infrastructure for LHC Open Data Analysis and FAIR Data Provisioning
Description
EXPLORE is a research data infrastructure deployed at the GoeGrid Compute Resource Center at the University of Göttingen, as part of the PUNCH4NFDI project. Funded by the German Research Foundation (DFG), PUNCH4NFDI aims to establish FAIR (Findable, Accessible, Interoperable, and Reusable) data management solutions and provide dynamically allocated compute resources for multiple physics communities.
The service integrates up to 200 CPU cores and operates on an HTCondor Overlay Batch System (OBS). A dedicated login node hosts both the Central Manager and the Submitter, coordinating job submissions and scheduling. Compute resources are provided through dynamically integrated virtual worker nodes, each equipped with 8 CPU cores, enabling flexible and efficient job execution.
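The figures above imply a simple sizing rule: with 8 cores per virtual worker node and a ceiling of 200 cores, the pool can hold at most 25 worker nodes. A minimal Python sketch of that arithmetic follows; the function name `worker_nodes_needed` is illustrative and not part of EXPLORE or HTCondor.

```python
import math

CORES_PER_NODE = 8    # from the text: each virtual worker node has 8 CPU cores
MAX_POOL_CORES = 200  # from the text: the service integrates up to 200 CPU cores


def worker_nodes_needed(requested_cores: int) -> int:
    """Return how many 8-core worker nodes cover a core request.

    The result is capped at the number of nodes that fit in the pool limit.
    This is an illustrative helper, not an EXPLORE or HTCondor function.
    """
    if requested_cores < 0:
        raise ValueError("requested_cores must be non-negative")
    max_nodes = MAX_POOL_CORES // CORES_PER_NODE
    # Round up: a partially used node still has to be started as a whole.
    return min(math.ceil(requested_cores / CORES_PER_NODE), max_nodes)


if __name__ == "__main__":
    for cores in (0, 1, 8, 9, 200, 500):
        print(f"{cores:>3} cores requested -> {worker_nodes_needed(cores)} worker node(s)")
```

Rounding up reflects that worker nodes are integrated as whole 8-core units, so a request for 9 cores needs two nodes.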
To optimize resource allocation in real time, EXPLORE uses COBalD/TARDIS, a modular and adaptive provisioning framework. COBalD acts as the decision-making layer for managing heterogeneous resources, and TARDIS carries out its decisions by launching or terminating worker nodes according to current workload demand. This adaptive approach keeps utilization high and the system responsive under varying load.
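The scaling idea can be illustrated with a toy feedback rule in Python: lower the requested capacity when provisioned cores sit idle, raise it when almost all of them are claimed. This is a simplified sketch and does not use the COBalD or TARDIS APIs; the class `PoolState`, the function `adjust_demand`, and all thresholds are assumptions chosen for illustration, with only the 8-core step and 200-core ceiling taken from the description above.

```python
from dataclasses import dataclass


@dataclass
class PoolState:
    """Snapshot of a resource pool (hypothetical structure for this sketch)."""
    demand: float       # cores currently requested from the provider
    utilisation: float  # fraction of provisioned cores doing useful work, 0..1
    allocation: float   # fraction of provisioned cores claimed by jobs, 0..1


def adjust_demand(
    state: PoolState,
    step: float = 8.0,             # one worker node's worth of cores
    low_utilisation: float = 0.5,  # assumed threshold: below this, shrink
    high_allocation: float = 0.9,  # assumed threshold: above this, grow
    max_demand: float = 200.0,     # pool ceiling from the text
) -> float:
    """Return the new core demand after one control step."""
    if state.utilisation < low_utilisation:
        new_demand = state.demand - step  # resources are idle: release a node
    elif state.allocation > high_allocation:
        new_demand = state.demand + step  # pool is nearly full: request a node
    else:
        new_demand = state.demand         # within the comfort band: no change
    # Keep the demand inside the physically possible range.
    return max(0.0, min(new_demand, max_demand))


if __name__ == "__main__":
    print(adjust_demand(PoolState(demand=40, utilisation=0.2, allocation=0.3)))
    print(adjust_demand(PoolState(demand=40, utilisation=0.95, allocation=0.95)))
```

Changing demand by one node per step trades reaction speed for stability; a real deployment would tune the step and thresholds to its workload.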
To enhance efficiency and reproducibility, the system utilizes containerized environments via CVMFS and Apptainer, offering users pre-configured operating systems and software tailored for LHC Open Data analysis. This ensures flexibility, scalability, and consistency across computational tasks. Real-time monitoring and resource tracking are implemented using Prometheus, Node Exporter, and Grafana, providing insights into system performance and pool utilization.
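As an illustration of how such monitoring data might be consumed, the Python sketch below builds a Prometheus instant-query URL for a Node Exporter CPU metric and parses a response in Prometheus's JSON format. The server address, the chosen PromQL expression, and the sample response are assumptions made for this example; the actual EXPLORE dashboards and queries are not described here.

```python
import json
from urllib.parse import urlencode

# Hypothetical server address; the real EXPLORE endpoint is not given in the text.
PROMETHEUS_URL = "http://prometheus.example.org:9090"

# Busy CPU fraction per host, derived from Node Exporter's idle-time counter.
CPU_BUSY_QUERY = '1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))'


def build_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus HTTP API instant-query endpoint."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def busy_fraction_by_instance(response_body: str) -> dict:
    """Map each instance label to its busy CPU fraction from a query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload.get('error')}")
    # Each vector sample carries its labels and a [timestamp, "value"] pair.
    return {
        sample["metric"].get("instance", "unknown"): float(sample["value"][1])
        for sample in payload["data"]["result"]
    }


# Made-up response used only to demonstrate the parsing step.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "worker-01:9100"}, "value": [1700000000, "0.75"]},
        ],
    },
})

if __name__ == "__main__":
    print(build_query_url(PROMETHEUS_URL, CPU_BUSY_QUERY))
    print(busy_fraction_by_instance(SAMPLE_RESPONSE))
```

Separating URL construction from response parsing keeps the sketch runnable without network access; an actual client would fetch the URL and pass the response body to the parser.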
To support LHC Open Data analysis by the general public, an independent, standalone login node has been deployed. The PUNCH4NFDI project utilizes a federated Authentication and Authorization Infrastructure (AAI) based on OpenID Connect (OIDC) and the Helmholtz AAI, which serves as the standard access scheme for the broader PUNCH4NFDI platform.
To make the EXPLORE infrastructure more accessible, an alternative registration mechanism is available as an interim solution. It allows users to authenticate independently, without relying on third-party identity providers, and keeps access open while EXPLORE access is being integrated directly into the PUNCH4NFDI Portal as part of a long-term, sustainable access strategy.
Public users can register for access to EXPLORE at https://punchlogin.goegrid.gwdg.de/ using a valid email address.
After an initial alpha testing phase, the service entered beta testing with high school students participating in High-Energy Physics (HEP) Masterclasses in Lower Saxony. These tests led to optimizations in performance, accessibility, and usability. The infrastructure is now fully operational, supporting researchers, educators, students, and HEP enthusiasts in performing scalable and reproducible analysis of CERN Open Data.
EXPLORE promotes FAIR science by making CERN Open Data findable, accessible, interoperable, and reusable for a wide range of users. Through dynamic resource allocation, containerized environments, and open-access registration, the system fosters open, reproducible, and collaborative research. Resources are thus accessible and reusable not only for the core PUNCH community but also for broader groups such as non-HEP researchers, educators, and newcomers to high-energy physics.
This paper presents the technical architecture of the deployed infrastructure, including its integration with COBalD/TARDIS, access control mechanisms, and its operational impact since transitioning to production. Preliminary usage statistics, operational insights, and future directions for expanding interoperability and accessibility in research data infrastructure are also discussed.
Files
| Name | Size | Checksum |
|---|---|---|
| EXPLORE_CoRDI2025.pdf | 1.3 MB | md5:30b0495af1745d011dddddeb1e9499d0 |
Additional details
Funding
- Deutsche Forschungsgemeinschaft