Published July 12, 2024 | Version v2
Presentation Open

Real-World Benchmarks for FAIR Data Repositories: Meeting the Needs for Modern Open Data

  • 1. Fedora
  • 2. Texas Advanced Computing Center, University of Texas at Austin

Description

Data repositories are fundamental infrastructure in the open science ecosystem; however, traditional repository systems now face the challenge of keeping pace with the exponentially increasing scale of modern research data production. Currently, there is limited understanding of how an implementation of Fedora 6.x in a High Performance Computing (HPC) environment may influence the data scalability and functional efficiency of the repository.

This presentation will provide an in-depth look at the ongoing collaboration between the Fedora program team and the data-intensive computing and cloud developers at the Texas Advanced Computing Center (TACC) to address the performance and scalability limits of Fedora 6.x in an HPC environment. The results of this collaboration will give both Fedora users and the repository community at large a better understanding of the scalability of a repository environment and how to assess it systematically. These performance metrics will allow data repository technical, curatorial, and administrative staff to understand how to optimize their infrastructure to meet the demand for managing and accessing large open data.

Files

Real-World Benchmarks for FAIR Data Repositories - Griffith & Field.pdf