Published June 10, 2025 | Version 1.0
Technical note Open

Tech Notes: Evaluation of Scalable Solutions for Time Series Database Streaming

  • 1. University of Utah
  • 2. Utah State University
  • 3. University of North Carolina at Chapel Hill
  • 4. Renaissance Computing Institute
  • 5. University of Notre Dame
  • 6. California Institute of Technology
  • 7. LIGO Scientific Collaboration

Description

This Tech Note presents an evaluation of scalable solutions for streaming time-series data, critical for real-time analysis in large-scale national research facilities like the NSF Laser Interferometer Gravitational-Wave Observatory (LIGO). The study assesses various time-series databases (ClickHouse, InfluxDB, TimescaleDB) and communication protocols (Kafka, Arrow Flight), focusing on query performance, data ingestion, and scalability. ClickHouse and Kafka emerged as preferred solutions, providing high performance and flexibility for environments with large-scale data requirements. The evaluation is based on use cases from facilities like LIGO, aiming to improve real-time data processing capabilities in NSF Major Facilities.

Notes (English)

This project is supported by the U.S. National Science Foundation Office of Advanced Cyberinfrastructure in the Directorate for Computer and Information Science and Engineering under Grant #2127548.

Files

Tech Notes_LIGO Data Streaming - Final.pdf

Files (1.0 MB)

md5:393911786d1568b2793e0f0563da37d0

Additional details

Funding

U.S. National Science Foundation
CI CoE: CI Compass: An NSF Cyberinfrastructure (CI) Center of Excellence for Navigating the Major Facilities Data Lifecycle (Award 2127548)