Published February 3, 2026 | Version v1
Technical note Open

Data-Centric Reliability in Big Data Systems: An End-to-End Framework for Data Quality and Observability

Authors/Creators

  • 1. Amazon Web Services

Description

Data-centric reliability has emerged as a critical concern in modern big data systems, where the quality and trustworthiness of data directly impact analytical outcomes, machine learning model performance, and business decision-making. This technical note presents a framework that addresses two challenges together: establishing theoretical foundations for data quality assessment, and providing practical implementation strategies for end-to-end observability pipelines in production environments.
We synthesize insights from extensive peer-reviewed literature spanning theoretical frameworks, practical implementations, and real-world case studies to present a unified approach that bridges academic research and industry practice.
The framework introduces a multi-layered architecture integrating four core data quality dimensions—accuracy, completeness, consistency, and timeliness—with observability mechanisms across ingestion, processing, storage, and consumption layers. We establish formal definitions and measurement methodologies for each quality dimension while providing framework-agnostic principles that enable portability across diverse technology stacks. The practical implementation strategies encompass technology selection criteria, design patterns (Lambda, Kappa, microservices, data mesh), and deployment approaches for production environments.
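As an illustration of how the four quality dimensions could be turned into measurable scores, here is a minimal Python sketch. It is not taken from the technical note: the formal definitions live in the full PDF, so every function name, the record layout, the reference values, and the one-hour freshness window below are assumptions chosen only to make the idea concrete. Each score is simply the fraction of checks that pass.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical scoring functions; each returns a fraction in [0, 1].
# These definitions are illustrative assumptions, not the note's formal ones.

def completeness(records, required):
    """Share of required fields that are present and non-null."""
    total = len(records) * len(required)
    if total == 0:
        return 1.0
    filled = sum(1 for r in records for f in required if r.get(f) is not None)
    return filled / total

def accuracy(records, field, reference, key="id"):
    """Share of records whose value matches a trusted reference value."""
    checked = [r for r in records if r.get(key) in reference]
    if not checked:
        return 1.0
    hits = sum(1 for r in checked if r.get(field) == reference[r[key]])
    return hits / len(checked)

def consistency(records, rule):
    """Share of records that satisfy a per-record rule."""
    if not records:
        return 1.0
    return sum(1 for r in records if rule(r)) / len(records)

def timeliness(records, ts_field, now, max_age):
    """Share of records whose timestamp is no older than max_age."""
    if not records:
        return 1.0
    fresh = sum(
        1 for r in records
        if r.get(ts_field) is not None and now - r[ts_field] <= max_age
    )
    return fresh / len(records)

# Made-up sample data, for demonstration only.
now = datetime(2026, 2, 3, 12, 0, tzinfo=timezone.utc)
records = [
    {"id": 1, "temp_c": 21.5, "unit": "C", "ts": now - timedelta(minutes=5)},
    {"id": 2, "temp_c": None, "unit": "C", "ts": now - timedelta(hours=2)},
    {"id": 3, "temp_c": 19.0, "unit": "F", "ts": now - timedelta(minutes=10)},
    {"id": 4, "temp_c": 22.0, "unit": "C", "ts": now - timedelta(minutes=1)},
]
reference = {1: 21.5, 3: 18.0, 4: 22.0}  # assumed ground-truth readings

report = {
    "completeness": completeness(records, ["temp_c", "unit"]),
    "accuracy": accuracy(records, "temp_c", reference),
    "consistency": consistency(records, lambda r: r["unit"] == "C"),
    "timeliness": timeliness(records, "ts", now, timedelta(hours=1)),
}
print(report)
```

Because the functions depend only on plain Python dictionaries, the same checks could in principle run at any of the four layers (ingestion, processing, storage, consumption), which is one way to read the "framework-agnostic" claim. A production system would more likely express these checks in whatever engine the pipeline already uses.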
Through four detailed case studies spanning mobile network analytics, cloud-based distributed databases, industrial IoT platforms, and smart building applications, we demonstrate measurable improvements in system reliability (up to 99.7%), data quality scores (96%), and operational efficiency (65% team productivity gains). Comprehensive benchmarking establishes performance baselines and evaluation metrics including throughput, latency, quality assessment scores, and business impact measures. The framework achieves a 340% ROI across implementations with significant reductions in data incidents and operational costs.
This work contributes to both theoretical understanding and practical application of data-centric reliability, offering researchers a rigorous foundation for further investigation while providing practitioners with actionable guidance for implementing robust quality and observability solutions in production big data systems.
 

Files

zenodo_submission_data_centric_reliability.pdf (6.2 MB)
md5:debcbe867148e1d79bf61f84f241da9c

Additional details

Identifiers

Other
OSF

Dates

Created
2026-02-03