Published May 31, 2025 | Version v1

Transforming healthcare through cloud-native machine learning architecture: A case study in AWS, Spark, and Kubernetes Implementation

  • 1. Komodo Health, USA.

Description

This article examines a transformative case study in healthcare data infrastructure, where a skilled data engineer revolutionized operations by implementing an integrated technology stack with advanced machine learning capabilities. Facing challenges of processing diverse and voluminous patient data, the engineer architected a comprehensive solution leveraging AWS services, including S3, Redshift, and Lambda to create a cloud-based data lake optimized for AI workloads. This foundation was augmented with Apache Spark for distributed processing and MLlib for scalable machine learning, Hadoop clusters for specialized workloads, and Kubernetes for container orchestration—creating a flexible, resilient system capable of supporting sophisticated predictive models. The implementation featured automated ETL processes within a robust data pipeline alongside purpose-built feature stores and model serving infrastructure. A strategic combination of SQL and NoSQL databases provided flexible storage solutions optimized for various machine learning algorithms, from natural language processing for clinical notes to computer vision for medical imaging. Despite obstacles including data inconsistency and latency issues, the solution delivered substantial improvements in operational efficiency and clinical outcomes through AI-powered predictive capabilities, demonstrating the transformative potential of modern data engineering and machine learning approaches in healthcare settings. 

Files

WJARR-2025-1649.pdf

Files (564.1 kB)

Name Size Download all
md5:d4034a6d82de0a9a8701b921134b5326
564.1 kB Preview Download

Additional details