Published April 19, 2026 | Version v1
Journal article Open

End-to-End Machine Learning Data Pipeline for Telecom Customer Churn Prediction

  • 1. Sreenidhi Institute of Science and Technology

Description

Predictive analytics has become central to modern telecommunications, particularly for managing customer churn proactively. By identifying high-risk subscribers in real time, providers can shift from reactive troubleshooting to targeted retention, reducing revenue loss and increasing customer lifetime value. This project introduces a Customer Intelligence and Risk Optimization Platform, an AI-driven solution designed to be accessible yet technically robust. At its core, the system uses an Extreme Gradient Boosting (XGBoost) model to capture complex, nonlinear relationships among features such as customer tenure, billing patterns, and service subscriptions.
The platform follows a modular microservice architecture designed for straightforward deployment and scalability. The trained XGBoost model runs as an inference service behind a REST API built with FastAPI, which accepts live, structured JSON requests. To keep the system portable across infrastructures, the entire environment is containerized with Docker. On the front end, users interact with a SaaS-style dashboard built with Streamlit. The dashboard presents customer risk in real time through color-coded classifications (Low, Medium, and High) and animated probability bars, making model output immediately understandable for stakeholders.
To bridge the gap between raw predictions and business action, the platform integrates a Large Language Model (LLM) to improve interpretability and decision-making. Rather than returning only a numerical score, the system includes a conversational AI assistant that generates contextual, "pro-retention" strategies based on the model's result for a given customer. These suggestions help stakeholders translate predictive insights into personalized customer outreach.
By combining high-performance gradient boosting with conversational AI and real-time visualization, this architecture offers a comprehensive bridge between machine learning and practical customer service operations. The paper presents a complete, nine-stage system designed to turn complex predictions into practical action through an integrated data pipeline, interactive dashboard, and AI-powered assistant.
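The mapping from a predicted churn probability to the Low, Medium, and High classes shown on the dashboard can be illustrated with a minimal sketch. The description does not state the tier boundaries or the response schema, so the cut-offs (0.33 and 0.66), the field names, and the helper names `risk_level` and `build_response` are all assumptions made for illustration, not the authors' implementation.

```python
import json
from dataclasses import asdict, dataclass

# Assumed cut-offs: the description names three tiers but not their boundaries.
LOW_MAX = 0.33
MEDIUM_MAX = 0.66


@dataclass
class ChurnResponse:
    """Hypothetical JSON payload an inference service might return."""
    churn_probability: float
    risk_level: str


def risk_level(probability: float) -> str:
    """Map a churn probability in [0, 1] to a Low/Medium/High tier."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if probability < LOW_MAX:
        return "Low"
    if probability < MEDIUM_MAX:
        return "Medium"
    return "High"


def build_response(probability: float) -> str:
    """Serialize the probability and its tier as a JSON string."""
    response = ChurnResponse(round(probability, 4), risk_level(probability))
    return json.dumps(asdict(response))
```

In a deployment like the one described, the probability would come from the trained XGBoost model and the JSON string would be the body returned by the FastAPI endpoint. Keeping the thresholds in named constants makes them easy to tune against the cost of a missed churner versus an unnecessary retention offer.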

Files

end-to-end-machine-learning-data-pipeline-for-telecom-customer-churn-prediction-IJERTV15IS041350.pdf
