Automated Drift Detection and Retraining Pipeline for ML Models
Dr. Abhay A. Deshpande
Department of Electronics and Communication
R.V. College of Engineering
Bengaluru, India abhayadeshpande@rvce.edu.in
Pramod Mattihalli
Electronics and Communication
R.V. College of Engineering
Bengaluru, India pramodm.ec21@rvce.edu.in
Rushil S Kumar
Electronics and Communication
R.V. College of Engineering
Bengaluru, India rushilskumar.ec21@rvce.edu.in
Vamsheesh K K
Electronics and Communication
R.V. College of Engineering
Bengaluru, India vamsheeshkk.ec21@rvce.edu.in
Namith G
Electronics and Communication
R.V. College of Engineering
Bengaluru, India namithgg.ec21@rvce.edu.in
Abstract—Machine learning (ML) models deployed in dynamic, real-world environments are susceptible to performance degradation over time due to concept drift, the phenomenon in which the underlying data distribution changes after deployment. This poses significant challenges to maintaining model reliability and predictive accuracy in production systems. In this project, we propose a fully automated pipeline for drift detection and model retraining, designed to sustain model performance with minimal human intervention. The pipeline uses the statistical drift monitoring techniques provided by Evidently AI to detect distributional changes in incoming data and to generate actionable drift reports. A Boolean logic-based trigger mechanism initiates retraining on recent data only when the detected drift is significant and also impacts model performance. The retrained model is then evaluated and compared against the incumbent model. If the retrained model demonstrates improved performance, it is deployed into production using a controlled update strategy. The system is modular and scalable, and is designed to integrate with existing MLOps workflows. This automated approach reduces operational overhead and improves model resilience, making it well suited for applications in real-time analytics, IoT, and adaptive decision systems.
Index Terms—Concept drift, model retraining, automated machine learning, drift detection, MLOps, real-time monitoring, adaptive learning systems, Evidently AI, data stream mining, performance-aware retraining, model lifecycle management, edge computing, unsupervised drift detection.
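As an illustration of the trigger logic summarized in the abstract, the following is a minimal Python sketch, not the implementation used in this work. It assumes a hand-written two-sample Kolmogorov-Smirnov statistic as a stand-in for the drift score that Evidently AI would report, and the function names (`ks_statistic`, `should_retrain`, `select_model`) and all threshold values are hypothetical choices made only for this example.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples (0.0 = identical samples, 1.0 = fully separated).
    Stand-in for a drift score; not the Evidently AI API.
    """
    ref, cur = sorted(reference), sorted(current)
    return max(
        abs(bisect_right(ref, x) / len(ref) - bisect_right(cur, x) / len(cur))
        for x in ref + cur
    )


def should_retrain(drift_score, drift_threshold,
                   baseline_accuracy, current_accuracy, max_accuracy_drop):
    """Boolean trigger: retrain only if drift is significant AND
    performance has degraded beyond the tolerated drop.
    Thresholds are assumed values, chosen by the operator."""
    drift_detected = drift_score > drift_threshold
    performance_degraded = (baseline_accuracy - current_accuracy) > max_accuracy_drop
    return drift_detected and performance_degraded


def select_model(incumbent_score, candidate_score):
    """Keep the incumbent unless the retrained candidate is strictly better."""
    return "candidate" if candidate_score > incumbent_score else "incumbent"


if __name__ == "__main__":
    reference = list(range(10))      # feature values seen at training time
    current = list(range(10, 20))    # shifted values seen in production
    score = ks_statistic(reference, current)
    if should_retrain(score, 0.3, baseline_accuracy=0.90,
                      current_accuracy=0.80, max_accuracy_drop=0.05):
        # Retraining would happen here; the scores below are placeholders.
        print(select_model(incumbent_score=0.80, candidate_score=0.88))
```

Combining the two conditions with a logical AND reflects the stated goal of retraining only when drift is both significant and harmful to performance; drift on its own, or a performance dip without drift, does not fire the trigger in this sketch.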