Intelligent Auto-Scaling in Azure Kubernetes Service Using AI-Based Predictive Workload Models for Healthcare Applications
Shailaja Beeram
Sbeeram1@gmail.com
Abstract
In today's healthcare environments, especially those driven by digital transformation and remote patient engagement, infrastructure scalability is key to providing predictable access to services such as Electronic Health Records (EHR), telemedicine platforms, and health analytics dashboards. Conventional auto-scaling in cloud environments such as Azure Kubernetes Service (AKS) relies heavily on reactive thresholds, mostly CPU or memory utilization, to provision and deprovision containers. These approaches typically struggle with the unpredictable, high-variance traffic characteristic of healthcare (e.g., telehealth spikes during flu season or pandemics).
This paper outlines an intelligent auto-scaling architecture that uses AI-driven predictive models, specifically Long Short-Term Memory (LSTM) networks and Facebook’s Prophet, to forecast system demand. These models process historical usage data from a simulated telehealth application and automatically invoke Kubernetes Event-driven Autoscaling (KEDA) for adaptive pod management. We present a realistic case study that emulates traffic surges in a digital health application and compares the system against the conventional Horizontal Pod Autoscaler (HPA). Our results demonstrate a 35% improvement in response latency and a 22% reduction in cloud compute cost while sustaining 100% uptime. The architecture meets the twin needs of resilience and cost-effectiveness in high-priority healthcare infrastructure, positioning it as a building block for intelligent, scalable healthcare cloud solutions.
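To make the predictive step concrete, the following is a minimal sketch of how a demand forecast might be turned into a pod count before it is handed to an autoscaler. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function name `replicas_for_forecast`, the per-pod capacity `rps_per_pod`, the `headroom` safety margin, and the replica bounds are all hypothetical, and the forecast value stands in for the output of an LSTM or Prophet model.

```python
import math


def replicas_for_forecast(forecast_rps, rps_per_pod, headroom=0.2,
                          min_replicas=1, max_replicas=20):
    """Translate a forecast request rate into a bounded pod count.

    All parameters are illustrative assumptions:
      forecast_rps  - predicted requests per second for the next window
      rps_per_pod   - requests per second one pod is assumed to sustain
      headroom      - fractional safety margin added on top of the forecast
      min_replicas / max_replicas - hypothetical scaling bounds
    """
    if rps_per_pod <= 0:
        raise ValueError("rps_per_pod must be positive")
    if forecast_rps < 0:
        raise ValueError("forecast_rps must be non-negative")

    # Pods needed to serve the forecast plus the safety margin, rounded up.
    needed = math.ceil(forecast_rps * (1 + headroom) / rps_per_pod)

    # Clamp to the configured bounds so a bad forecast cannot scale to zero
    # or exceed the cluster budget.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # Hypothetical forecast of 100 requests/s with pods that handle 50 each.
    print(replicas_for_forecast(100, 50))
```

In a deployment along the lines described above, a value like this could be published as a metric that the autoscaler reads, so pods are provisioned ahead of the predicted surge rather than after CPU or memory thresholds are crossed. The clamping step is a design choice that keeps an erroneous forecast from removing all capacity.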
Keywords
Azure Kubernetes Service (AKS), Auto-scaling in cloud computing, Predictive scaling models, Healthcare cloud infrastructure, LSTM workload prediction, Prophet time-series forecasting, KEDA (Kubernetes Event-driven Autoscaling), Telehealth optimization, AI in healthcare operations, Cloud-native architecture, HIPAA-compliant cloud solutions, MLOps in cloud scaling, Latency reduction in healthcare IT, Cost-efficient cloud resource management, Real-time digital health platforms