AIR QUALITY MONITORING AND PREDICTIVE HEALTH RISK SYSTEM USING ENSEMBLE LEARNING
PALETI DEEPTHI¹, KUNCHALA VARA BRAHMA REDDY², MAHA LAKSHMI N T³,
PANDARABOINA BHARADWAJ⁴, CHOPPARA MANGAMMA⁵
¹Student, Department of CSE, Bapatla Engineering College, Bapatla 522101, AP, India
²Student, Department of CSE, Bapatla Engineering College, Bapatla 522101, AP, India
³Student, Department of CSE, Bapatla Engineering College, Bapatla 522101, AP, India
⁴Student, Department of CSE, Bapatla Engineering College, Bapatla 522101, AP, India
⁵Assistant Professor, Department of CSE, Bapatla Engineering College, Bapatla 522101, AP, India
Corresponding author. E-mail: deepthipaleti13@gmail.com
Abstract— Air pollution is one of the most critical public health threats of the modern era, contributing to millions of premature deaths globally each year. Existing AQI monitoring systems are largely reactive: they report current conditions but do not provide forward-looking, health-contextualized, or language-accessible alerts. This paper presents Air-O-Health, a web-based intelligent system that integrates real-time AQI retrieval, ensemble machine learning-based AQI forecasting, dual-standard health advisory generation (US-EPA and India-NAQI), and an AI-powered multilingual chatbot. The system retrieves live AQI data from the World Air Quality Index (WAQI) API for 27 major Indian cities and predicts AQI several months ahead using a Voting Regressor ensemble of Random Forest and XGBoost models. The models were trained on a merged dataset spanning 2015-2024 that contains 47,796 records from the same 27 cities and covers six key pollutants: PM2.5, PM10, NO2, CO, SO2, and O3. Preprocessing centers on city-specific median imputation and historical monthly averaging. The FastAPI-powered backend supports piecewise-linear pollutant-to-AQI conversion under both the Indian NAQI and US-EPA standards. In evaluation, the ensemble achieves an R² score of 98.16%, compared with 99.85% for XGBoost alone and 93.77% for Random Forest alone. The system runs on commodity hardware and is suited to smart city integration, driver advisory modules, and public health monitoring.
Keywords— AQI Prediction, Air Quality Index, Ensemble Machine Learning, XGBoost, Random Forest, FastAPI, Health Advisory, Smart Cities, Pollutant Forecasting, India-NAQI, US-EPA
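As an illustration of the piecewise-linear pollutant-to-AQI conversion mentioned in the abstract, the following minimal Python sketch maps a 24-hour PM2.5 concentration to an India-NAQI sub-index by linear interpolation within the matching breakpoint band. It is a sketch under stated assumptions, not the system's actual code: the names `NAQI_PM25` and `sub_index` are hypothetical, the breakpoint table follows the commonly published CPCB bands and should be verified against the official tables, and the band edges are made contiguous and the top band is capped at 380 µg/m³ for simplicity.

```python
# Hypothetical breakpoint table: (C_low, C_high, I_low, I_high) for
# 24-hour PM2.5 in µg/m³ under India NAQI. Assumed values; verify against
# the official CPCB tables before use. Band edges are made contiguous
# (30.0 rather than 31) so that no concentration falls into a gap.
NAQI_PM25 = [
    (0.0, 30.0, 0, 50),       # Good
    (30.0, 60.0, 51, 100),    # Satisfactory
    (60.0, 90.0, 101, 200),   # Moderate
    (90.0, 120.0, 201, 300),  # Poor
    (120.0, 250.0, 301, 400), # Very Poor
    (250.0, 380.0, 401, 500), # Severe (upper edge is an assumption)
]


def sub_index(conc, breakpoints):
    """Linearly interpolate a pollutant concentration to an AQI sub-index.

    Within the band [C_low, C_high] the sub-index is
        I_low + (I_high - I_low) * (conc - C_low) / (C_high - C_low).
    Concentrations above the last band are capped at its top index.
    """
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if conc <= c_hi:
            return i_lo + (i_hi - i_lo) * (conc - c_lo) / (c_hi - c_lo)
    return float(breakpoints[-1][3])


if __name__ == "__main__":
    for c in (15, 45, 100, 1000):
        print(c, round(sub_index(c, NAQI_PM25), 1))
```

The same function can serve the US-EPA standard by passing a different breakpoint table. Under both standards the overall AQI is conventionally reported as the maximum of the per-pollutant sub-indices.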