Health Prediction System
Mr. Pradip Bhakare1, Aditya Arabt2, Ayush Ingel3, Ankush Kharate4, Nikhli Chitode5
1Asst. Professor, Department of ENTC, MGICOET Shegaon, Maharashtra.
2,3,4,5 Students, Department of ENTC, MGICOET Shegaon, Maharashtra.
***
Abstract - Chronic diseases such as diabetes, heart disease, breast cancer, and Parkinson’s disease remain major global health challenges, and early detection is crucial to reducing long-term health risks. This paper presents a Health Prediction System that combines machine learning (ML) with web technologies to deliver real-time disease risk assessments. A dual-input mechanism lets users either enter clinical parameters manually or upload scanned medical reports, from which values are extracted automatically using Tesseract Optical Character Recognition (OCR) with OpenCV preprocessing. The system has a modular design with four dedicated ML models: Support Vector Machine (SVM) for diabetes and Parkinson’s, and Logistic Regression for heart disease and breast cancer. The models were trained on established public datasets: the PIMA Indian Diabetes dataset, the UCI Cleveland Heart dataset, the Wisconsin Breast Cancer dataset, and the Parkinson’s voice dataset. A Flask backend handles input routing and prediction, with an average response time of 200–300 milliseconds. The web interface, built with HTML, CSS, and JavaScript, provides immediate feedback and validates inputs to ensure data integrity. Performance was evaluated using accuracy, F1-score, and ROC-AUC; the breast cancer model achieved the highest accuracy (97.1%), while the Parkinson’s and heart disease models also delivered strong results. Designed for offline use, the system prioritizes user privacy and accessibility. This paper details the system's design and implementation and discusses its applications in home healthcare, mobile clinics, and underserved communities, demonstrating the potential of ML and OCR to support early diagnosis and health awareness.
Key Words: Health Prediction, Machine Learning, OCR, Flask, SVM, Diabetes, Breast Cancer, Parkinson’s
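The dual-input flow summarized in the abstract (extract values from report text, validate them, then route to the disease-specific model) can be illustrated with a minimal, standard-library-only sketch. It is not the system's actual code: the function names, the `Name: value` report layout, and the field ranges are assumptions made for illustration. Only the disease-to-model mapping is taken from the abstract, and the OCR step itself is represented by a plain string standing in for Tesseract output.

```python
import re

# Disease-to-model mapping as stated in the abstract.
MODEL_FAMILY = {
    "diabetes": "SVM",
    "parkinsons": "SVM",
    "heart": "Logistic Regression",
    "breast_cancer": "Logistic Regression",
}

# Hypothetical subset of diabetes inputs with assumed plausibility ranges
# (illustrative only; the paper does not list fields or limits here).
DIABETES_FIELDS = {
    "Glucose": (0.0, 500.0),
    "BMI": (0.0, 80.0),
    "Age": (0.0, 120.0),
}


def parse_report_text(text, fields):
    """Extract 'Name: value' or 'Name = value' pairs for the requested fields."""
    values = {}
    for name in fields:
        pattern = rf"\b{re.escape(name)}\s*[:=]\s*(-?\d+(?:\.\d+)?)"
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            values[name] = float(match.group(1))
    return values


def validate(values, fields):
    """Return a list of error messages; an empty list means the input is usable."""
    errors = []
    for name, (low, high) in fields.items():
        if name not in values:
            errors.append(f"missing field: {name}")
        elif not low <= values[name] <= high:
            errors.append(f"{name} out of range: {values[name]}")
    return errors


def route(disease, values, fields):
    """Validate the input and report which model family would handle it."""
    if disease not in MODEL_FAMILY:
        raise ValueError(f"unsupported disease: {disease}")
    return {
        "model": MODEL_FAMILY[disease],
        "errors": validate(values, fields),
        "inputs": values,
    }
```

In a deployment along the lines the abstract describes, `route` would sit behind a Flask endpoint and hand validated inputs to a trained classifier; here it only reports the model family, so the sketch runs without any trained model or OCR engine installed.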