PREDICTIVE MODELLING FOR DIABETES USING MACHINE LEARNING
Shriya Aishani Rachakonda1, Srinidhi Pudipedi2, T.S. Shiny Angel3
1Department of Computational Intelligence, SRM Institute of Science and Technology
2Department of Computing Technologies, SRM Institute of Science and Technology
3Department of Computational Intelligence, SRM Institute of Science and Technology
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Diabetes is a prevalent chronic disorder with significant health consequences, and prompt, accurate diagnosis is important for managing it effectively. This study uses machine learning algorithms to predict the occurrence of diabetes from a dataset obtained from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The diagnostic features include the number of pregnancies, insulin level, age, body mass index (BMI), and other health measurements. We apply several supervised classification methods, namely Logistic Regression, Support Vector Machines (SVM), Decision Trees, k-Nearest Neighbours (k-NN), and Random Forest, to build a reliable predictive model. The study involves data preprocessing, feature selection, and model training aimed at accurate and dependable predictions. Accuracy, precision, recall, F1-score, and the Area Under the Receiver Operating Characteristic Curve (AUC) are used to assess the performance of each algorithm. The objective of this research is to support the identification and treatment of diabetes at an early stage and thereby improve the effectiveness of healthcare interventions. More broadly, this work aims to advance predictive modelling for diabetes using machine learning techniques.
Key Words: Diabetes Prediction, Machine Learning, Logistic Regression, Support Vector Machines, Decision Trees, k-Nearest Neighbours, Random Forest, Predictive Modelling, National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
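For readers less familiar with the performance indicators named in the abstract, the following is a minimal sketch of how they can be computed for a binary classifier. It is an illustration only and not the evaluation code used in this study: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions introduced here. AUC is computed with the pairwise-ranking definition (the fraction of positive/negative pairs in which the positive case receives the higher score), which is equivalent to the area under the ROC curve.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = diabetic, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true outcomes and model scores for eight patients.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Unlike the other four indicators, AUC is computed from the raw scores rather than the thresholded predictions, so it does not depend on the choice of decision threshold.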