Income Tax Fraud Detection with XGBoost and Real-Time ID Authentication
Rushi Dave1, Prof. Dr. Ruhin Kousar2
1-2 School of Computer Science and Engineering & Information Sciences, Presidency University, Bangalore, Karnataka
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract
This paper explores the integration of Artificial Intelligence (AI) and Machine Learning (ML) to address the growing challenge of income tax fraud. Tax evasion poses a significant threat to financial systems, and this study argues for using these technologies for early detection and prevention.
The research focuses on developing predictive models using supervised learning algorithms, including Linear Regression, Decision Trees, Random Forest, Support Vector Machines (SVM), k-Nearest Neighbors (KNN), Gradient Boosting, Neural Networks, and XGBoost. Feature engineering techniques, such as label encoding and standardization, are employed to optimize model performance. Exploratory data analysis, outlier detection, and correlation analysis ensure dataset quality, while model evaluations using metrics like Mean Squared Error and R-squared provide insights into accuracy and reliability.
A user-friendly interface, implemented with Streamlit, lets users enter financial parameters for fraud detection. The system also verifies the authenticity of government-issued identification, flagging IDs that are not genuine. This added capability makes the system more robust in detecting and mitigating income tax fraud.
XGBoost is the best-performing model, with an accuracy of 0.9973 against an average accuracy of 0.7437 across the other models. This research demonstrates the feasibility and effectiveness of combining predictive analytics with real-time authenticity verification to strengthen financial systems against fraudulent activity.
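The preprocessing and evaluation steps named above (label encoding, standardization, Mean Squared Error, and R-squared) can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not show. The function names, the `occupation` and `income` columns, and all values are hypothetical and chosen only for illustration. The sketch uses plain Python so that each definition is visible.

```python
from statistics import mean, pstdev


def label_encode(values):
    """Map each distinct category to an integer, assigned in sorted order."""
    classes = sorted(set(values))
    mapping = {c: i for i, c in enumerate(classes)}
    return [mapping[v] for v in values], mapping


def standardize(values):
    """Rescale to zero mean and unit population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no signal; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; assumes y_true is not constant."""
    mu = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mu) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data (not taken from the paper's dataset).
occupation = ["salaried", "business", "salaried", "retired"]
income = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

encoded, mapping = label_encode(occupation)
scaled = standardize(income)

# Hypothetical fraud labels and model scores.
y_true = [0, 1, 1, 0]
y_pred = [0.1, 0.9, 0.8, 0.2]
mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
```

On this toy data the categories are encoded as `business` = 0, `retired` = 1, `salaried` = 2, and the standardized income column has mean 0 and standard deviation 1. In practice a library such as scikit-learn would typically supply these steps; the plain functions here only make the arithmetic explicit.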
Key Words - Income Tax Fraud Detection; Artificial Intelligence (AI); Machine Learning (ML); Predictive Models; Decision Trees; Random Forest; Support Vector Machine (SVM); k-Nearest Neighbors (KNN); Anomaly Detection; Gradient Boosting; Authenticity Verification.