Spam Message Filtering Using Machine Learning
Dharshanaa Sree T1, Gana Sri MS2, Swetha M. J3, Vishmitha T4, Thamaraiselvi K5
Department of CSE, School of Engineering, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore 18.
22ueo014@avinuty.ac.in, 22ueo016@avinuty.ac.in, 22ueo057@avinuty.ac.in, 22ueo062@avinuty.ac.in, thamaraiselvi_cse@avinuty.ac.in
ABSTRACT
The rapid growth of digital communication platforms has brought a sharp increase in spam messages, which pose serious security risks such as phishing attacks, malware distribution, fraudulent links, and intrusive advertisements. These risks degrade system reliability, user privacy, and overall communication efficiency. This mini-project presents the design and implementation of an intelligent spam message filtering system that automatically identifies and categorises spam across several input formats, including plain text messages, URLs, and uploaded documents. The proposed system combines Natural Language Processing (NLP) techniques with the Naive Bayes machine learning algorithm to detect the linguistic and probabilistic patterns commonly associated with spam content. Text pre-processing steps, namely tokenisation, normalisation, and stop-word removal, followed by TF-IDF feature extraction, are applied to reduce noise in the dataset and improve classification performance. A lightweight web application with an interactive, user-friendly interface provides real-time predictions. In experiments, the system outperforms traditional rule-based filtering techniques in accuracy, precision, and recall. The system is scalable and computationally efficient, and it can adapt to changing spam patterns. Overall, the proposed solution improves the security of digital communications, reduces unwanted content, and offers a practical, low-cost framework for automated spam detection in modern messaging environments.
Keywords: Web application, machine learning, TF-IDF, NLP, Naive Bayes, spam detection.
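
The pipeline summarised in the abstract (tokenisation, normalisation, stop-word removal, TF-IDF weighting, and Naive Bayes classification) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. The class name TfidfNaiveBayes, the short stop-word list, the six training messages, and the choice of a multinomial Naive Bayes variant with Laplace smoothing are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "the", "to", "is", "you", "your", "for",
              "of", "and", "at", "in", "on", "we", "are"}


def preprocess(text):
    """Normalise to lower case, tokenise on letters/digits, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


class TfidfNaiveBayes:
    """Hypothetical multinomial Naive Bayes over TF-IDF weights."""

    def fit(self, texts, labels, alpha=1.0):
        docs = [preprocess(t) for t in texts]
        n_docs = len(docs)
        # Smoothed inverse document frequency for every training term.
        df = Counter(term for doc in docs for term in set(doc))
        self.idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
        self.vocab = set(self.idf)
        # Accumulate TF-IDF mass per class and term.
        weights = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            for term, tf in Counter(doc).items():
                weights[label][term] += tf * self.idf[term]
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(n / n_docs) for c, n in class_counts.items()}
        # Laplace-smoothed log-likelihood of each term given each class.
        self.log_likelihood = {}
        for c in class_counts:
            total = sum(weights[c].values()) + alpha * len(self.vocab)
            self.log_likelihood[c] = {
                t: math.log((weights[c][t] + alpha) / total) for t in self.vocab
            }
        return self

    def predict(self, text):
        """Return the class with the highest posterior log-score."""
        term_counts = Counter(preprocess(text))
        scores = {}
        for c, prior in self.log_prior.items():
            score = prior
            for term, tf in term_counts.items():
                if term in self.vocab:  # terms unseen in training are ignored
                    score += tf * self.idf[term] * self.log_likelihood[c][term]
            scores[c] = score
        return max(scores, key=scores.get)


# Hypothetical toy training set, for illustration only.
train_texts = [
    "Win a free prize now, click this link",
    "Claim your free cash reward today",
    "Urgent: verify your account to win money",
    "Are we still meeting for lunch tomorrow",
    "Please review the project report before Monday",
    "Can you send the lecture notes tonight",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = TfidfNaiveBayes().fit(train_texts, train_labels)
print(model.predict("free prize, click now"))
print(model.predict("send the project notes before lunch"))
```

In a practical system, an established library implementation and a much larger labelled corpus would normally replace this hand-written version. The sketch only shows how the pre-processing output feeds the TF-IDF weights and how those weights enter the Naive Bayes score.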