Email Spam Detection Using LSTM with Attention Mechanism
Allam Bala Praneeth Reddy1, Karrepu Hema Charan Reddy2, Marujolla Nihar Reddy3
12310030431@klh.edu.in, 22310030419@klh.edu.in, 32310030441@klh.edu.in
1Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Hyderabad-500075, Telangana, India.
2Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Hyderabad-500075, Telangana, India.
3Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Hyderabad-500075, Telangana, India.
Abstract: Spam email detection is an important area of cybersecurity because spam emails often contain phishing URLs, malicious attachments, or fraudulent advertising content. These emails not only disrupt communication but also pose serious threats such as data theft, financial loss, and system compromise. Conventional rule-based spam filters have proven ineffective against these threats because they adapt poorly to changing spam tactics. In this research, we propose a deep learning-based approach to spam filtering that combines a Long Short-Term Memory (LSTM) network with an attention mechanism. The model is designed to capture sequential and contextual information from email content by emphasizing key words, such as "urgent," "free," or "win," that frequently occur in spam. A standard dataset of pre-labelled spam and non-spam (ham) emails was used for training and testing. The data were preprocessed through tokenization, padding, and vectorization. The model architecture consists of an embedding layer, an LSTM layer that learns temporal dependencies, and a dedicated attention layer that identifies contextually important words. The final classification output is produced by a fully connected layer with a sigmoid activation function. Experimental results show that the LSTM with attention outperforms standard machine learning algorithms in accuracy, precision, recall, and F1-score. The model also supports real-time classification, allowing users to submit emails and receive instant predictions. This paper demonstrates the merits of combining LSTM with attention mechanisms for robust and adaptive spam email detection, and paves the way for future improvements using transformer models or multilingual data.
Keywords
Spam Detection, Email Spam Filtering, LSTM with Attention, Machine Learning, Email Classification, Cybersecurity.
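The preprocessing and attention steps summarized in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. All function names, the reserved ids, and the toy emails are hypothetical. The learned embedding and LSTM layers are replaced by hand-written stand-in hidden states, and the dot-product scoring against a query vector is an assumption, since the abstract does not specify the form of attention used.

```python
import math
import re

PAD_ID = 0  # assumed id reserved for padding positions
UNK_ID = 1  # assumed id for words not seen when building the vocabulary


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(texts):
    """Give each distinct token an integer id, in order of first appearance."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab) + 2)  # ids 0 and 1 are reserved
    return vocab


def vectorize(text, vocab, max_len):
    """Map tokens to ids, then truncate or right-pad to exactly max_len."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokenize(text)][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Score each time step by its dot product with `query`, normalize the
    scores with softmax, and return the weighted sum of the hidden states."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return context, weights


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Toy data (hypothetical), used only to exercise the functions above.
emails = ["Win a FREE prize now", "Meeting at noon tomorrow"]
vocab = build_vocab(emails)
padded = [vectorize(e, vocab, max_len=6) for e in emails]

# Stand-in for LSTM outputs: three time steps, hidden size 2.
hidden = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
context, weights = attention_pool(hidden, query=[1.0, 1.0])

# Stand-in for the fully connected layer: fixed weights, no bias.
prob = sigmoid(sum(c * w for c, w in zip(context, [0.5, 0.5])))
```

In this toy run the second time step receives the largest attention weight because its hidden state has the largest dot product with the query. In a trained model the query and the dense-layer weights would be learned, and padded positions would typically be masked before the softmax, which this sketch omits.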