Phishing Website Detection Using Deep Learning
Sushmita Prajapati¹, Nitish Kumar Ojha², Brajesh Raj³, Suryakant⁴
¹M.Tech Scholar, Computer Science and Engineering, IIMT University, Meerut
²,³Assistant Professor, Computer Science and Engineering, IIMT University, Meerut
⁴Professor, Computer Science and Engineering, IIMT University, Meerut
Abstract
Phishing is one of the most common forms of cybercrime and targets sensitive information such as email credentials, passwords, and bank details. Phishing websites are fake sites designed to look just like real ones, which makes them hard to detect. In this paper, we use deep learning techniques to detect phishing websites automatically. The model analyzes three parts of a website (URL-based features, HTML code, and a screenshot) to decide whether the site is safe or dangerous. The proposed system is a supervised deep learning model that uses Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to extract high-level semantic features. Evaluated on standard benchmark phishing datasets, it achieves a detection accuracy of over 97%, and in testing it outperforms traditional machine learning detection methods in terms of accuracy, recall, and precision.
In this study, a hybrid deep learning architecture combining Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks is developed. CNN layers extract local patterns and structural features from URLs, while LSTM layers capture sequential dependencies within URL characters and website textual data. The extracted features are then passed through dense (fully connected) layers that classify each website as either phishing or legitimate. The proposed model is trained on a large, balanced dataset collected from verified sources such as PhishTank, OpenPhish, and Alexa Top Sites to ensure robust performance.
The experimental results demonstrate that the proposed CNN-LSTM model achieves higher accuracy, precision, recall, and F1-score than traditional machine learning algorithms. The model also generalizes well, identifying zero-day phishing attacks that are not present in the training data.
The outcome of this research can significantly enhance web security systems by providing an automated, scalable, and intelligent approach to detect phishing websites in real time.
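The abstract notes that the CNN and LSTM layers operate on URL characters. One common way to prepare such input, shown here only as a minimal sketch and not as the authors' actual preprocessing, is to map each character to an integer ID and pad to a fixed length. The vocabulary, the reserved padding and unknown IDs, the maximum length of 200, and the name `encode_url` are all assumptions made for illustration.

```python
import string
from typing import List

# Hypothetical vocabulary: lowercase letters, digits, and URL punctuation.
# ID 0 is reserved for padding and ID 1 for characters outside the vocabulary.
VOCAB = string.ascii_lowercase + string.digits + "-._~:/?#[]@!$&'()*+,;=%"
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(VOCAB)}
PAD_ID, UNK_ID = 0, 1
MAX_LEN = 200  # assumed fixed input length, not taken from the paper


def encode_url(url: str, max_len: int = MAX_LEN) -> List[int]:
    """Map a URL to a fixed-length list of integer character IDs."""
    # Lowercase, truncate to max_len, and look up each character.
    ids = [CHAR_TO_ID.get(ch, UNK_ID) for ch in url.lower()[:max_len]]
    # Right-pad with PAD_ID so every encoded URL has the same length.
    return ids + [PAD_ID] * (max_len - len(ids))
```

A sequence produced this way would typically be fed to an embedding layer ahead of the convolutional and recurrent layers; the embedding size and layer configuration are not specified in the abstract.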
Keywords: Phishing Detection, Deep Learning, CNN, LSTM, Cyber Security, Machine Learning