Multi-Modal Features Representation-Based Convolutional Neural Network Model for Malicious Website Detection
1 Ms. Renuka B N 2 Priyanka N C
1Assistant Professor, Department of MCA, BIET, Davangere
2Student, 4th Semester MCA, Department of MCA, BIET, Davangere
ABSTRACT: Web applications have thrived across numerous business sectors, serving as essential tools for many users in their daily activities. However, a number of these applications are malicious, which poses a major threat to Internet users because they can steal sensitive information, install malware, and propagate spam. Detecting malicious websites by analysing web content is often ineffective due to the difficulty of extracting representative features, the huge data volume, the evolving nature of malicious patterns, the stealthy nature of the attacks, and the limitations of traditional classifiers. Uniform Resource Locator (URL) features, by contrast, are static and can often provide immediate insights about a website without the need to load its content. This study proposes a multimodal representation approach that fuses textual and image-based features to improve the performance of malicious website detection. Textual features help the deep learning model understand and represent detailed semantic information related to attack patterns, while image features are effective in recognizing more general malicious patterns. As a result, patterns that are hidden in the textual format may be recognizable in the image format. Two Convolutional Neural Network (CNN) models were constructed to extract the hidden features from both the textual and the image-represented features. The results demonstrate the effectiveness of the proposed model compared with other models. The overall performance in terms of Matthews Correlation Coefficient (MCC) improved by 4.3%, while the false positive rate was reduced by 1.5%.
Keywords: Matthews Correlation Coefficient (MCC)
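The two evaluation measures quoted in the abstract, MCC and the false positive rate, can both be computed from the four confusion-matrix counts of a binary classifier. The following is a minimal sketch using the standard definitions; the function names and the example counts are hypothetical and are not taken from the study's experiments.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: return 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of benign samples that were wrongly flagged as malicious."""
    negatives = fp + tn
    return fp / negatives if negatives else 0.0


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not results from the paper).
    tp, tn, fp, fn = 90, 85, 15, 10
    print(f"MCC = {mcc(tp, tn, fp, fn):.4f}")
    print(f"FPR = {false_positive_rate(fp, tn):.4f}")
```

Unlike plain accuracy, MCC uses all four counts, so it remains informative when benign and malicious samples are imbalanced, which is presumably why it is reported alongside the false positive rate.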