Privacy-Preserving Malware Detection Using Federated Learning and AI
Harshith G M¹, Gopimohan Mukherjee², K. C. Yashwanth³, Vijayalakshmi M. M⁴, B. Vijaya Nirmala⁵
¹ Harshith G M, Dept. of Information Science and Engineering, AMC Engineering College, Karnataka, India
² Gopimohan Mukherjee, Dept. of Information Science and Engineering, AMC Engineering College, Karnataka, India
³ K. C. Yashwanth, Dept. of Information Science and Engineering, AMC Engineering College, Karnataka, India
⁴ Vijayalakshmi M. M, Dept. of Information Science and Engineering, AMC Engineering College, Karnataka, India
⁵ B. Vijaya Nirmala, Dept. of Information Science and Engineering, AMC Engineering College, Karnataka, India
Abstract – Malware threats have increased significantly with the rapid expansion of mobile devices, cloud platforms, and Internet-of-Things (IoT) ecosystems. Conventional malware detection systems typically rely on centralized machine learning models that require collecting large volumes of sensitive behavioral data from end-user devices. Such centralized data aggregation raises serious concerns related to privacy leakage, regulatory compliance, and user trust. To address these challenges, this paper presents a privacy-preserving malware detection framework based on Federated Learning (FL).
In the proposed approach, malware detection models are trained collaboratively across multiple client devices without transferring raw data to a central server. Each client trains a lightweight hybrid CNN–LSTM model locally on its private dataset and shares only encrypted or masked model updates. Secure aggregation and differential privacy mechanisms are incorporated to prevent reconstruction of individual client data and to provide formal privacy guarantees. The framework is evaluated on malware datasets partitioned in a non-IID fashion across multiple clients to simulate real-world deployment conditions. Experimental results demonstrate that the federated model achieves detection performance close to that of centralized training while significantly reducing data exposure. The findings indicate that federated learning offers a practical and scalable solution for privacy-aware malware detection in modern distributed environments.
Keywords: Federated Learning, Malware Detection, Privacy Preservation, Secure Aggregation, Differential Privacy, Edge AI.
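The aggregation step summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names (clip_update, pairwise_masks, secure_fedavg), the parameters max_norm and noise_std, and the use of plain Python lists in place of CNN–LSTM weight tensors are all assumptions made for illustration. The sketch also simulates every client in one process, whereas a deployed secure aggregation protocol would derive the pairwise masks from key agreement between clients, and the Gaussian noise here is not tied to any formal privacy accounting.

```python
import random


def clip_update(update, max_norm):
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = sum(x * x for x in update) ** 0.5
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def pairwise_masks(num_clients, dim, seed):
    """Build one mask per client such that all masks sum to zero.

    Assumption: masks come from a single seeded generator for brevity;
    a real protocol would derive each pair's mask from a shared secret.
    """
    rng = random.Random(seed)
    masks = [[0.0] * dim for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += shared[k]  # client i adds the pair mask
                masks[j][k] -= shared[k]  # client j subtracts the same mask
    return masks


def secure_fedavg(updates, max_norm=1.0, noise_std=0.0, seed=0):
    """Average clipped, masked client updates with optional Gaussian noise.

    Returns the aggregated update and the masked vectors that the server
    would see from each client.
    """
    n, dim = len(updates), len(updates[0])
    clipped = [clip_update(u, max_norm) for u in updates]
    masks = pairwise_masks(n, dim, seed)
    # Each client transmits only its clipped update plus its mask.
    masked = [[c[k] + m[k] for k in range(dim)] for c, m in zip(clipped, masks)]
    # The server sums the masked vectors; the pairwise masks cancel out.
    total = [sum(v[k] for v in masked) for k in range(dim)]
    # Gaussian noise on the sum (illustrative, no privacy budget is tracked).
    noise_rng = random.Random(seed + 1)
    noisy = [t + noise_rng.gauss(0.0, noise_std) for t in total]
    return [x / n for x in noisy], masked


# Three hypothetical clients, each holding a two-parameter update.
client_updates = [[0.3, 0.4], [0.0, 0.5], [0.6, 0.0]]
aggregate, seen_by_server = secure_fedavg(client_updates)
print(aggregate)
```

With noise_std left at zero, the aggregate matches the plain average of the clipped updates up to floating-point error, while each vector in seen_by_server differs from the client's true update. Raising noise_std trades aggregate accuracy for stronger obfuscation of any single client's contribution.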