AI-Based Deepfake Scam Detection for Protection Against Voice and Video Fraud
Snehal Bagal
Assistant Professor
Department of Artificial Intelligence and Data Science
All India Shri Shivaji Memorial Society’s,
Institute of Information Technology
Pune, India
Kanchan Shende
Department of Artificial Intelligence and Data Science
All India Shri Shivaji Memorial Society’s,
Institute of Information Technology
Pune, India
kanchanshende283@gmail.com
Vidya Shendage
Department of Artificial Intelligence and Data Science
All India Shri Shivaji Memorial Society’s,
Institute of Information Technology
Pune, India
shendagevidya7@gmail.com
Abstract—The proliferation of deepfake technology, underpinned by Generative Adversarial Networks (GANs) and diffusion-based synthesis pipelines, has created an escalating cybersecurity crisis. These systems produce synthetic audio-visual content of sufficient fidelity to deceive human observers, and are increasingly leveraged in financial fraud, identity impersonation, and coordinated disinformation. This work delivers a structured technical survey of AI-driven detection methodologies spanning visual, acoustic, and behavioural modalities. Methods are systematically compared with respect to architecture, benchmark performance, and operational limitations. Drawing on insights extracted from this survey, we formulate a conceptual three-stream detection architecture that unifies spatial artifact analysis, spectral audio verification, and temporal behavioural modelling via a Transformer-based cross-modal attention fusion mechanism. Our analysis demonstrates that multi-modal fusion consistently surpasses single-stream detection and exposes critical open problems, including generalisation under distribution shift, robustness to adversarial manipulation, and edge-device deployment constraints.
Index Terms—Deepfake Detection, Artificial Intelligence, Cybersecurity, Voice Fraud, Video Fraud, Generative Adversarial Networks, Multi-Modal Learning, Survey, Machine Learning