- Create Date 24/05/2025
- Last Updated 24/05/2025
Real-Time Deepfake Image Detection Using Deep Learning and VIT Architecture
Anandharaj R¹, B.E. Student, Department of CSE, Angel College of Engineering and Technology, Tirupur, India
Mrs. P. Premadevi², Assistant Professor, Department of CSE, Angel College of Engineering and Technology, Tirupur, India
Abstract
The rapid advancement of artificial intelligence has led to the emergence of deepfake technology, enabling the creation of highly realistic synthetic images that closely mimic authentic visuals. This technological breakthrough, while impressive, poses significant threats in the form of misinformation, identity theft, political manipulation, and brand infringement. Deepfake image generation tools exploit deep learning algorithms to alter or fabricate images with minimal human oversight, making manual detection increasingly difficult. The need for accurate and automated detection systems has never been more critical in digital forensics and content authentication.
This research presents an enhanced deepfake detection framework leveraging the Vision Transformer (ViT) architecture, a state-of-the-art deep learning model originally designed for image classification. Unlike traditional convolutional neural networks (CNNs), which focus on local spatial patterns, ViT uses self-attention mechanisms to analyze global image features, making it well suited to identifying the nuanced inconsistencies and subtle artifacts typically introduced during deepfake generation.
The proposed system is trained and evaluated on a diverse dataset comprising both real and synthetic images, collected from popular benchmarks such as FaceForensics++ and Celeb-DF. During the preprocessing phase, standard image augmentation techniques are applied to improve robustness. The ViT model is then fine-tuned using transfer learning and optimized with the AdamW optimizer and cross-entropy loss.
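The contrast between local convolutions and global self-attention can be illustrated with a minimal sketch of single-head scaled dot-product attention over patch embeddings. This is not the authors' implementation: the function name `self_attention`, the toy `patches` values, and the use of identity projections (queries, keys, and values all equal to the input tokens) are assumptions made purely for illustration. A real ViT learns separate projection matrices and uses many heads and layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Illustrative simplification: queries, keys, and values are the tokens
    themselves (identity projections). A trained ViT learns separate
    projection matrices for each of them.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights, outputs = [], []
    for query in tokens:
        # Similarity of this patch to every patch in the image, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # New representation: attention-weighted average of all patch vectors.
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        )
    return weights, outputs


# Four hypothetical 2-dimensional patch embeddings (toy values, not real data).
patches = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [1.0, 0.1],
]
weights, outputs = self_attention(patches)
```

Every row of `weights` is a probability distribution over all patches, so each patch's updated representation draws on the whole image in a single layer. A convolution, by comparison, only mixes information inside its kernel window.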
Evaluation metrics such as accuracy, precision, recall, and F1-score confirm the effectiveness of the model, which significantly outperforms conventional CNN-based methods in detecting manipulated content.
In addition to backend detection, a user-friendly web interface has been developed using Flask, enabling users to upload images and receive real-time deepfake analysis, including confidence scores and attention heatmaps for interpretability. The system not only facilitates efficient detection for forensic analysts but also empowers consumers and organizations to validate digital media authenticity.
This study underscores the importance of integrating cutting-edge machine learning models such as Vision Transformers in combating the rising threat of deepfakes. The results highlight the scalability, accuracy, and adaptability of the proposed framework, offering a reliable solution for real-world deployment in digital security applications.
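The four evaluation metrics named in the abstract all follow from the counts of a binary confusion matrix. The sketch below shows one way they might be computed. The helper name `binary_metrics` is hypothetical, and the labels and predictions are made-up toy values rather than results from this study (1 is assumed to mean "fake", 0 "real").

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example: 1 = fake, 0 = real (illustrative values only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

In this toy case one fake image is missed and one real image is flagged, so all four metrics come out at 0.75. In deepfake detection, precision and recall usually diverge, which is why reporting F1 alongside accuracy is useful when the classes are imbalanced.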
Keywords
Deepfake, Vision Transformer, Fake Image Detection, ViT, Image Processing, Python, Machine Learning