- Create Date 20/03/2026
- Last Updated 21/03/2026
Violence Detection System Model Using CCTV Cameras on Public
Ms. T. Kowsalya
Department of Computer Science Engineering,
Jai Shriram Engineering College, Tirupur, India
Kowsalyathangavel10@gmail.com
Duraimurugan S, Department of Artificial Intelligence and Data Science,
Jai Shriram Engineering College, Tirupur, India, sanjaysekar43@gmail.com
Karthikeyan M, Department of Artificial Intelligence and Data Science,
Jai Shriram Engineering College, Tirupur, India, karthikeyanmanivel01@gmail.com
Mukesh Kumar U, Department of Artificial Intelligence and Data Science,
Jai Shriram Engineering College, Tirupur, India, mukeshsagar4456@gmail.com
Nathish Kumar R, Department of Artificial Intelligence and Data Science,
Jai Shriram Engineering College, Tirupur, India, nathishnathish690@gmail.com
1. Abstract
Public safety in densely populated urban environments demands intelligent and automated surveillance systems capable of real-time threat detection. The rapid proliferation of CCTV infrastructure across transportation hubs, shopping centers, schools, and public gathering spaces has not adequately addressed the fundamental challenge of timely and accurate violence detection. Traditional monitoring paradigms rely heavily on human operators, a process inherently vulnerable to fatigue, distraction, and inconsistent response latency. These limitations create critical gaps in public safety coverage, particularly during high-risk periods when continuous vigilance is most essential.
This paper presents the design and development of a Violence Detection System using CCTV Cameras in Public Spaces, a comprehensive automated framework that uses deep learning to identify aggressive human behavior in real-time video streams. The proposed system integrates Convolutional Neural Networks (CNNs) for spatial feature extraction with Long Short-Term Memory (LSTM) and ConvLSTM networks for temporal sequence modeling, enabling robust characterization of violent actions across consecutive video frames. An attention mechanism, specifically the Convolutional Block Attention Module (CBAM), is incorporated to focus model capacity on motion-relevant regions while suppressing irrelevant background content.
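For orientation, the attention step can be written in the standard CBAM formulation, which refines an intermediate feature map in two stages: channel attention followed by spatial attention. This is a sketch of the generic module only. The abstract does not state how CBAM is configured in the proposed system, so the symbols below ($F$ for a CNN feature map, $M_c$ and $M_s$ for the channel and spatial attention maps, a shared MLP, and a $7 \times 7$ convolution $f^{7\times 7}$) follow the usual published notation and are assumptions with respect to this paper.

```latex
\begin{align}
  M_c(F)  &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) \\
  F'      &= M_c(F) \otimes F \\
  M_s(F') &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\, \mathrm{MaxPool}(F')])\big) \\
  F''     &= M_s(F') \otimes F'
\end{align}
```

Here $\sigma$ is the sigmoid function and $\otimes$ is element-wise multiplication with broadcasting. In the channel stage the pooling runs over the spatial dimensions, and in the spatial stage it runs over the channel dimension. The refined map $F''$ is what would be passed on to the temporal (LSTM or ConvLSTM) stage in a pipeline of the kind described above.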
To further enhance classification reliability, a Hybrid Ensemble Model combining predictions from multiple deep learning architectures is proposed. A Source Credibility Scoring Mechanism evaluates and weights individual camera feeds based on estimated signal quality, while a Blocklisting Framework manages persistently unreliable sources. Evaluation is conducted on publicly available benchmark surveillance datasets using Accuracy, Precision, Recall, F1-Score, Area Under the ROC Curve (AUC), and Equal Error Rate (EER). Experimental results demonstrate that the proposed system achieves superior performance compared to existing methods, with an overall accuracy of 96.4% and AUC of 0.974.
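The abstract does not give the formulas behind credibility weighting, blocklisting, or the reported metrics, so the following is only a minimal sketch under stated assumptions: fusion is taken to be a credibility-weighted average of per-source violence probabilities, blocklisting is taken to be a fixed credibility cutoff, and the Equal Error Rate is approximated by sweeping the observed scores as thresholds. The function names, the `blocklist_threshold` parameter, and the 0.5 decision threshold are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def fuse_scores(scores: Dict[str, float],
                credibility: Dict[str, float],
                blocklist_threshold: float = 0.2) -> Tuple[float, List[str]]:
    """Credibility-weighted average of per-source violence probabilities.

    Hypothetical rule: a source whose credibility is below
    blocklist_threshold is excluded from the average and reported as blocked.
    """
    blocked = sorted(s for s in scores
                     if credibility.get(s, 0.0) < blocklist_threshold)
    active = [s for s in scores if s not in blocked]
    total = sum(credibility.get(s, 0.0) for s in active)
    if total == 0:
        raise ValueError("no credible sources available")
    fused = sum(credibility.get(s, 0.0) * scores[s] for s in active) / total
    return fused, blocked


def classification_metrics(labels: Sequence[int], scores: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 at a fixed decision threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def equal_error_rate(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Approximate EER: the error rate at the observed threshold where the
    false-positive and false-negative rates are closest.

    Requires at least one positive and one negative label.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(scores)):
        fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        fnr, fpr = fn / pos, fp / neg
        if abs(fpr - fnr) < best_gap:
            best_gap, eer = abs(fpr - fnr), (fpr + fnr) / 2
    return eer
```

A caller would pass one probability per camera feed or ensemble member to `fuse_scores`, then evaluate the fused scores against ground-truth labels with the two metric functions. A learned weighting or a time-based blocklist would be equally compatible with the description in the abstract; the fixed cutoff is used here only because it is the simplest rule to state.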
Keywords: Violence Detection, CCTV Surveillance, Deep Learning, Convolutional Neural Network (CNN), LSTM, ConvLSTM, Spatio-Temporal Feature Extraction, Attention Mechanism, Public Safety, Smart City.






