Smart Policing with Reinforcement Learning: A Hybrid Framework for Crime Forecasting and Resource Optimization
1st Ipsit Swaroop Bose
dept. of Computer Science Engineering C.V.RAMAN GLOBAL UNIVERSITY
2201020764@cgu-odisha.ac.in
22010120764
2nd Dr. Nikita Naik
dept. of Computer Science Engineering C.V.RAMAN GLOBAL UNIVERSITY
nikita.naik@cgu-odisha.ac.in
3rd Riya Nanda
dept. of Computer Science Engineering C.V.RAMAN GLOBAL UNIVERSITY
2201020831@cgu-odisha.ac.in
22010120831
4th Monotosh Ghosh
dept. of Computer Science Engineering C.V.RAMAN GLOBAL UNIVERSITY
2201020765@cgu-odisha.ac.in
22010120765
5th Souvik Banerjee
dept. of Computer Science Engineering C.V.RAMAN GLOBAL UNIVERSITY
2201020656@cgu-odisha.ac.in
22010120656
6th Dr. Bichitrananda Behera
dept. of Computer Science Engineering C.V.RAMAN GLOBAL UNIVERSITY
Abstract—Ensuring public safety and maintaining effective law enforcement have become increasingly complex due to the dynamic and unpredictable nature of criminal activity. This research proposes a two-tier intelligent framework that integrates Ensemble Learning with Deep Reinforcement Learning (DRL) to enhance both crime hotspot forecasting and adaptive police resource allocation.
The first component introduces an Ensemble-based Crime Prediction Model, which fuses seven reinforcement learning techniques into a unified adaptive ensemble: Deep Q-Network (DQN), Double DQN, DQN with Prioritized Experience Replay (PER-DQN), Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), Proximal Policy Optimization (PPO), and Multi-Agent Reinforcement Learning (MARL). This synergy strengthens predictive reliability by combining the explorative depth of value-based agents with the strategic flexibility of policy-driven methods, enabling accurate detection of evolving crime patterns across districts.
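As a rough illustration of how outputs from several agents might be fused into one adaptive ensemble, the sketch below averages per-district score distributions and re-weights agents after each observation. It is not the fusion rule used in this work, which the abstract does not specify. The softmax normalization, the multiplicative-weights update, the learning rate `lr`, the agent names, and the score values are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw per-district scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(agent_scores, weights):
    """Weighted average of each agent's softmax distribution over districts."""
    names = list(agent_scores)
    n_districts = len(agent_scores[names[0]])
    weight_sum = sum(weights[name] for name in names)
    fused = [0.0] * n_districts
    for name in names:
        probs = softmax(agent_scores[name])
        for i, p in enumerate(probs):
            fused[i] += (weights[name] / weight_sum) * p
    return fused


def update_weights(weights, agent_scores, observed_hotspot, lr=0.5):
    """Assumed multiplicative-weights rule: an agent loses weight in
    proportion to the probability mass it failed to put on the district
    that actually turned out to be the hotspot."""
    updated = {}
    for name, scores in agent_scores.items():
        loss = 1.0 - softmax(scores)[observed_hotspot]
        updated[name] = weights[name] * math.exp(-lr * loss)
    total = sum(updated.values())
    return {name: w / total for name, w in updated.items()}


# Hypothetical scores from three agents for three districts.
agent_scores = {
    "dqn": [2.0, 0.5, 0.1],
    "ppo": [0.2, 1.5, 0.3],
    "sac": [1.8, 0.4, 0.2],
}
weights = {name: 1.0 / len(agent_scores) for name in agent_scores}

fused = fuse(agent_scores, weights)
predicted_hotspot = fused.index(max(fused))

# Suppose district 0 was the true hotspot; agents that backed it gain weight.
weights = update_weights(weights, agent_scores, observed_hotspot=0)
```

In this toy setup, agents that disagree with the observed outcome are down-weighted gradually instead of being discarded, which is one simple way to make an ensemble adaptive.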
The second component presents an Intelligent Police Allocation Model constructed using a hybrid PPO–MARL–Actor-Critic architecture. Leveraging live data such as district-wise crime density and officer availability, the model autonomously determines optimal deployment strategies to minimize criminal incidents and improve operational efficiency.
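To make the allocation problem concrete, the following minimal sketch frames it as a one-step decision: the state holds district-wise crime density and the officer budget, the action is an integer allocation per district, and the reward is the negative count of residual incidents. It does not implement the PPO–MARL–Actor-Critic model; it only shows the kind of environment and non-learning baseline such a policy could be compared against. The class `AllocationEnv`, the `suppression` parameter (an assumed 10% reduction in incidents per deployed officer), and the proportional baseline are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AllocationEnv:
    crime_density: List[float]   # expected incidents per district
    officers_available: int      # total officers that can be deployed
    suppression: float = 0.1     # assumed fractional reduction per officer

    def state(self) -> Tuple[Tuple[float, ...], int]:
        """Observation a learning agent would receive."""
        return tuple(self.crime_density), self.officers_available

    def step(self, allocation: List[int]) -> float:
        """Return the reward (negative residual incidents) for an allocation."""
        if len(allocation) != len(self.crime_density):
            raise ValueError("one allocation entry per district is required")
        if min(allocation) < 0 or sum(allocation) > self.officers_available:
            raise ValueError("allocation must be non-negative and within budget")
        residual = [
            d * (1.0 - self.suppression) ** a
            for d, a in zip(self.crime_density, allocation)
        ]
        return -sum(residual)


def proportional_allocation(crime_density: List[float], officers: int) -> List[int]:
    """Baseline: split officers in proportion to crime density, assigning
    leftover officers to the largest fractional remainders."""
    n = len(crime_density)
    total = sum(crime_density)
    if total == 0:
        quotas = [officers / n] * n
    else:
        quotas = [officers * d / total for d in crime_density]
    allocation = [int(q) for q in quotas]
    leftover = officers - sum(allocation)
    by_remainder = sorted(
        range(n), key=lambda i: quotas[i] - allocation[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        allocation[i] += 1
    return allocation


# Hypothetical three-district example with ten officers.
env = AllocationEnv(crime_density=[12.0, 4.0, 4.0], officers_available=10)
baseline = proportional_allocation(env.crime_density, env.officers_available)
baseline_reward = env.step(baseline)
near_uniform_reward = env.step([4, 3, 3])
```

Under these assumptions, a learned policy would be judged by whether its reward exceeds that of simple baselines such as the proportional split.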
Comprehensive evaluations confirm that the ensemble prediction mechanism achieves superior accuracy in identifying crime-prone zones, while the adaptive allocation model enhances the equitable distribution of law-enforcement resources. Collectively, these integrated models establish a scalable, data-driven framework for proactive crime prevention and strategic policing.
Index Terms—Reinforcement Learning, Crime Prediction, Multi-Agent Systems, Proximal Policy Optimization, Actor-Critic, Police Resource Allocation, Artificial Intelligence.