A Comprehensive Review on Hate Speech Detection Using BERT and Transformer-Based Architectures
Aishwarya Roy 1, Prof. Sarwesh Site 2
1 M.Tech Student, Department of Computer Science and Engineering, All Saints College of Technology, Bhopal, India
Affiliated to Rajiv Gandhi Proudyogiki Vishwavidyalaya (RGPV); aishwarya.roy2811@gmail.com
2 Associate Professor, Department of Computer Science and Engineering, All Saints College of Technology, Bhopal, India
Affiliated to Rajiv Gandhi Proudyogiki Vishwavidyalaya (RGPV); er.sarwesh@gmail.com
Abstract - Hate speech detection has become a pressing challenge in the era of digital communication, as the rapid proliferation of offensive, abusive, and discriminatory content on social media platforms poses serious threats to individual well-being and societal harmony. Identifying such content is inherently complex due to linguistic ambiguity, sarcasm, implicit hate, cultural context, and multilingual variation, which often lead to misclassification by conventional systems. Earlier approaches based on machine learning with hand-crafted features and statistical models, or on deep learning techniques such as CNNs and LSTMs with static embeddings, have achieved only limited success in handling these challenges. The emergence of Bidirectional Encoder Representations from Transformers (BERT) and its numerous variants has marked a paradigm shift in hate speech detection by providing deep contextualized word embeddings, bidirectional sequence modeling, and the ability to transfer knowledge across domains and languages. This review presents a comprehensive examination of BERT-based methods for hate speech and offensive language detection, analyzing their architectures, fine-tuning strategies, and adaptations such as RoBERTa, DistilBERT, ALBERT, XLM-R, and domain-specific models like HateBERT. A detailed discussion of benchmark datasets, evaluation metrics, and comparative performance across languages and platforms is provided, offering insights into the strengths and weaknesses of these models relative to traditional baselines. Moreover, the review identifies persistent challenges such as class imbalance, annotation subjectivity, dataset bias, low-resource languages, and the urgent need for explainability and fairness in automated moderation systems. Finally, it highlights emerging research directions, including multimodal hate speech detection (text, images, and video), cross-lingual and code-switched analysis, integration of large language models (LLMs) for contextual re-ranking, and bias mitigation strategies to ensure equitable performance. By consolidating recent advancements and open challenges, this study aims to serve as a foundational reference for researchers, practitioners, and policymakers working toward robust, fair, and scalable hate speech detection systems powered by BERT and transformer-based architectures.
Keywords: Hate Speech Detection, BERT, Transformers, Offensive Language, Deep Learning, NLP, Multilingual Detection, Fairness, Explainable AI
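To make the fine-tuning workflow surveyed in this review concrete, the sketch below shows how a BERT-style encoder can be adapted for binary hate speech classification with the Hugging Face Transformers library. It is a minimal illustration, not a method from any specific cited work: the checkpoint name, toy data, and hyperparameters are illustrative assumptions, and in practice one would substitute a benchmark corpus and a variant such as RoBERTa, HateBERT, or XLM-R.

```python
# Minimal sketch: fine-tuning a BERT-style encoder for binary hate speech
# classification. Model name, data, and hyperparameters are assumptions for
# illustration only, not settings reported in this review.
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-uncased"  # could be swapped for RoBERTa, HateBERT, XLM-R, etc.

class HateSpeechDataset(Dataset):
    """Wraps raw texts and 0/1 labels (0 = non-hateful, 1 = hateful) as tensors."""
    def __init__(self, texts, labels, tokenizer, max_length=128):
        self.encodings = tokenizer(texts, truncation=True, padding="max_length",
                                   max_length=max_length)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

# Toy placeholder data; a real study would load a benchmark hate speech corpus.
train_texts = ["example benign post", "example abusive post"]
train_labels = [0, 1]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

train_dataset = HateSpeechDataset(train_texts, train_labels, tokenizer)

training_args = TrainingArguments(
    output_dir="bert-hate-speech",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,   # within the fine-tuning range commonly used for BERT
    weight_decay=0.01,
)

trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()
```

The same pattern extends to the multilingual and domain-adapted settings discussed later in the review simply by changing the pretrained checkpoint and the training corpus; the classification head and fine-tuning loop remain unchanged.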