Cyberbullying Detection: Identifying Hate Speech Using Machine Learning In Real Time Chat Application
ALLU ANEESHA*1, KOTA HARSHITHA2, CHITTIBOMMALA SURYA3, GATTI HEMA SRIDEVI4, PITTU MADHU BRAHMA REDDY5
1Assistant Professor, Department of CSE(CB), Bapatla Engineering College, Bapatla 522101, AP, India
2Student, Department of CSE(CB), Bapatla Engineering College, Bapatla 522101, AP, India
3Student, Department of CSE(CB), Bapatla Engineering College, Bapatla 522101, AP, India
4Student, Department of CSE(CB), Bapatla Engineering College, Bapatla 522101, AP, India
5Student, Department of CSE(CB), Bapatla Engineering College, Bapatla 522101, AP, India
Abstract — Social media and online communication are now a dominant part of everyday life. Their growth, however, has been accompanied by a rise in cyberbullying and hate speech, forms of online abuse that harm the psychological and social well-being of victims, particularly those in vulnerable populations. This paper proposes an approach for automatically detecting cyberbullying and hate speech using machine learning and natural language processing (NLP). The proposed system uses NLP techniques to extract relevant features from text, which are then used to train and test several classification algorithms, including logistic regression, support vector machines (SVMs), and deep learning models such as long short-term memory (LSTM) networks or BERT. A labeled dataset of tweets and blog comments will be used to build the models, which will be evaluated on accuracy, precision, recall, and F1 score. The work aims to provide a scalable and reliable means of identifying abusive online content, supporting a safer and more respectful online culture. The resulting models could also be incorporated into social media platforms for real-time monitoring and content moderation.
Key Words — Cyberbullying Detection, Hate Speech, Machine Learning, NLP, TF-IDF, SVM, LSTM, BERT, Text Classification, Content Moderation.
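The four evaluation measures named in the abstract can be computed directly from a model's predictions. The following is a minimal sketch in plain Python, not the implementation used in this work. It assumes a binary labeling in which 1 marks abusive text and 0 marks benign text; the function name `binary_metrics` and the sample labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # abusive, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # abusive, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no positive predictions
    # or no positive examples), a common convention rather than a standard.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for eight messages: 1 = abusive, 0 = benign.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

In abuse detection the positive class is usually the minority, so accuracy alone can look high for a model that flags very little. Reporting precision, recall, and F1 alongside it makes that failure visible.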