A Deep Learning Framework for Inappropriate Content Detection on YouTube
1Dr. G. Rajesh
Assistant Professor, Dept. Computer Science and Engineering, Vignan’s Institute of Management and Technology for Women
email: rajgundla@gmail.com
2Kandukuri Harshitha
UG Student, Dept. Computer Science and Engineering
Vignan’s Institute of Management and Technology for Women, Hyd. email: harshitha9146@gmail.com
4Poola Sreeja
UG Student, Dept. Computer Science and Engineering
Vignan’s Institute of Management and Technology for Women, Hyd. email: sreejapoola21@gmail.com
3Maroju Sreeya Bhavani
UG Student, Dept. Computer Science and Engineering
Vignan’s Institute of Management and Technology for Women, Hyd. email: msreeyabhavani@gmail.com
Abstract— The rapid growth of video content on YouTube has attracted billions of viewers, most of whom are young. Some malicious uploaders exploit the platform to share disturbing visual content, for example by using animated cartoon videos to expose children to inappropriate material. An automatic, real-time video content filtering mechanism that can be integrated into social media platforms is therefore strongly recommended. The proposed system uses EfficientNet-B7, a convolutional neural network (CNN) pre-trained on ImageNet, to extract video features. These features are then passed to a bidirectional long short-term memory (BiLSTM) network, which learns effective video representations and performs multi-class video classification. The models were evaluated on a specially labeled dataset of 111,156 cartoon clips taken from YouTube videos. EfficientNet-BiLSTM (accuracy 95.66%) outperformed the attention mechanism-based EfficientNet-BiLSTM (accuracy 95.30%), and traditional machine learning classifiers performed worse than the deep learning classifiers. Overall, the EfficientNet-BiLSTM model with 128 hidden units achieved the best performance (F1 score 0.9267). A comparison with other leading approaches further showed that a BiLSTM on top of a CNN captures contextual information from video features more effectively, which improves the detection and classification of video content that is unsuitable for children.
Keywords: BiLSTM, Child safety, EfficientNet-B7, Inappropriate content detection, Multi-class classification, Video content filtering
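To make the data flow summarized in the abstract concrete, the following is a minimal, illustrative sketch of a BiLSTM forward pass over per-frame feature vectors, written in plain Python so that it runs without external dependencies. It is not the authors' implementation: the random vectors only stand in for EfficientNet-B7 frame features, the weights are untrained, and the toy sizes (FEATURE_DIM, HIDDEN, NUM_CLASSES, NUM_FRAMES) and all function names are hypothetical choices made for illustration. Concatenating the final hidden state of each direction is one common way to form a clip representation and is an assumption here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_dim, hidden, rng):
    """Random (untrained) weights: one matrix and bias per gate i, f, o, g."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden)]
                for _ in range(hidden)]
    return {gate: (mat(), [0.0] * hidden) for gate in "ifog"}


def lstm_last_state(frames, params, hidden):
    """Run a standard LSTM cell over the frame sequence; return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in frames:
        z = x + h  # concatenate current frame features with previous hidden state
        pre = {gate: [sum(w * v for w, v in zip(row, z)) + b
                      for row, b in zip(*params[gate])]
               for gate in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]    # input gate
        f = [sigmoid(v) for v in pre["f"]]    # forget gate
        o = [sigmoid(v) for v in pre["o"]]    # output gate
        g = [math.tanh(v) for v in pre["g"]]  # candidate cell state
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_clip(frames, fwd, bwd, head, hidden):
    """BiLSTM: one pass in frame order, one in reverse order, then a linear softmax head."""
    h_fwd = lstm_last_state(frames, fwd, hidden)
    h_bwd = lstm_last_state(frames[::-1], bwd, hidden)
    clip_vec = h_fwd + h_bwd  # concatenated clip representation (2 * hidden values)
    logits = [sum(w * v for w, v in zip(row, clip_vec)) + b
              for row, b in zip(*head)]
    return softmax(logits)


# Toy sizes, chosen only so that the example runs quickly (all hypothetical).
FEATURE_DIM, HIDDEN, NUM_CLASSES, NUM_FRAMES = 16, 8, 4, 5

rng = random.Random(0)
# Placeholder for per-frame CNN features of one clip.
frames = [[rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
          for _ in range(NUM_FRAMES)]
fwd = init_lstm(FEATURE_DIM, HIDDEN, rng)
bwd = init_lstm(FEATURE_DIM, HIDDEN, rng)
head = ([[rng.uniform(-0.1, 0.1) for _ in range(2 * HIDDEN)]
         for _ in range(NUM_CLASSES)],
        [0.0] * NUM_CLASSES)

probs = classify_clip(frames, fwd, bwd, head, HIDDEN)
print(probs)  # one probability per class; the values are arbitrary because nothing is trained
```

In a practical system the placeholder frame vectors would be replaced by features from the pre-trained CNN, the hidden size would be set to the configuration under study (the abstract reports 128 hidden units as the best-performing setting), and the weights would be learned with a deep learning framework rather than drawn at random.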