Decoding Human Emotions Through Multimodal Analysis Using Deep Learning
Shanthini C1, Sukesh N D2, Arvinth U S3, Satheesh N P4
1,2,3Student, Department of Artificial Intelligence and Data Science,
Bannari Amman Institute of Technology, Sathyamangalam
4Assistant Professor, Department of Artificial Intelligence and Data Science,
Bannari Amman Institute of Technology, Sathyamangalam
---------------------------------------------------------------------***---------------------------------------------------------------------
ABSTRACT
Emotions play a central role in human interaction, yet existing emotion recognition technology often fails to capture them with sufficient nuance and accuracy. This study addresses these shortcomings by developing a more accurate multimodal emotion recognition system. The proposed approach combines Convolutional Neural Networks (CNNs), attention layers, and Bi-LSTM networks. Audio is first extracted from the video data, and features from audio, text, and video are then fused in pairwise and joint combinations; a SoftMax classifier maps the fused features to emotion categories. The results indicate that the fusion of audio, text, and video data, together with the CNN-LSTM and attention-layer networks, is effective in addressing the challenges of multimodal emotion recognition, yielding a more comprehensive understanding of emotional expressions.
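The fusion-then-SoftMax step described above can be sketched in miniature. This is an illustrative sketch only: the feature dimensions, the concatenation-based late fusion, and the single linear classification layer are assumptions standing in for the paper's CNN, Bi-LSTM, and attention-based encoders, whose outputs are replaced here by random feature vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_EMOTIONS = 6  # assumed label set size, e.g. anger, fear, joy, sadness, surprise, neutral

def softmax(z):
    """Numerically stable SoftMax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse(audio_feat, text_feat, video_feat):
    """Late fusion by concatenating per-modality feature vectors."""
    return np.concatenate([audio_feat, text_feat, video_feat])

# Stand-ins for features produced by the modality-specific encoders
# (CNN for video frames, Bi-LSTM/attention for audio and text).
audio_feat = rng.standard_normal(32)
text_feat = rng.standard_normal(32)
video_feat = rng.standard_normal(32)

fused = fuse(audio_feat, text_feat, video_feat)  # shape (96,)

# A single linear layer + SoftMax acting as the emotion classifier.
W = rng.standard_normal((fused.shape[0], NUM_EMOTIONS)) * 0.1
b = np.zeros(NUM_EMOTIONS)
probs = softmax(fused @ W + b)  # one probability per emotion class
```

The output `probs` is a valid probability distribution over the emotion classes, so the predicted emotion is simply its argmax.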