AI Based Multimodal Emotion and Behavior Analysis of Interviewee
Aaditya Jadhav1, Rushikesh Ghodake2, Karthik Muralidharan3, G. Tarun Varma4, Prof. Vijaya Bharathi J.5
1-4Department of Computer Engineering, Pillai College of Engineering, Navi Mumbai, India
5Department of Computer Engineering, Pillai College of Engineering, Navi Mumbai, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - The COVID-19 pandemic has recently increased the popularity of virtual interviews, and globalization and technology have made them a common option for hiring. Virtual interviews, however, pose more difficulties for both interviewers and interviewees than conventional face-to-face interviews. One of the main issues is the difficulty of assessing the interviewee's behavioral traits. To address this problem, we propose a machine learning and deep learning-based method to detect and examine changes in the interviewee's behavior and personality traits. This approach might remove interviewer bias and offer a more objective evaluation of interviewees. We present a computational framework that quantifies the interviewee's communication-related performance and provides performance feedback based on the analysis of multimodal data such as voice and facial expressions. A video recorded during the interview is separated into audio and visual frames, and a speech-to-text API is used to extract text from the audio. The face is detected in the visual frames, and its emotions are assessed. Using machine learning and deep learning techniques, facial expressions are categorized as happy, fearful, sad, neutral, surprised, etc. Similarly, the candidate's speech fluency is evaluated based on audio cues, and the candidate's emotions are inferred from the text. The feedback is based on the facial emotion, the candidate's speaking fluency, and the text's sentiment.
Key Words: Virtual Interview, Visual Frame, Facial Expression, Audio Cue, Speech Fluency, Sentiment.
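As a rough illustration of how the three signals named in the abstract (facial emotion, speaking fluency, and text sentiment) could be gathered into one feedback record, the following Python sketch may help. It is not the authors' implementation. The names InterviewFeedback, summarize_emotions, words_per_minute, and build_feedback are hypothetical, words per minute is used only as a simple stand-in for a fluency measure, and the per-frame emotion labels and the sentiment score are assumed to come from upstream models that are not shown here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class InterviewFeedback:
    """Hypothetical container for the three signals named in the abstract."""
    dominant_emotion: str
    emotion_share: dict
    words_per_minute: float
    sentiment: float


def summarize_emotions(frame_labels):
    """Return the fraction of frames assigned to each emotion label."""
    counts = Counter(frame_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


def words_per_minute(transcript, duration_seconds):
    """Speaking rate of the transcript; an assumed proxy for fluency."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) * 60.0 / duration_seconds


def build_feedback(frame_labels, transcript, duration_seconds, sentiment):
    """Combine per-frame emotions, transcript, and sentiment into one record.

    `sentiment` is assumed to be a score produced by a separate text model.
    """
    shares = summarize_emotions(frame_labels)
    dominant = max(shares, key=shares.get) if shares else "unknown"
    return InterviewFeedback(
        dominant_emotion=dominant,
        emotion_share=shares,
        words_per_minute=words_per_minute(transcript, duration_seconds),
        sentiment=sentiment,
    )


if __name__ == "__main__":
    feedback = build_feedback(
        ["happy", "happy", "neutral", "sad"],
        "I enjoy solving problems with my team",
        30.0,
        0.6,
    )
    print(feedback)
```

Keeping each modality as a separate field, rather than collapsing everything into a single score, leaves the choice of weighting open, since the abstract does not state how the three signals are combined.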