Neurolens: A Multimodal Real-Time Stress Detection System using Computer Vision and Speech Emotion Recognition
Shivam Pal1, Aryan Rajbhar2, Pritesh Patra3, Yuvraj Rathod4, Chaitali Mhatre5
1,2,3,4 Student, Department of Computer Engineering, Universal College of Engineering, Kaman, Maharashtra, India
5 Assistant Professor, Department of Computer Engineering, Universal College of Engineering, Kaman, Maharashtra, India
Abstract - Chronic psychological stress impairs cognitive performance, academic outcomes, and long-term well-being, yet most automated detection systems rely on a single sensing modality, limiting their robustness under real-world conditions. Unimodal approaches—whether vision-based, physiological, or acoustic—are individually vulnerable to noise, occlusion, and signal artifacts, motivating integrated multimodal frameworks. This paper presents Neurolens, a real-time multimodal stress detection system that concurrently processes facial video through a fine-tuned You Only Look Once version 8 (YOLOv8) model trained on a publicly available facial emotion dataset; wearable physiological signals—including electrodermal activity (EDA), blood volume pulse (BVP), and skin temperature—through a hybrid convolutional neural network–long short-term memory (CNN-LSTM) architecture trained on the WESAD wearable stress and affect detection dataset; and speech audio through a Wav2Vec2 transformer-based speech encoder. A weighted late-fusion module integrates the per-modality stress scores into a unified Stress Index rendered on an interactive real-time dashboard with adaptive push notifications and ambient brightness control. System demonstrations confirm correct identification of stress-indicative facial states such as anger and of elevated physiological arousal from CSV-uploaded sensor data, alongside neutral baseline detection with appropriately reduced Stress Index values. These results establish Neurolens as a scalable, non-invasive, and reproducible framework for continuous passive stress monitoring in academic, clinical, and professional environments.
Keywords—multimodal fusion; Wav2Vec2; CNN-LSTM; facial emotion recognition; speech emotion recognition; wearable sensors.
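The weighted late-fusion step described in the abstract can be illustrated with a minimal sketch. The specific weights and the hypothetical `stress_index` helper below are illustrative assumptions, not the tuned values used by Neurolens.

```python
def stress_index(face_score, physio_score, speech_score,
                 weights=(0.4, 0.35, 0.25)):
    """Fuse per-modality stress scores (each assumed in [0, 1]) into a
    single Stress Index via a weighted average. Weights are illustrative,
    not the paper's calibrated values."""
    scores = (face_score, physio_score, speech_score)
    total = sum(w * s for w, s in zip(weights, scores))
    # Normalize by the weight sum so the index stays in [0, 1]
    # even if the weights do not sum exactly to 1.
    return total / sum(weights)

# Example: elevated facial and physiological arousal, calm speech.
print(round(stress_index(0.8, 0.7, 0.3), 3))  # → 0.64
```

A late-fusion design of this kind keeps each modality's model independent, so a noisy or missing stream can simply be dropped or down-weighted at fusion time.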