LIGHTWEIGHT MULTIMODAL DEEP LEARNING FRAMEWORK FOR REAL-TIME SIGN LANGUAGE RECOGNITION ON EDGE DEVICES
Indhumathi, Assistant Professor
Abishek S, Student
Computer Science and Engineering, Angel College of Engineering and Technology, Tiruppur
aabi24773@gmail.com

Vasanthkumar S, Student
Sivadharasan S, Student
Computer Science and Engineering, Angel College of Engineering and Technology, Tiruppur
vasanthkumarsivasamy17@gmail.com, er.siva2005@gmail.com

Nikshan A, Student
Computer Science and Engineering, Angel College of Engineering and Technology, Tiruppur
rajar780495@gmail.com
ABSTRACT
Sign language recognition systems have advanced considerably with the integration of deep learning techniques. However, existing systems still face major challenges, including high computational complexity, limited real-time performance, and a lack of contextual understanding. Most traditional approaches rely heavily on Convolutional Neural Networks (CNNs) for feature extraction, which are effective for recognizing static gestures but inadequate for dynamic and continuous gestures.
This paper proposes a lightweight multimodal deep learning framework designed to overcome these limitations by enabling real-time sign language recognition on edge devices. The
proposed system integrates gesture recognition with facial expression analysis and contextual text processing, thereby enhancing both accuracy and interpretability. By incorporating Long Short-Term Memory (LSTM) networks and transformer-based architectures, the system effectively captures temporal dependencies in sequential gestures.
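As a concrete illustration of the gesture-sequence pipeline summarized above, the following is a minimal sketch of a lightweight CNN feature extractor feeding an LSTM, written in PyTorch. The MobileNetV3 backbone, layer sizes, and input shapes are assumptions for illustration only, not the authors' exact architecture.

```python
# Hedged sketch (not the paper's implementation): a compact CNN + LSTM
# classifier for short gesture clips, of the kind the abstract describes.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_small

class GestureSequenceNet(nn.Module):
    def __init__(self, num_classes: int, hidden: int = 256):
        super().__init__()
        backbone = mobilenet_v3_small(weights=None)   # lightweight CNN suited to edge devices
        self.cnn = backbone.features                  # per-frame feature extractor (576 channels out)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.lstm = nn.LSTM(input_size=576, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, H, W) video clip of a signed gesture
        b, t, c, h, w = frames.shape
        feats = self.pool(self.cnn(frames.reshape(b * t, c, h, w))).flatten(1)
        feats = feats.reshape(b, t, -1)               # back to (batch, time, features)
        _, (h_n, _) = self.lstm(feats)                # temporal modelling of the gesture sequence
        return self.head(h_n[-1])                     # class logits per clip
```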
Furthermore, optimization techniques are employed to reduce computational overhead, making the system suitable for deployment on mobile and embedded platforms. The use of multimodal datasets improves robustness and adaptability across different sign languages. Experimental analysis indicates that the proposed approach achieves improved accuracy, reduced latency, and better contextual understanding
compared to conventional methods. This work contributes to the development of scalable and intelligent assistive communication systems for the hearing-impaired community.
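The optimization for mobile and embedded deployment mentioned above can take many forms; the paper's abstract does not specify the technique, so the snippet below only sketches one common option, post-training dynamic quantization of the recurrent and dense layers of the GestureSequenceNet sketch above in PyTorch. The class count and clip shape are hypothetical.

```python
# Hedged sketch: dynamic quantization is one common way to cut model size and
# latency on edge hardware; it is illustrative only, not necessarily the
# optimization used in the proposed framework.
import torch

model = GestureSequenceNet(num_classes=50)        # hypothetical class count
model.eval()

quantized = torch.ao.quantization.quantize_dynamic(
    model,
    {torch.nn.LSTM, torch.nn.Linear},             # convert LSTM and Linear weights to int8
    dtype=torch.qint8,
)

# The quantized model keeps the same forward() interface:
clip = torch.randn(1, 16, 3, 224, 224)            # 16-frame clip, assumed input resolution
logits = quantized(clip)
```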
Keywords: Sign Language Recognition, CNN, LSTM, Multimodal Learning, Edge Computing.