A Vision-Based Computational Recognition Using MediaPipe Hand Landmarks and CNN
Ms. Saraswathi 1, Naveen M 2, Dhanush Kumar M 3, Jeyaseelapandi S 4
1 Assistant Professor, CST, SNS College of Engineering, Coimbatore – 641107. Email: saraswathi.r.cst@snsce.ac.in
2 Final Year, CST, SNS College of Engineering, Coimbatore – 641107. Email: naveenm3109@gmail.com
3 Final Year, CST, SNS College of Engineering, Coimbatore – 641107. Email: mathavandhanush341@gmail.com
4 Final Year, CST, SNS College of Engineering, Coimbatore – 641107. Email: jeyaseelapandi9626@gmail.com
ABSTRACT:
The “Vision-Based Computational Recognition Using MediaPipe Hand Landmarks and CNN” system is designed to bridge the communication gap between hearing-impaired and non-signing individuals through real-time sign language recognition. The proposed framework employs MediaPipe to extract hand landmarks consistently across varying backgrounds and lighting conditions, and a Convolutional Neural Network (CNN) to classify the gestures into their respective outputs. The system translates American Sign Language (ASL) gestures into text and speech, providing a seamless and accessible mode of interaction. By combining real-time hand tracking, feature extraction, and deep learning-based recognition, the framework achieves high accuracy and low latency even under varying lighting and background conditions. Users benefit from real-time recognition, reduced dependency on interpreters, and improved accessibility, which supports inclusive human–computer interaction and enables practical applications in education, healthcare, and assistive technologies. The application was developed using the Web Engineering method, with stages of communication, planning, modeling, and deployment, implemented in the Python programming language with the MediaPipe and TensorFlow frameworks, and tested using Black Box Testing.
Keywords – MediaPipe, Convolutional Neural Network, Sign Language Recognition, Deep Learning, Computer Vision, Accessibility, Human–Computer Interaction.
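To make the described pipeline concrete, the following is a minimal sketch of the landmark-extraction and classification loop in Python with MediaPipe and TensorFlow, the stack named in the abstract. The Conv1D architecture, the single-hand setup, and the 26-class A–Z output mapping are illustrative assumptions, not the paper's reported configuration; trained weights and the text-to-speech stage are omitted.

```python
# Sketch: webcam frames -> MediaPipe hand landmarks -> CNN gesture class.
# The model below is a hypothetical stand-in; the paper's exact
# architecture and trained weights are not reproduced here.
import cv2
import mediapipe as mp
import numpy as np
import tensorflow as tf

# Assumed classifier: Conv1D over the 21 MediaPipe landmarks (x, y, z each).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(21, 3)),
    tf.keras.layers.Conv1D(32, 3, activation="relu"),
    tf.keras.layers.Conv1D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(26, activation="softmax"),  # assumed A-Z classes
])

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV captures frames in BGR.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        # 21 landmarks x 3 coordinates form the CNN's input feature array.
        x = np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32)
        probs = model.predict(x[None, ...], verbose=0)  # trained weights assumed
        print(chr(ord("A") + int(probs.argmax())))
    cv2.imshow("Sign Recognition", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```

In a full system, the printed letter would instead be buffered into words and passed to a speech-synthesis stage to produce the text and speech output described above.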