Review on Techniques of Gesture Navigation Control
Aswin R
Dept. of Computer Science and Engineering, College of Engineering Kidangoor, Kottayam, Kerala, India aswinrjuly2004@gmail.com
Bhavya S Kumar
Dept. of Computer Science and Engineering, College of Engineering Kidangoor, Kottayam, Kerala, India bhavyaskumar21@gmail.com
Boomika S
Dept. of Computer Science and Engineering, College of Engineering Kidangoor, Kottayam, Kerala, India boomika b22118cse b@ce-kgr.org
Thomas Jacob
Dept. of Computer Science and Engineering, College of Engineering Kidangoor, Kottayam, Kerala, India thomasjacobtj2003@gmail.com
Varsha Varghese
Dept. of Computer Science and Engineering, College of Engineering Kidangoor, Kottayam, Kerala, India varsha b22104cse b@ce-kgr.org
Linda Sebastian
Dept. of Computer Science and Engineering, College of Engineering Kidangoor, Kottayam, Kerala, India
lindasebastian@ce-kgr.org
Abstract—Recent advancements in Human–Computer Interaction (HCI) have focused on developing multimodal systems that enable intuitive and contactless communication between users and machines. Traditional input devices such as keyboards and mice restrict accessibility and limit natural interaction, prompting research into gesture- and voice-based interfaces. Numerous studies have explored vision-based gesture recognition using machine learning and computer vision frameworks such as MediaPipe and OpenCV, enabling real-time detection of hand movements for cursor control, clicking, and scrolling actions. Similarly, speech recognition technologies leveraging deep learning have evolved to convert spoken language into digital text with high accuracy, facilitating command execution and dictation. Integrating these two modalities—gesture navigation and voice recognition—enhances system adaptability and usability, particularly for users with physical impairments or in touch-restricted environments. This survey indicates a growing shift toward hybrid interfaces that utilize built-in sensors such as cameras and microphones to achieve efficient, low-cost, and cross-platform operation. This convergence of gesture and speech modalities forms the foundation for developing real-time multimodal HCI systems capable of providing seamless, hands-free interaction without external hardware dependencies.
Keywords—Human–Computer Interaction (HCI), Gesture Recognition, Voice Recognition, Touchless Interaction
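As a minimal illustration of the cursor-control pipeline that the surveyed vision-based systems describe, the sketch below maps a normalized fingertip landmark (frameworks such as MediaPipe report hand landmarks in [0, 1] image coordinates) to screen pixels, and applies exponential smoothing to reduce cursor jitter. The function and class names, and the smoothing factor, are illustrative assumptions, not taken from any specific surveyed system.

```python
def map_to_screen(x_norm, y_norm, screen_w, screen_h):
    """Scale normalized [0, 1] landmark coordinates to screen pixels,
    clamping out-of-range values that can occur near frame edges."""
    x = min(max(x_norm, 0.0), 1.0) * screen_w
    y = min(max(y_norm, 0.0), 1.0) * screen_h
    return x, y


class CursorSmoother:
    """Exponential moving average over successive cursor positions.

    A lower alpha yields a steadier but laggier cursor; a higher alpha
    tracks the hand more tightly but passes through more jitter.
    """

    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.pos = None  # last smoothed (x, y), or None before first update

    def update(self, x, y):
        if self.pos is None:
            self.pos = (x, y)
        else:
            px, py = self.pos
            self.pos = (px + self.alpha * (x - px),
                        py + self.alpha * (y - py))
        return self.pos
```

In a full system, the smoothed coordinates would be fed each frame to an OS-level pointer API, with clicks and scrolls triggered by separately recognized gestures.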