Vision Assist: AI-Powered Real-Time Image Captioning for the Visually Impaired






Create Date 16/05/2025
Last Updated 16/05/2025



Mrs. N. Sree Divya1, Avusula Bhavana2, Vanathadupula Ushasri3

1Assistant Professor, Mahatma Gandhi Institute of Technology

2,3UG Student, Mahatma Gandhi Institute of Technology

Abstract: Recent advances in image captioning have significantly improved the lives of people with visual impairments and promoted social inclusivity. By combining computer vision with natural language processing, images become more accessible and understandable through textual descriptions. Notable progress has been made in developing photo captioning systems specifically for visually impaired users. However, challenges remain, such as ensuring the accuracy of automated captions and handling images that contain multiple objects or scenes. This study introduces an architecture for real-time image captioning based on a VGG16-LSTM deep learning model, supported by computer vision. The system is implemented on a Raspberry Pi 4B single-board computer with GPU capabilities. It automatically generates captions for images captured in real time with a NoIR camera module, making it a convenient and portable solution for visually impaired individuals. The model was trained, validated, and tested on multiple datasets: Flickr8k, Flickr30k, VizWiz captioning, and a custom dataset. Its performance was assessed through extensive tests involving both sighted and visually impaired participants in various environments. The results indicate that the proposed system functions effectively, producing accurate and contextually relevant captions in real time. User feedback indicates a notable improvement in understanding visual content, which aids the mobility of visually impaired individuals and their interaction with their surroundings.

Keywords: image captioning technology, visual impairments, social inclusivity, computer vision, natural language processing (NLP), textual descriptions, photo captioning, accuracy, real-time image captioning, VGG16-LSTM deep learning model, portable solutions, automatic generation, extensive testing, contextually relevant captions
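
The abstract names a VGG16-LSTM model but does not say how a caption is decoded from it. The following is a minimal sketch of one common approach, greedy word-by-word decoding, and is not taken from the paper. The function names, the "startseq"/"endseq" markers, the 4096-value feature vector, and the lookup table standing in for the trained LSTM decoder are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Assumed start/end markers; the paper does not state which tokens it uses.
START, END = "startseq", "endseq"


def greedy_caption(
    image_features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption one word at a time, always taking the top-scoring word."""
    words = [START]
    for _ in range(max_len):
        # In a real system this call would run the LSTM decoder on the
        # image features (from the CNN encoder) and the words so far.
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start marker


def toy_scores(image_features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained decoder: a fixed lookup table
    keyed on the previous word. It ignores the image features entirely."""
    table = {
        "startseq": {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.7, "endseq": 0.3},
        "dog": {"runs": 0.6, "endseq": 0.4},
        "runs": {"endseq": 1.0},
    }
    return table[words[-1]]


# Placeholder feature vector. VGG16's fc2 layer yields 4096 values, but the
# paper does not say which layer feeds the decoder.
features = [0.0] * 4096
print(greedy_caption(features, toy_scores))  # prints "a dog runs"
```

Greedy decoding is cheap, which suits a device like the Raspberry Pi, but it can miss better captions that a beam search would find. Which strategy the authors used is not stated.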
