Vision Script: An Intelligent System for Image-Based Text Processing and Visualization
Nilay Vartak
Risshiraj Pednekar
Preet Sontakke
Department of Computer Engineering
Atharva College of Engineering
Mumbai, India
vartaknilay-cmpn@atharvacoe.ac.in
pednekarrisshiraj-cmpn@atharvacoe.ac.in
preetsontakke-cmpn@atharvacoe.ac.in
Yogita Shelar
Department of Computer Engineering
Atharva College of Engineering
Mumbai, India
yogitashelar@atharvacoe.ac.in
Abstract — Efficiently processing and understanding text embedded in images is crucial for applications such as document analysis, research, and knowledge management. This work presents an integrated system for text extraction, summarization, and hierarchical visualization that streamlines information retrieval and presentation. Building on recent advances in Optical Character Recognition (OCR) and Natural Language Processing (NLP), the system automates the pipeline end to end: OCR techniques extract text from images, NLP models generate concise summaries, and a hierarchical visualization framework structures and presents the key insights. The visualization module uses tree-based structures to represent relationships between extracted concepts, making complex information easier to interpret. Graph visualization tools such as D3.js and Graphviz enhance the user experience by allowing dynamic exploration of the summarized content. The approach lets users browse large volumes of material efficiently, and the structured, interactive presentation improves readability and supports decision-making.
Keywords— Deep learning, Image processing, Optical character recognition, Pytesseract, TensorFlow, Python.
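The three stages named in the abstract (OCR extraction, summarization, tree-based visualization) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the stopword list, and the tree layout (root, then summary sentences, then keywords) are assumptions made for illustration. A simple word-frequency extractive summarizer stands in for the NLP models, which the abstract does not specify. The OCR step assumes the pytesseract and Pillow packages and an installed Tesseract binary; the remaining steps use only the standard library and emit Graphviz DOT text.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for",
             "from", "it", "by", "as", "on", "with", "that", "this", "are"}


def extract_text(image_path):
    """OCR step. Needs pytesseract, Pillow, and the Tesseract binary."""
    from PIL import Image
    import pytesseract
    return pytesseract.image_to_string(Image.open(image_path))


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Frequency-based extractive summary (a stand-in for an NLP model)."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -sum(freq[w] for w in tokenize(sentences[i])),
    )
    keep = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in keep]


def build_tree(title, summary_sentences, keywords_per_node=3):
    """Hypothetical hierarchy: root -> summary sentence -> top keywords."""
    tree = {"label": title, "children": []}
    for sentence in summary_sentences:
        top = Counter(tokenize(sentence)).most_common(keywords_per_node)
        leaves = [{"label": word, "children": []} for word, _ in top]
        tree["children"].append({"label": sentence, "children": leaves})
    return tree


def to_dot(tree):
    """Serialize the tree as Graphviz DOT text."""
    lines = ["digraph summary {"]
    counter = 0

    def visit(node, parent_id):
        nonlocal counter
        node_id = f"n{counter}"
        counter += 1
        label = node["label"].replace('"', r'\"')
        lines.append(f'  {node_id} [label="{label}"];')
        if parent_id is not None:
            lines.append(f"  {parent_id} -> {node_id};")
        for child in node["children"]:
            visit(child, node_id)

    visit(tree, None)
    lines.append("}")
    return "\n".join(lines)
```

With an image on disk, the stages would chain as `to_dot(build_tree("Document", summarize(extract_text("page.png"))))`, where `page.png` is a placeholder path. The resulting DOT text can be rendered with the Graphviz `dot` command, and the same nested dictionary could be serialized to JSON for a D3.js tree layout.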