Lecture Transcription and Content Summarization using Core Speech Recognition and AI-Agents
Dr. S. Vidya Sagar Appaji
Department of Computer Science, Raghu Institute of Technology, Visakhapatnam
Vajrapu Murali
Department of Computer Science, Raghu Institute of Technology, Visakhapatnam
S Hemanth Srinivas
Department of Computer Science, Raghu Institute of Technology, Visakhapatnam
V Manasa Gummudu
Department of Computer Science, Raghu Institute of Technology, Visakhapatnam
Metta Ritesh Kumar
Department of Computer Science, Raghu Institute of Technology, Visakhapatnam
I. INTRODUCTION
In the era of rapid digital transformation, educational institutions are increasingly adopting technology to enhance learning experiences. This research presents an AI-driven lecture transcription and summarization system designed to convert spoken lectures into concise, well-structured PDF summaries, bridging the gap between lengthy lecture content and efficient knowledge retention. The proposed system leverages state-of-the-art speech-to-text models and a multi-layered intelligent agent architecture, encompassing perception, decision-making, and action layers.
The perception layer captures and processes raw audio signals, extracting essential features and refining speech data for accurate transcription. The decision-making layer employs a large language model (LLM) [1] to distill key concepts, generate coherent summaries, and identify relevant references, ensuring contextual integrity and knowledge preservation. The action layer dynamically formats the refined content into a structured, accessible PDF document, ready for seamless distribution.
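The three-layer flow described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: every name in it (`perception_layer`, `decision_layer`, `action_layer`, `run_pipeline`, `fake_transcribe`, `first_sentences`) is hypothetical. The speech-to-text model and the LLM are replaced by simple stand-in functions, and the action layer renders plain text rather than a PDF so that the example needs no third-party libraries.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LectureSummary:
    """Structured result handed from the decision layer to the action layer."""
    title: str
    key_points: List[str]


def perception_layer(audio: bytes, transcribe: Callable[[bytes], str]) -> str:
    """Convert raw audio to transcript text and normalize its whitespace."""
    raw_text = transcribe(audio)
    return re.sub(r"\s+", " ", raw_text).strip()


def decision_layer(transcript: str,
                   summarize: Callable[[str], List[str]],
                   title: str) -> LectureSummary:
    """Distill the transcript into key points (an LLM call in a real system)."""
    return LectureSummary(title=title, key_points=summarize(transcript))


def action_layer(summary: LectureSummary) -> str:
    """Format the summary as a document (plain text here, PDF in the paper)."""
    lines = [summary.title, "=" * len(summary.title)]
    lines += [f"- {point}" for point in summary.key_points]
    return "\n".join(lines)


def run_pipeline(audio: bytes,
                 transcribe: Callable[[bytes], str],
                 summarize: Callable[[str], List[str]],
                 title: str) -> str:
    """Chain perception -> decision -> action."""
    transcript = perception_layer(audio, transcribe)
    summary = decision_layer(transcript, summarize, title)
    return action_layer(summary)


# --- Stand-ins for the real models (assumptions, for illustration only) ---

def fake_transcribe(audio: bytes) -> str:
    """Placeholder for a speech-to-text model; ignores the audio bytes."""
    return ("Today we   cover sorting.  Quicksort partitions the array. \n"
            " Mergesort merges sorted halves.")


def first_sentences(transcript: str, limit: int = 2) -> List[str]:
    """Placeholder for an LLM summarizer: keep the first `limit` sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", transcript)
    return sentences[:limit]


if __name__ == "__main__":
    print(run_pipeline(b"", fake_transcribe, first_sentences, "Sorting"))
```

Passing the transcriber and summarizer in as callables keeps the layers independent, so either stand-in could be swapped for a real model without changing the other layers.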
This approach not only streamlines knowledge acquisition but also reduces the cognitive load associated with reviewing extensive lecture recordings. It further enhances accessibility for students with diverse learning needs, promoting equitable access to educational resources. The system is engineered for real-time processing and iterative learning, continuously improving through feedback loops and model optimization. Our experimental evaluation indicates significant improvements in learning efficiency, comprehension, and content accessibility.