Domain Specific Chatbot Using RAG and Fine Tuning
1st Mudumba Sri Akshit Sri Vastav
Department of Information Technology, Vardhaman College of Engineering (Autonomous),
Hyderabad-Telangana, 501286, India mudumbasriakshitsrivastav22it@student.vardhaman.org
3rd Vabhanagiri Ujwal
Department of Information Technology,
Vardhaman College of Engineering (Autonomous), Hyderabad-Telangana, 501286, India sunanda@vardhaman.org
2nd Talla Ganesh Goud
Department of Information Technology, Vardhaman College of Engineering (Autonomous), Hyderabad-Telangana, 501286, India goudganesh790@gmail.com
4th Mrs. Yadla Sunanda
Department of Information Technology,
Vardhaman College of Engineering (Autonomous), Hyderabad-Telangana, 501286, India samuelsam00402@gmail.com
Abstract—This project builds a domain-specific assistant chatbot that combines fine-tuning with retrieval-augmented generation (RAG) to improve the accuracy and contextual understanding of conversational interactions. Fine-tuning a pre-trained model exposes it to domain-specific language, so that it produces more relevant and accurate responses. RAG adds the ability to retrieve external knowledge at query time, which keeps responses accurate and up to date. This hybrid approach lets the chatbot handle more complex queries, making the chat interface suitable for technical and non-technical users alike. A vector database stores previous interactions so that the chatbot can draw on them, supporting more consistent and informed responses over time. The system also uses natural language processing to analyze user intent, which makes interactions more conversational and intuitive. Because it is scalable and adaptable, the assistant chatbot is suited to industry-level applications ranging from data intelligence to customer support.
Index Terms—Retrieval-augmented generation (RAG), Fine-tuning, Vector database, Large Language Models (LLMs), Small and Medium Enterprises (SMEs).
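The retrieval step summarized in the abstract can be illustrated with a minimal sketch. It is not the system's implementation: the bag-of-words `embed` function stands in for a learned embedding model, the in-memory `VectorStore` class stands in for a real vector database, and all names and sample documents are hypothetical. The generation step with the fine-tuned model is left out; the sketch only shows how retrieved passages might be placed into a prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): bag-of-words counts, not a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(query, store, k=2):
    """Prepend retrieved passages to the user query (the 'augmented' part of RAG)."""
    context = store.search(query, k)
    lines = ["Context:"] + ["- " + c for c in context] + ["", "Question: " + query]
    return "\n".join(lines)


# Hypothetical domain documents.
store = VectorStore()
store.add("Invoices are issued on the first business day of each month.")
store.add("Password resets require the registered email address.")
store.add("Support hours are 9am to 6pm on weekdays.")

prompt = build_prompt("When are invoices issued?", store, k=1)
print(prompt)
```

In a full pipeline, the resulting prompt would be passed to the fine-tuned language model, and past conversation turns could be added to the same store so that later queries retrieve them as context.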