Fine-Tuning Small LLMs for High-Quality Semantic Search: A Cost-Efficient Alternative to Foundation Models
PURIPANDA SHARAT CHANDRA
Department of Artificial Intelligence and Machine Learning, R V College of Engineering
Abstract - Large language models (LLMs) have demonstrated remarkable performance in natural language understanding, yet their deployment for real-time semantic search and recommendation tasks remains impractical due to significant computational demands. This paper introduces a cost-efficient framework for fine-tuning small-scale models tailored for high-quality semantic movie recommendation. We leverage Gemma 3, a compact generative model, to produce enriched natural language descriptions of movies from structured metadata, and Granite Embedder, a lightweight transformer-based encoder, to compute dense vector representations for semantic similarity retrieval. Fine-tuning is performed using contrastive learning on curated triplet datasets derived from public movie data sources, enabling the model to learn meaningful semantic distances between similar and dissimilar movie entries.
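The contrastive objective described above can be sketched as a hinge-style triplet loss over cosine distance. The snippet below is a minimal illustrative sketch, not the paper's actual training code: in the real pipeline this would operate on batches of Granite Embedder outputs via PyTorch, whereas here plain Python lists stand in for embedding vectors.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss on cosine distance: the positive (a similar movie)
    must sit closer to the anchor than the negative (a dissimilar movie)
    by at least `margin`, otherwise a penalty is incurred."""
    pos_dist = 1.0 - cosine(anchor, positive)
    neg_dist = 1.0 - cosine(anchor, negative)
    return max(pos_dist - neg_dist + margin, 0.0)

# Toy triplet: the positive already matches the anchor exactly and the
# negative is orthogonal, so the margin is satisfied and the loss is zero.
loss = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
```

Minimizing this loss over many curated triplets is what pulls embeddings of semantically similar movies together while pushing dissimilar ones apart; the margin value here (0.2) is an illustrative choice, not one reported by the paper.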
Our pipeline, built in Python with Hugging Face Transformers, PyTorch, and Qdrant, supports end-to-end generation, embedding, and retrieval of semantically similar movies. All experiments were conducted on an AWS EC2 instance equipped with a 24 GB GPU, allowing efficient training and inference at scale. We demonstrate a notable improvement in recommendation quality: after fine-tuning, Recall@10 increased from 0.56 to 0.81, and mean cosine similarity between relevant movie vectors improved from 0.43 to 0.72. A sample system output, such as “If you enjoyed Avengers: Age of Ultron, you might love Eternals for its mind-bending story and similar sci-fi execution,” showcases the model’s contextual sensitivity and domain-specific relevance. This research highlights a scalable, low-cost alternative to large foundation models for semantic search and recommendation tasks, as of June 2, 2025.
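The retrieval and evaluation steps can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: a brute-force cosine ranking over an in-memory catalog takes the place of the Qdrant similarity search, and the `top_k` and `recall_at_k` helper names are assumptions introduced here for clarity.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def top_k(query_vec, catalog, k=10):
    """Rank (movie_id, vector) pairs by cosine similarity to the query.
    A brute-force stand-in for a Qdrant approximate-nearest-neighbor search."""
    ranked = sorted(catalog, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [movie_id for movie_id, _ in ranked[:k]]

def recall_at_k(retrieved_ids, relevant_ids, k=10):
    """Recall@k: fraction of ground-truth relevant movies found in the top-k."""
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# Toy catalog of pre-computed movie embeddings (2-D for readability).
catalog = [("movie_a", [1.0, 0.0]), ("movie_b", [0.9, 0.1]), ("movie_c", [0.0, 1.0])]
results = top_k([1.0, 0.0], catalog, k=2)
```

Reporting the mean of `recall_at_k` across a held-out set of query movies yields the Recall@10 figures quoted above; in the full system the query and catalog vectors would come from the fine-tuned Granite Embedder.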
Key Words: Semantic Search, Fine-Tuning, Small Language Models, Vector Embeddings, Cost-Efficiency.