Few-Shot Hindi Handwritten Text Recognition Using Meta-Optimized Res-Net-Transformer Networks





Create Date 07/08/2025
Last Updated 07/08/2025


A Md Tahseen Equbal

M.Tech Student, Department of Computer Science and Engineering
All Saints College of Technology, Bhopal, India
Affiliated to Rajiv Gandhi Proudyogiki Vishwavidyalaya (RGPV)
mdtahseen278@gmail.com

B Prof. Sarwesh Site

Associate Professor, Department of Computer Science and Engineering
All Saints College of Technology, Bhopal, India
Affiliated to Rajiv Gandhi Proudyogiki Vishwavidyalaya (RGPV)
er.sarwesh@gmail.com

ABSTRACT

Handwritten Text Recognition (HTR) for Indic languages such as Hindi presents significant challenges due to diverse handwriting styles, limited annotated data, and the inherent complexity of the Devanagari script. This dissertation proposes a novel few-shot learning framework, RT-MAML (Res-Net-Transformer with Model-Agnostic Meta-Learning), to address these challenges. The architecture combines a Res-Net18 encoder for visual feature extraction with a Transformer-based decoder for sequence modeling, and is trained with a meta-learning strategy so that it can adapt rapidly to new writers from minimal data. Unlike conventional CTC-based HTR systems, our model uses a sequence decoder trained with cross-entropy loss and decoded with beam search, which improves recognition accuracy. The approach is evaluated on the IIIT-HW Hindi handwritten word dataset and achieves a Character Error Rate (CER) of 6% and a Word Error Rate (WER) of 13.0%, outperforming several existing baselines, including CNN-BiLSTM-CTC and Vision Transformer models. Meta-optimization through MAML improves the model's generalization to unseen writing styles, while orthogonality regularization and curriculum learning further refine the learning process. The study highlights the potential of combining deep visual encoders, attention-based decoding, and meta-learning for low-resource script recognition tasks. Future work will explore extending this framework to multilingual handwritten datasets and integrating large language models for semantic-aware decoding. This research contributes to the advancement of intelligent handwriting recognition systems for Indian languages in limited-resource settings.

Keywords: Handwritten Text Recognition (HTR), Hindi Script, Res-Net, Transformer Decoder, Connectionist Temporal Classification (CTC), Beam Search Decoding, Character Error Rate (CER), Word Error Rate (WER), Indic NLP, Deep Learning, Regularization, Script Complexity
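
The abstract reports results as CER and WER. As a minimal sketch of how such metrics are commonly computed (not necessarily the authors' evaluation script), the Python below divides the total Levenshtein edit distance by the total reference length. The function names `edit_distance`, `cer`, and `wer` are illustrative. The sketch also assumes characters are Unicode code points; the abstract does not state whether Devanagari text is scored per code point or per grapheme cluster, and that choice changes the CER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items yields an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps):
    """Character Error Rate: character edits / reference characters.

    Assumption: one character = one Unicode code point.
    """
    errors = sum(edit_distance(list(r), list(h)) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


def wer(refs, hyps):
    """Word Error Rate: word edits / reference words (whitespace tokens)."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    total = sum(len(r.split()) for r in refs)
    return errors / total


if __name__ == "__main__":
    # Hypothetical reference/prediction pair, for illustration only.
    references = ["किताब"]
    predictions = ["किताव"]
    print("CER:", cer(references, predictions))
    print("WER:", wer(references, predictions))
```

On a word-level dataset where each sample is a single word, `wer` under this definition reduces to the fraction of words that are not recognized exactly.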
