Structured Motion Representation Learning for Robust Temporal Gesture Recognition
S. Vaishenaav
Department of Computer Networking (AIML) PSG Polytechnic College
Coimbatore, India vaishenaav@gmail.com
Shrijit S Iyengar
Department of Computer Networking (AIML) PSG Polytechnic College
Coimbatore, India shrijitiyengar07@gmail.com
A. R. Akash
Department of Computer Networking (AIML) PSG Polytechnic College
Coimbatore, India akashanandan006@gmail.com
R. Vinesh Kannah
Department of Computer Networking (AIML) PSG Polytechnic College
Coimbatore, India vineshkannah@gmail.com
S. Yajun
Department of Computer Networking (AIML) PSG Polytechnic College
Coimbatore, India yajun5713@gmail.com
S. Brindha
Supervisor, Head of the Department, Department of Computer Networking (AIML) PSG Polytechnic College
Coimbatore, India hod.dcn@psgpolytech.ac.in
Abstract—Gesture recognition systems have made significant progress through deep learning-based temporal modeling architectures. However, existing approaches rely primarily on raw spatial landmark representations and leave temporal dynamics to be learned implicitly. Such methods exhibit a strong spatial bias and often fail to discriminate between gestures that share similar trajectories but differ in motion intensity, acceleration patterns, and temporal irregularity. This paper proposes a Physics-Informed Entropy-Regularized Motion Encoding (EMRE) architecture for robust temporal gesture recognition. The proposed framework explicitly models motion dynamics by computing velocity $v_t$ and acceleration $a_t$ from landmark sequences and deriving a physics-inspired motion energy defined as $E_t = \lVert v_t \rVert^2 + \lambda \lVert a_t \rVert^2$, where $\lambda$ weights the acceleration term. To capture temporal complexity, an information-theoretic entropy measure is computed over the normalized energy distribution, which quantifies motion irregularity. The resulting energy–entropy feature representation is passed to a Long Short-Term Memory (LSTM) classifier for sequence-level inference. Unlike conventional coordinate-driven models, the proposed architecture reduces the learning burden by embedding structured physical priors before temporal modeling. Experimental evaluation on controlled multi-class gesture datasets demonstrates improved discrimination between dynamically similar gesture categories, higher Macro-F1 under class imbalance, and stable generalization across validation splits. The results indicate that integrating physics-informed motion modeling with entropy-based regularization provides a principled and computationally efficient foundation for robust gesture recognition systems.
Index Terms—Gesture Recognition, Motion Energy, Entropy Regularization, Temporal Modeling, Physics-Informed Learning, LSTM
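The energy and entropy features described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `motion_energy` and `energy_entropy`, the finite-difference estimates of velocity and acceleration, the default `lam=0.5`, the unit time step `dt`, and the choice to normalize energy over the whole sequence (rather than over a sliding window) are all assumptions made for illustration.

```python
import math


def motion_energy(frames, lam=0.5, dt=1.0):
    """Per-frame energy E_t = ||v_t||^2 + lam * ||a_t||^2.

    frames: list of T flattened landmark vectors (lists of floats).
    Returns T - 2 values, since two frames are consumed by differencing.
    lam and dt are illustrative defaults, not values from the paper.
    """
    # First-order finite difference approximates velocity (T - 1 vectors).
    vel = [[(b - a) / dt for a, b in zip(f0, f1)]
           for f0, f1 in zip(frames, frames[1:])]
    # Difference of consecutive velocities approximates acceleration (T - 2 vectors).
    acc = [[(b - a) / dt for a, b in zip(v0, v1)]
           for v0, v1 in zip(vel, vel[1:])]
    energy = []
    # Pair each acceleration with the later of the two velocities it was built from.
    for v, a in zip(vel[1:], acc):
        v_sq = sum(x * x for x in v)
        a_sq = sum(x * x for x in a)
        energy.append(v_sq + lam * a_sq)
    return energy


def energy_entropy(energy, eps=1e-12):
    """Shannon entropy (in nats) of the energy sequence normalized to sum to 1."""
    total = sum(energy)
    if total <= eps:
        # No motion at all: report zero entropy (an assumed convention).
        return 0.0
    probs = [e / total for e in energy]
    # Terms with p == 0 contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    steady = [[float(t), 0.0] for t in range(5)]     # constant-velocity motion
    bursty = [[0.0], [0.0], [0.0], [3.0], [3.0]]     # one sudden jump
    print(energy_entropy(motion_energy(steady)))
    print(energy_entropy(motion_energy(bursty)))
```

Under this definition, entropy is highest when energy is spread evenly over time and lower when it is concentrated in a few frames, so the steady sequence above scores higher than the bursty one. In a pipeline like the one the abstract describes, such values would be fed to the LSTM classifier as features in place of, or alongside, raw coordinates.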