AI-Powered Virtual Furniture Try-On System: A Real-Time AR and Deep Learning Approach
1Mr. Abdan Wasullah Khan, 2Mr. Khan Anas Shaizad, 3Mr. Shaikh Rushan Shah Faisal, 4Mr. Shaikh Rahil Ayub, 5Ms. Nameera Chaudhary, 6Mr. Ali Karim Sayyed
1,2,3,4 Students, Department of AIML, Anjuman-I-Islam A. R. Kalsekar Polytechnic, New Panvel
5 Project Guide, Department of AIML, Anjuman-I-Islam A. R. Kalsekar Polytechnic, New Panvel
6 HOD, Department of AIML, Anjuman-I-Islam A. R. Kalsekar Polytechnic, New Panvel
1abdan4u@gmail.com
2anaskhan937085@gmail.com
3rushanshk@gmail.com
4shaikhrahilmohd01@gmail.com
5nameera.choudhary@aiarkp.ac.in
6alikarim.sayed@gmail.com
Abstract—With the rapid growth of e-commerce, the need for visualization technologies that convey spatial awareness beyond standard product images has become paramount. Online furniture buyers struggle to assess dimensional compatibility, stylistic fit, and how a product will actually look in their physical space; these difficulties lead to high product-return rates and reduced customer satisfaction. This study proposes an AI-driven Virtual Furniture Try-On System that combines Augmented Reality (AR) with a hybrid deep learning pipeline to render photorealistic 3D furniture items in real time within a smartphone camera stream. The design uses a ResNet-50 convolutional network for floor-plane and surface recognition, together with a Transformer model responsible for dimensional accuracy and spatiotemporal consistency. A physics-based lighting sub-model synchronizes the illumination of virtual items with the observed scene. The system is implemented in Python 3.10 using the TensorFlow and PyTorch frameworks, with AR sessions managed through ARCore and ARKit. Evaluation on the ShapeNet and Pix3D datasets yielded a mean IoU of 0.87, an end-to-end processing latency of 0.18 s per frame, and a 94% user satisfaction rate in a study with 120 participants. Future work will explore audio-visual multimodal feedback fusion.
Index Terms – Virtual Try-On, Augmented Reality, Convolutional Neural Networks, Real-time Rendering, Surface Identification, Lighting Estimation, Transformer Model, Spatial Consistency, E-Commerce Visualization, Deep Learning.
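The abstract reports a mean Intersection-over-Union (IoU) of 0.87 for surface recognition. For readers unfamiliar with the metric, the following is a minimal generic sketch of how mean IoU is computed over binary segmentation masks; it is illustrative only and is not the authors' evaluation code (the masks and helper names here are hypothetical).

```python
def iou(pred, gt):
    """Intersection-over-Union between two binary masks (nested lists of 0/1)."""
    inter = sum(p & g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    union = sum(p | g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    # Two empty masks are conventionally treated as a perfect match.
    return inter / union if union else 1.0

def mean_iou(pairs):
    """Mean IoU over a list of (prediction, ground-truth) mask pairs."""
    scores = [iou(p, g) for p, g in pairs]
    return sum(scores) / len(scores)

# Example: prediction overlaps ground truth on 1 of 2 marked pixels -> IoU 0.5
print(iou([[1, 1], [0, 0]], [[1, 0], [0, 0]]))  # 0.5
```

In practice such masks would come from the ResNet-50 surface-recognition output evaluated against dataset annotations; the reported 0.87 is the average of these per-image scores.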