SURVEY ON STABLE DIFFUSION TEXT TO IMAGE USING AI
Prof. Seema. R. Baji¹, Ankush Amrutkar², Unnati Rahane³, Sakshi Jagtap⁴, Kiran Bhoi⁵
1 Prof. Seema. R. Baji, Department of Computer Engineering, Late G. N. Sapkal College of Engineering, Nashik
2 Ankush Amrutkar, Department of Computer Engineering, Late G. N. Sapkal College of Engineering, Nashik
3 Unnati Rahane, Department of Computer Engineering, Late G. N. Sapkal College of Engineering, Nashik
4 Sakshi Jagtap, Department of Computer Engineering, Late G. N. Sapkal College of Engineering, Nashik
5 Kiran Bhoi, Department of Computer Engineering, Late G. N. Sapkal College of Engineering, Nashik
---------------------------------------------------------------------***---------------------------------------------------------------------
ABSTRACT
The Fusion Nexus Text-to-Image Synthesis Initiative integrates cutting-edge Generative Adversarial Networks (GANs) with Natural Language Processing (NLP) techniques to narrow the semantic divide between textual input and visual output. Built upon the robust Stable Diffusion training paradigm, this initiative is engineered to produce immersive, true-to-life images based on descriptive text prompts. While GANs have exhibited potential in image generation, issues like mode collapse and training instability have impeded their effectiveness. The Fusion Nexus initiative circumvents these challenges by leveraging the Stable Diffusion framework, which furnishes a stable and reliable training methodology for GANs. By amalgamating recent advancements in deep learning, this project spearheads a novel approach to text-to-image synthesis. Its primary aim is to craft cohesive and highly realistic visual representations from textual descriptions, thereby bridging the gap between linguistic expression and visual perception. This ambitious undertaking marks a significant stride at the convergence of GANs and NLP, presenting a promising solution to the intricate task of text-to-image generation.
Keywords: Adversarial, Diffusion, Framework, Fusion, GANs, Generation, Image, Language, Natural, NLP, Processing, Robust, Stable, Text-to-image, Training, Visuals.