Image Segmentation via Attention Module
Siddhartha Gupta, Shashank Shekhar Garg
Under guidance of Dr. Ankita Gupta
Computer Science Department, Maharaja Agrasen Institute of Technology
Abstract - Image segmentation is a crucial part of many visual understanding systems. It involves partitioning images (or video frames) into multiple segments or objects. Many applications rely heavily on segmentation, including medical image analysis (e.g., tumour boundary extraction and measurement of tissue volumes), autonomous vehicles (e.g., navigable surface and pedestrian detection), video surveillance, and augmented reality. Several approaches have been used in the past, including deep convolutional neural networks. In this paper, we aim to achieve image segmentation of general everyday images via recently introduced attention modules.
INTRODUCTION
The problem of segmenting an image can be formulated in two ways: as the classification of pixels with semantic labels (semantic segmentation) or as the partitioning of individual objects (instance segmentation). Semantic segmentation assigns every pixel a label from a set of object categories (such as human, car, tree, and sky), making it a more difficult task than image classification, which predicts a single label for the entire image. Instance segmentation extends semantic segmentation by detecting and delineating each object of interest in the image (for example, separating individual people).
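The difference between the two formulations can be illustrated on a toy label grid. The sketch below is only an illustration, not the method used in this paper: the class ids, the grid, and the helper `instances_from_semantic` are hypothetical, and connected-component labelling is used as a simple stand-in for a learned instance segmentation model (it cannot separate objects that touch).

```python
from collections import deque

# Hypothetical class ids, chosen only for this illustration.
SKY, PERSON = 0, 1

# Semantic mask: one class id per pixel. Both people share the label PERSON.
semantic_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]


def instances_from_semantic(mask, target_class):
    """Give each 4-connected region of target_class its own id (1, 2, ...)."""
    rows, cols = len(mask), len(mask[0])
    instance_map = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != target_class or instance_map[r][c]:
                continue  # wrong class, or already assigned to an instance
            next_id += 1
            instance_map[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of this region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == target_class
                            and not instance_map[ny][nx]):
                        instance_map[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance_map, next_id


instance_map, person_count = instances_from_semantic(semantic_mask, PERSON)
print("people found:", person_count)
for row in instance_map:
    print(row)
```

In the semantic mask both people carry the same label, whereas the instance map assigns each of them a distinct id.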
Several techniques have been used in the literature for image segmentation, including region growing, thresholding (e.g., Otsu's method), watersheds, k-means clustering, histogram-based clustering, graph cuts, and Markov random fields. However, the majority of these earlier techniques segment objects using low-level features and cues.
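As an example of such a low-level technique, the following is a minimal sketch of Otsu's thresholding, which picks the grey level that maximizes the between-class variance of the intensity histogram. The function name `otsu_threshold` and the sample pixel values are assumptions made for illustration; the sketch assumes integer intensities in the range 0 to `levels - 1`.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with intensity <= t are treated as background, the rest as
    foreground. Assumes integer intensities in [0, levels - 1].
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(levels))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of background pixels so far
    sum_bg = 0  # sum of background intensities so far
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break     # every pixel is background; nothing left to split
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical 8-pixel image with a dark and a bright cluster.
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
t = otsu_threshold(pixels)
foreground = [p > t for p in pixels]
print(t, foreground)
```

The threshold depends only on the intensity histogram, which is why methods of this kind cannot use the higher-level context that learned models exploit.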
In recent years, deep learning-based models have achieved notable improvements in both accuracy and time efficiency. Researchers have developed numerous neural network-based techniques for object detection and classification.
Convolutional neural networks have significantly improved performance on vision tasks. Recently introduced attention modules, which have so far yielded strong results mainly on specific subsets of biomedical images, are applied here to everyday images. We employ an encoder-decoder architecture called U-Net. Its main components are a contracting path (the encoder), which captures the context of the image, and a symmetric expanding path (the decoder), which enables precise localization.
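The symmetry between the contracting and expanding paths can be sketched by tracing feature-map shapes through a U-Net-style network. This is a rough sketch under stated assumptions, not the configuration used in this paper: the function `unet_shapes`, the base width of 64 channels, the depth of 4, and the use of size-preserving (padded) convolutions with 2x downsampling and upsampling are all illustrative choices.

```python
def unet_shapes(height, width, base_channels=64, depth=4):
    """Trace (channels, height, width) through a U-Net-style encoder-decoder.

    Assumes size-preserving convolutions, 2x pooling on the way down,
    2x upsampling on the way up, and channel doubling per encoder level.
    """
    if height % (2 ** depth) or width % (2 ** depth):
        raise ValueError("height and width must be divisible by 2**depth")

    trace = []
    skips = []
    c, h, w = base_channels, height, width

    # Contracting path (encoder): resolution shrinks, channels grow.
    for level in range(depth):
        trace.append((f"enc{level}", (c, h, w)))
        skips.append((c, h, w))          # saved for the skip connection
        c, h, w = c * 2, h // 2, w // 2  # 2x downsampling

    trace.append(("bottleneck", (c, h, w)))

    # Expanding path (decoder): upsample, then concatenate the encoder
    # features of the same resolution along the channel axis.
    for level in reversed(range(depth)):
        skip_c, skip_h, skip_w = skips[level]
        c, h, w = c // 2, h * 2, w * 2   # 2x upsampling halves the channels
        if (h, w) != (skip_h, skip_w):
            raise RuntimeError("decoder and skip resolutions do not match")
        trace.append((f"dec{level}_concat", (c + skip_c, h, w)))
        c = skip_c                       # convolutions reduce channels again
        trace.append((f"dec{level}", (c, h, w)))
    return trace


trace = unet_shapes(256, 256)
for name, shape in trace:
    print(f"{name:>12}: {shape}")
```

Each decoder level recovers the resolution of the matching encoder level, and the concatenated skip features are what allow the expanding path to localize precisely. An attention module would typically act on these skip features before concatenation, although the exact placement is an assumption here.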
Keywords- Image Segmentation, Neural Network, Attention Module, U-Net