Common Object Segmentation in Dynamic Image Collection using Attention Mechanism

Baid, Sana

Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/1093

Title:	Common Object Segmentation in Dynamic Image Collection using Attention Mechanism
Authors:	Hati, Avik; Baid, Sana
Keywords:	Semantic segmentation; Computer vision; Dynamic image; PASCAL VOC
Issue Date:	2022
Publisher:	Dhirubhai Ambani Institute of Information and Communication Technology
Citation:	Baid, Sana (2022). Common Object Segmentation in Dynamic Image Collection using Attention Mechanism. Dhirubhai Ambani Institute of Information and Communication Technology. vii, 33 p. (Acc. # T01013).
Abstract:	Semantic segmentation of image groups is a crucial task in computer vision that aims to identify objects shared across multiple images. This work presents a deep neural network framework that captures the congruity between images, thereby co-segmenting their common objects. The proposed network is an encoder-decoder network in which the encoder extracts high-level semantic feature descriptors and the decoder generates segmentation masks. Co-segmentation across the images is boosted by an attention mechanism that leverages the semantic similarity between feature descriptors. This attention mechanism is responsible for understanding the correspondence between the features, thereby determining the shared objects. The resultant masks localize the shared foreground objects while suppressing everything else as background. We explore multiple attention mechanisms in a two-image input setup and extend the best-performing model to a dynamic image input setup. The term "dynamic image" means that a varying number of images can be input to the model simultaneously, and the result is the segmentation of the common object in all of the input images. The model is trained end-to-end on an image group dataset generated from the PASCAL VOC 2012 [7] dataset. Experiments are also conducted on other benchmark datasets, and the results indicate the superiority of our model. Moreover, an important advantage of the proposed model is that it runs in linear time, as opposed to the quadratic time complexity observed in most works.
URI:	http://drsr.daiict.ac.in//handle/123456789/1093
Appears in Collections:	M Tech Dissertations
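
Illustrative sketch (not taken from the dissertation): the abstract describes an attention mechanism that uses semantic similarity between encoder feature descriptors, accepts a varying number of images, and runs in linear time. The record does not specify how this is done, so the NumPy code below shows only one possible way such a scheme could work, under stated assumptions. The function name `group_attention`, the use of a single averaged group descriptor, cosine similarity, and the sigmoid gating are all hypothetical choices made for illustration. Comparing each image against one shared descriptor needs one comparison per image, whereas comparing every pair of images grows quadratically with the group size.

```python
import numpy as np


def group_attention(features):
    """Hypothetical similarity-based attention over a group of images.

    features: array of shape (n_images, channels, height, width), standing in
    for encoder outputs. Returns the re-weighted features and the attention
    maps of shape (n_images, 1, height, width).
    """
    n, c, h, w = features.shape

    # Global average pooling gives one descriptor per image: shape (n, c).
    descriptors = features.mean(axis=(2, 3))

    # One shared group descriptor, computed in a single pass over the images
    # (assumption: a plain mean; the dissertation may do something different).
    group = descriptors.mean(axis=0)
    group = group / (np.linalg.norm(group) + 1e-8)

    # Cosine similarity between every spatial location and the group descriptor.
    flat = features.reshape(n, c, h * w)
    norms = np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8
    similarity = np.einsum("c,ncp->np", group, flat / norms)

    # A sigmoid squashes the similarity into (0, 1) attention weights.
    attention = 1.0 / (1.0 + np.exp(-similarity))
    attention = attention.reshape(n, 1, h, w)

    return features * attention, attention
```

Because the group size `n` is read from the input shape, the same function accepts 2, 3, or more images without modification, which is the sense of "dynamic image input" assumed here.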

Files in This Item:

File	Size	Format
202011019.pdf	812.29 kB	Adobe PDF
