dc.contributor.advisor | Hati, Avik | |
dc.contributor.author | Baid, Sana | |
dc.date.accessioned | 2024-08-22T05:21:01Z | |
dc.date.available | 2024-08-22T05:21:01Z | |
dc.date.issued | 2022 | |
dc.identifier.citation | Baid, Sana (2022). Common Object Segmentation in Dynamic Image Collection using Attention Mechanism. Dhirubhai Ambani Institute of Information and Communication Technology. vii, 33 p. (Acc. # T01013). | |
dc.identifier.uri | http://drsr.daiict.ac.in//handle/123456789/1093 | |
dc.description.abstract | Semantic segmentation of image groups is a crucial task in computer vision that aims to identify shared objects in multiple images. This work presents a deep neural network framework that exhibits congruity between images, thereby cosegmenting common objects. The proposed network is an encoder-decoder network in which the encoder extracts high-level semantic feature descriptors and the decoder generates segmentation masks. The task of cosegmentation between the images is boosted by an attention mechanism that leverages semantic similarity between feature descriptors. This attention mechanism is responsible for understanding the correspondence between the features, thereby determining the shared objects. The resultant masks localize the shared foreground objects while suppressing everything else as background. We have explored multiple attention mechanisms in a two-image input setup and have extended the model that outperforms the others to a dynamic image input setup. The term dynamic image connotes that a varying number of images can be input to the model simultaneously, and the result is the segmentation of the common object from all of the input images. The model is trained end-to-end on an image group dataset generated from the PASCAL VOC 2012 [7] dataset. Experiments are also conducted on other benchmark datasets, and the results achieved indicate the superiority of our model. Moreover, an important advantage of the proposed model is that it runs in linear time, as opposed to the quadratic time complexity observed in most works. | |
dc.publisher | Dhirubhai Ambani Institute of Information and Communication Technology | |
dc.subject | Semantic segmentation | |
dc.subject | Computer vision | |
dc.subject | Dynamic image | |
dc.subject | PASCALVOC | |
dc.classification.ddc | 005.72 BAI | |
dc.title | Common Object Segmentation in Dynamic Image Collection using Attention Mechanism | |
dc.type | Dissertation | |
dc.degree | M. Tech | |
dc.student.id | 202011019 | |
dc.accession.number | T01013 | |