Video Object Detection and Identification in Dynamic Environment

Shah, Mahir Manishbhai

Date

2022

Abstract

Object detection and identification in computer vision is widely regarded as one of the most difficult problems in computer science. Yet it has become one of the fastest-growing research topics in recent years, owing to advances in computer hardware such as GPUs.

The task can be divided into two categories: (1) object detection and identification in still images, and (2) object detection and identification in dynamic environments. Thanks to these hardware advances, deep neural network based methods have achieved high accuracy, and most state-of-the-art methods for still images are based on deep neural networks. Extending these still-image detectors to dynamic environments is not easy: accuracy drops because object appearance deteriorates through rare poses, motion blur, video defocus, and partial or full occlusion. The accuracy decreases because still-image detectors do not take into account the temporal information contained in videos when detecting objects in a dynamic environment. To improve the accuracy of state-of-the-art detectors in dynamic environments such as videos, various methods have been developed that take this temporal information into consideration.

In this thesis, we try to increase the accuracy of state-of-the-art object detectors by using the knowledge of a previously trained model as a reference for another model. We also try to simplify the architecture obtained when combining two different models, without incurring a loss in accuracy. The first model we train is a pixel-level method, and the second is an instance-level method. We test our approach on the ImageNet VID and YouTube-8M datasets. The results show that our approach obtains improved results for instance-level object detection methods.

URI

http://drsr.daiict.ac.in//handle/123456789/1084

Collections

M Tech Dissertations [923]