Image captioning and neural architecture search using reinforcement learning

Shaw, Grishma

dc.contributor.advisor	Joshi, M.V.
dc.contributor.author	Shaw, Grishma
dc.date.accessioned	2020-09-14T06:00:37Z
dc.date.available	2020-09-14T06:00:37Z
dc.date.issued	2019
dc.identifier.citation	Shaw, Grishma (2019). Image captioning and neural architecture search using reinforcement learning. Dhirubhai Ambani Institute of Information and Communication Technology, xiv, 186p. (Acc.No: T00814)
dc.identifier.uri	http://drsr.daiict.ac.in//handle/123456789/850
dc.description.abstract	With the advent of deep learning, the problem-solving capability of machines has increased dramatically. Over the past decade, deep neural networks have succeeded in many difficult areas, such as image and speech processing, machine translation and natural language understanding. A primary goal of computer vision is to automatically produce descriptive captions for an image, a task that comes close to the essence of scene understanding. An image captioning model must therefore be powerful enough to capture the entire content of an image and to convey how its elements relate to one another in natural language. Motivated by this challenging task, in the first part of the thesis we attempt to solve image captioning using an attention mechanism with the help of reinforcement learning. Reinforcement learning (RL) is a machine learning technique concerned with how a software agent should act in an environment so as to maximise its cumulative reward, which makes it well suited to decision-making problems. Developing a neural network model requires substantial architecture engineering. An architecture can be obtained through transfer learning, but to achieve the best possible performance it is usually preferred to design the network from scratch, which requires specialised skills and is challenging in general. Neural Architecture Search (NAS) is a technique that automatically searches for the best-performing neural network architecture. With the aim of automatically building a network for the first problem, in the second part of the work we attempt to implement NAS using RL on an elementary digit classification problem.
dc.publisher	Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject	Reinforcement learning
dc.subject	machine learning technique
dc.subject	neural architecture search
dc.classification.ddc	006.31 SHA
dc.title	Image captioning and neural architecture search using reinforcement learning
dc.type	Dissertation
dc.degree	M.Tech
dc.student.id	201711065
dc.accession.number	T00815

Files in this item

Name: 201711065.pdf
Size: 5.147Mb
Format: PDF
Description: Dissertation

This item appears in the following Collection(s)

M Tech Dissertations [923]
