Learning cross domain relations using deep learning

dc.accession.number: T00740
dc.classification.ddc: 006.31 KOT
dc.contributor.advisor: Joshi, Manjunath V.
dc.contributor.author: Kotecha, Dhara
dc.date.accessioned: 2019-03-19T09:30:59Z
dc.date.accessioned: 2025-06-28T10:25:32Z
dc.date.available: 2019-03-19T09:30:59Z
dc.date.issued: 2018
dc.degree: M. Tech
dc.description.abstract: Generative Adversarial Networks (GANs) have achieved exemplary performance in generating realistic images, and they also produce good results in image-to-image translation. In this thesis, we explore the use of GANs for cross-domain image mapping for facial expression transfer, in which the expression of a source image is transferred onto a target image. We use a DiscoGAN (Discovery GAN) model for the task. Using the DiscoGAN, an image of the target is generated with the facial features of the source. The model uses a feature matching loss along with the GAN objective and a reconstruction loss. We propose a method to train the DiscoGAN with paired data of source and target images. In order to learn the cross-domain image mapping, we train the DiscoGAN with a batch size of 1. In the second part of this thesis, we propose an algorithm to binarize degraded document images. We use a U-Net for this task. We model document image binarization as a classification problem in which the output image is the result of classifying each pixel as text or background. By optimizing the cross-entropy loss function, we translate the input degraded image to the corresponding binarized image. Using a U-Net ensures low-level feature transfer from the input degraded image to the output binarized image, and thus it is better than using a simple convolutional neural network. Our method of training leads to the desired results faster when both the degraded document images and the ground-truth binarized images are available for training, and it also generalizes well. The results obtained are significantly better than the state-of-the-art techniques, and the approach is simpler than other deep learning approaches for document image binarization.
dc.identifier.citation: Kotecha, Dhara (2018). Learning Cross Domain Relations Using Deep Learning. Dhirubhai Ambani Institute of Information and Communication Technology, viii, 49 p. (Acc. No: T00740)
dc.identifier.uri: http://drsr.daiict.ac.in/handle/123456789/774
dc.publisher: Dhirubhai Ambani Institute of Information and Communication Technology
dc.student.id: 201611058
dc.subject: Machine learning
dc.subject: Neural network
dc.subject: Image processing
dc.title: Learning cross domain relations using deep learning
dc.type: Dissertation

Files

Original bundle

Name:
201611058_Dhara Kotecha.pdf
Size:
14.56 MB
Format:
Adobe Portable Document Format
Description:
201611058