Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/573
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Mitra, Suman K. |
dc.contributor.author | Vasani, Hiral |
dc.date.accessioned | 2017-06-10T14:43:31Z |
dc.date.available | 2017-06-10T14:43:31Z |
dc.date.issued | 2015 |
dc.identifier.citation | Vasani, Hiral (2015). Text retrieval from the degraded document images. Dhirubhai Ambani Institute of Information and Communication Technology, vii, 38 p. (Acc.No: T00536) |
dc.identifier.uri | http://drsr.daiict.ac.in/handle/123456789/573 |
dc.description.abstract | Image binarization converts a colored document image into a black-and-white one. It can be treated as an image segmentation task that separates the text from the background. The resulting black-and-white document can be used in many applications, such as Optical Character Recognition (OCR). Text documents suffer from various types of degradation, which makes image binarization a challenging task. This thesis presents a technique that segments text from the background. In this method, the document image is first darkened to enhance the text (foreground). The image is then processed separately to suppress the background. The two resulting images are combined so that the suppressed background is taken from the second image and the enhanced text from the first. This pre-processed image is then binarized using an existing thresholding technique. The binarized image is post-processed to remove unwanted small components and other noise. The output image is compared with the ground truth using a set of evaluation parameters, and the results of the algorithm are compared with those of existing binarization techniques. |
dc.publisher | Dhirubhai Ambani Institute of Information and Communication Technology |
dc.subject | Text retrieval |
dc.subject | Information retrieval |
dc.subject | Document |
dc.subject | Text Extraction |
dc.subject | Information Retrieval |
dc.subject | Techniques |
dc.classification.ddc | 006.35 VAS |
dc.title | Text retrieval from the degraded document images |
dc.type | Dissertation |
dc.degree | M. Tech |
dc.student.id | 201311042 |
dc.accession.number | T00536 |
Appears in Collections: M Tech Dissertations
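
The pipeline outlined in the abstract (darken, suppress background, combine, threshold, post-process, evaluate) can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation, since the full text is restricted. Every concrete choice here is an assumption made for illustration: the power-law darkening and its `gamma`, the mean-based background suppression, Otsu's method standing in for the unnamed "existing thresholding technique", the `min_size` component filter, and F-measure as the evaluation parameter. All function names are hypothetical.

```python
from collections import deque

def darken(img, gamma=2.0):
    # Assumed enhancement: power-law map on 0-255 grey levels (gamma > 1 darkens).
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]

def suppress_background(img):
    # Assumed heuristic: pixels brighter than the mean grey level become white.
    flat = [p for row in img for p in row]
    mean = sum(flat) / len(flat)
    return [[255 if p > mean else p for p in row] for row in img]

def combine(dark, suppressed):
    # Background comes from the suppressed image, text from the darkened one.
    return [[255 if s == 255 else d for d, s in zip(dr, sr)]
            for dr, sr in zip(dark, suppressed)]

def otsu_threshold(img):
    # Otsu's method (an assumed stand-in): maximise between-class variance.
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def remove_small_components(bw, min_size=2):
    # Drop 4-connected text components with fewer than min_size pixels.
    h, w = len(bw), len(bw[0])
    out = [row[:] for row in bw]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if bw[y][x] != 1 or seen[y][x]:
                continue
            comp, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and bw[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) < min_size:
                for cy, cx in comp:
                    out[cy][cx] = 0
    return out

def binarize_document(img, gamma=2.0, min_size=2):
    # Returns (raw, cleaned) binary images, where 1 = text and 0 = background.
    pre = combine(darken(img, gamma), suppress_background(img))
    t = otsu_threshold(pre)
    raw = [[1 if p <= t else 0 for p in row] for row in pre]
    return raw, remove_small_components(raw, min_size)

def f_measure(pred, truth):
    # F-measure of text pixels against a ground-truth binary image.
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```

Given a greyscale image as a list of rows of 0-255 integers, `binarize_document(img)` returns the binarized image before and after component filtering, and `f_measure(cleaned, truth)` scores the result against a ground-truth mask. A real document would need local (adaptive) background estimation rather than the global mean used here.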

Files in This Item:
File | Description | Size | Format
201311042.pdf (Restricted Access) | | 1.81 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.