Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/573
Title: Text retrieval from the degraded document images
Authors: Mitra, Suman K.
Vasani, Hiral
Keywords: Text retrieval
Information retrieval
Document
Text Extraction
Techniques
Issue Date: 2015
Publisher: Dhirubhai Ambani Institute of Information and Communication Technology
Citation: Vasani, Hiral (2015). Text retrieval from the degraded document images. Dhirubhai Ambani Institute of Information and Communication Technology, vii, 38 p. (Acc.No: T00536)
Abstract: Image binarization converts a colored text document into a black and white one. It can be treated as an image segmentation task that separates the text from the background. The resulting black and white document is useful in many applications, notably Optical Character Recognition (OCR). Text documents suffer from various types of degradation, which makes image binarization a challenging task. This thesis presents a technique that segments text from the background. In this method, the document image is first darkened to enhance the text (foreground). The document image is also processed separately to suppress the background. The two resulting images are combined so that the suppressed background is taken from the second image and the enhanced text from the first. This pre-processed image is then binarized using an existing thresholding technique. The binarized image undergoes post-processing to remove unwanted small components and other noise. The output image is compared with the ground truth using a set of evaluation parameters, and the results of the algorithm are compared with those of existing binarization techniques.
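The stages named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not say how each stage is carried out, so every concrete choice here is an assumption made for illustration only: power-law (gamma) darkening, background suppression by whitening bright pixels, Otsu's method as the "existing thresholding technique", 4-connected component filtering as the post-processing, and F-measure as one possible evaluation parameter. All function names and parameters (gamma, level, min_size) are hypothetical and are not taken from the thesis.

```python
from collections import deque

def darken(img, gamma=2.0):
    """Assumed enhancement: power-law darkening (gamma > 1 pushes greys toward black)."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]

def suppress_background(img, level=0.8):
    """Assumed suppression: pixels at or above level * max become white (255)."""
    cutoff = level * max(max(row) for row in img)
    return [[255 if p >= cutoff else p for p in row] for row in img]

def combine(dark, flat):
    """Take the background from `flat` and the text pixels from `dark`."""
    return [[255 if f == 255 else d for d, f in zip(dr, fr)]
            for dr, fr in zip(dark, flat)]

def otsu_threshold(img):
    """Otsu's method: pick the threshold maximizing between-class variance."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(img, t):
    """Mark pixels at or below the threshold as text (1), others as background (0)."""
    return [[1 if p <= t else 0 for p in row] for row in img]

def remove_small_components(mask, min_size=2):
    """Assumed post-processing: drop 4-connected text components below min_size pixels."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            comp, queue = [], deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) < min_size:
                for cy, cx in comp:
                    out[cy][cx] = 0
    return out

def f_measure(pred, truth):
    """One possible evaluation parameter: harmonic mean of precision and recall."""
    tp = fp = fn = 0
    for pr, tr in zip(pred, truth):
        for p, t in zip(pr, tr):
            tp += int(p == 1 and t == 1)
            fp += int(p == 1 and t == 0)
            fn += int(p == 0 and t == 1)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def binarize_document(img, gamma=2.0, level=0.8, min_size=2):
    """Chain the stages: enhance, suppress, combine, threshold, clean up."""
    pre = combine(darken(img, gamma), suppress_background(img, level))
    return remove_small_components(binarize(pre, otsu_threshold(pre)), min_size)
```

Here an image is a list of rows of 8-bit grey levels, and the result of binarize_document is a mask of the same shape with 1 for text. Calling f_measure(binarize_document(page), truth) on a page with a known ground-truth mask gives a score between 0 and 1. Plain lists keep the sketch free of dependencies; a practical implementation would more likely use an array or imaging library.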
URI: http://drsr.daiict.ac.in/handle/123456789/573
Appears in Collections: M Tech Dissertations

Files in This Item:
File: 201311042.pdf (Restricted Access)
Size: 1.81 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.