Please use this identifier to cite or link to this item:
http://drsr.daiict.ac.in//handle/123456789/573
Title: | Text retrieval from the degraded document images |
Authors: | Mitra, Suman K.; Vasani, Hiral |
Keywords: | Text retrieval; Information retrieval; Document Text Extraction; Information Retrieval Techniques |
Issue Date: | 2015 |
Publisher: | Dhirubhai Ambani Institute of Information and Communication Technology |
Citation: | Vasani, Hiral (2015). Text retrieval from the degraded document images. Dhirubhai Ambani Institute of Information and Communication Technology, vii, 38 p. (Acc.No: T00536) |
Abstract: | Image binarization converts a colored text document into a black-and-white one. It can be treated as an image segmentation task that separates the text from the background. The resulting black-and-white document can be used in many applications, such as Optical Character Recognition (OCR). Text documents suffer from various types of degradation that make image binarization a challenging task. This thesis presents a technique that segments text from the background. In this method, the document image is first darkened to enhance the text (foreground). The document image is also processed separately to suppress the background. The two resulting images are then combined so that the suppressed background is taken from the second image and the enhanced text from the first. This pre-processed image is binarized using an existing thresholding technique. The binarized image is then post-processed to remove unwanted small components and other noise. The output image is compared with the ground truth using evaluation parameters, and the results of the algorithm are compared with those of existing binarization techniques. |
URI: | http://drsr.daiict.ac.in/handle/123456789/573 |
Appears in Collections: | M Tech Dissertations |
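The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which is not available from this record. Every concrete choice below is an assumption made for illustration: a gamma transform for the darkening step, division by a local-maximum background estimate for background suppression, a fixed `bg_level` cut-off for combining the two images, Otsu's method standing in for the unnamed "existing thresholding technique", 4-connected component filtering for post-processing, and F-measure standing in for the unspecified evaluation parameters. All function names are hypothetical. Images are plain lists of rows with gray values in 0..255.

```python
from collections import deque

def darken(img, gamma=2.0):
    # Assumed enhancement step: gamma > 1 pushes dark strokes closer to black.
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]

def suppress_background(img, radius=2):
    # Assumed suppression step: divide each pixel by the local maximum,
    # which flattens slowly varying background toward white.
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            bg = max(max(window), 1)
            row.append(min(255, round(255 * img[y][x] / bg)))
        out.append(row)
    return out

def combine(dark, flat, bg_level=200):
    # Light pixels come from the background-suppressed image,
    # everything else from the darkened image (assumed combination rule).
    return [[f if f >= bg_level else d for d, f in zip(drow, frow)]
            for drow, frow in zip(dark, flat)]

def otsu_threshold(img):
    # Otsu's method: pick the level that maximizes between-class variance.
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def remove_small_components(mask, min_size=3):
    # Drop 4-connected text components with fewer than min_size pixels.
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not out[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, comp = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and out[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) < min_size:
                for cy, cx in comp:
                    out[cy][cx] = 0
    return out

def binarize_document(img, min_size=3):
    # Full sketch: enhance, suppress, combine, threshold, clean up.
    pre = combine(darken(img), suppress_background(img))
    t = otsu_threshold(pre)
    mask = [[1 if p <= t else 0 for p in row] for row in pre]  # 1 = text
    return remove_small_components(mask, min_size)

def f_measure(pred, truth):
    # Harmonic mean of precision and recall over text pixels.
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if t and not p)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Calling `binarize_document(img)` returns a mask in which 1 marks text and 0 marks background; `f_measure(mask, truth)` then scores it against a ground-truth mask of the same shape. The pure-Python loops keep the sketch dependency-free but would be slow on real scans, where an array library would be the more practical choice.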
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
201311042.pdf (Restricted Access) | | 1.81 MB | Adobe PDF | View/Open; Request a copy |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.