Optical character recognition (OCR) feature extraction and classification
Abstract
Optical character recognition (OCR) [6] is the process of digitizing an image or document that contains text. An OCR system classifies the optical patterns in a digital image that correspond to alphanumeric and special characters. The main intermediate steps in character recognition are pre-processing, segmentation, feature extraction, and classification/recognition. Considerable prior research has compared the performance of OCR approaches such as Support Vector Machines (SVM) [2], Hidden Markov Models (HMM) [7], Feed Forward Neural Networks [8], Convolutional Neural Networks [9], and Transfer Learning [3]. We propose using a Capsule Network [5] to improve OCR performance. In this thesis, we take up this problem with the aim of making recognition more robust across various types of documents and fonts. We also want to avoid erroneous predictions when characters are segmented incorrectly. This retains most of the important information in the document, which can then be used by later stages of the processing pipeline. Our approach aims to keep manual correction of the OCR output to a minimum. The complete numeric value matters most: even a single erroneous character (digit) forces a manual editor to retype the entire value, so predicting the complete block of a numeric value correctly is very important for us.
Keywords: Optical Character Recognition, Pre-processing, Segmentation, Feature Extraction
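The emphasis on whole numeric blocks can be illustrated by comparing a per-character score with a per-block score. The following is a minimal sketch and not the evaluation used in the thesis: the function names `char_accuracy` and `block_accuracy`, the sample strings, and the simple position-by-position comparison (rather than an edit distance) are all assumptions made for illustration.

```python
def char_accuracy(predicted, expected):
    """Fraction of character positions that match across all blocks.

    Assumption: characters are compared position by position; any
    length mismatch counts the extra positions as errors.
    """
    total = 0
    correct = 0
    for pred, truth in zip(predicted, expected):
        total += max(len(pred), len(truth))
        correct += sum(p == t for p, t in zip(pred, truth))
    return correct / total if total else 0.0


def block_accuracy(predicted, expected):
    """Fraction of blocks in which every character is correct."""
    if not expected:
        return 0.0
    exact = sum(pred == truth for pred, truth in zip(predicted, expected))
    return exact / len(expected)


# Hypothetical ground truth and OCR output: one digit is wrong in the second block.
expected = ["1024", "3.75", "98000"]
predicted = ["1024", "3.15", "98000"]

print(f"character accuracy: {char_accuracy(predicted, expected):.3f}")
print(f"block accuracy:     {block_accuracy(predicted, expected):.3f}")
```

In this made-up sample a single wrong digit among 13 characters leaves the character-level score high, yet one of the three blocks would still have to be retyped in full, which is why a block-level measure reflects the manual correction effort more directly.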
Collections
- M Tech Dissertations [923]