Segmentation and classification: optical character recognition algorithms to enhance the accuracy

Shah, Ketul Tejas

Files

201711017.pdf (9.39 MB)

Date

2019

Authors

Shah, Ketul Tejas

Publisher

Dhirubhai Ambani Institute of Information and Communication Technology

Abstract

Optical Character Recognition (OCR) is a well-known technique for converting images of machine-printed or handwritten text into machine-encoded text. As the word "optical" relates to light or vision, an OCR system takes images as input. These images are light intensities captured by camera or scanner sensors and stored as matrices. Character recognition is the process of identifying characters algorithmically, in an automated fashion. OCR remains an interesting problem for researchers in the computer vision community. Earlier OCR methods were mostly rule-based and did not generalise across a variety of datasets. With advances in machine learning, researchers began solving the problem with supervised classification and deep learning algorithms. Existing OCR methods perform poorly when the input is noisy or heterogeneous, i.e. when a single document contains multiple languages or text in different fonts and sizes. Off-the-shelf OCR engines such as Abbyy [1] SDK and Tesseract [2] are available, but their accuracy is inferior for digit recognition in the presence of underlines and tables. An approach based on table detection followed by preprocessing and recognition is proposed. After the table is localized and detected with the Faster R-CNN [3] architecture, the cropped region is passed through a binary classifier (Text or Numeric class) and then segmented at the character level. The segmented characters are given to a CNN for classification of digits and special characters.
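The abstract does not say how character-level segmentation is performed. As a rough illustration of that step only, the sketch below uses a vertical projection profile, a common textbook technique that is assumed here and is not necessarily the method used in the dissertation. The function names segment_columns and crop_characters, the min_width parameter, and the toy image are all hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binarized image: 1 = ink pixel, 0 = background


def segment_columns(binary: Image, min_width: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges (end exclusive) that contain ink.

    Assumption: characters in the cropped region are separated by at
    least one fully blank column, so touching characters are not split.
    """
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection profile: number of ink pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # entering a run of ink columns
        elif count == 0 and start is not None:
            if x - start >= min_width:
                spans.append((start, x))  # leaving the run
            start = None
    # Close a run that reaches the right edge of the image.
    if start is not None and width - start >= min_width:
        spans.append((start, width))
    return spans


def crop_characters(binary: Image, spans: List[Tuple[int, int]]) -> List[Image]:
    """Cut one sub-image per column span, keeping all rows."""
    return [[row[s:e] for row in binary] for s, e in spans]


# Toy 3x7 image holding two glyph-like blobs separated by blank columns.
image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
]
spans = segment_columns(image)
chars = crop_characters(image, spans)
```

In a pipeline like the one described, each cropped sub-image would then be resized and passed to the CNN classifier. Underlines, which the abstract names as a difficulty, would defeat this simple profile because they fill every column, so they would have to be removed during preprocessing first.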

Keywords

Optical character recognition, convolutional neural network, region proposal network

Citation

Shah, Ketul Tejas (2019). Segmentation and classification: optical character recognition algorithms to enhance the accuracy. Dhirubhai Ambani Institute of Information and Communication Technology, 19p. (Acc.No: T00774)

URI

http://ir.daiict.ac.in/handle/123456789/854

Collections

M Tech Dissertations
