Optical character recognition (OCR) feature extraction and classification

Files

  • 201711004.pdf
    13.7 MB

Date

2019

Authors

Prajapati, Pratik Kamlesh

Publisher

Dhirubhai Ambani Institute of Information and Communication Technology

Abstract

Optical character recognition (OCR) [6] is the process of digitizing an image or document that contains text. An OCR system classifies the optical patterns in a digital image into the alphanumeric and special characters they represent. The main intermediate steps in character recognition are pre-processing, segmentation, feature extraction, and classification/recognition. Much prior research has compared the performance of OCR approaches such as Support Vector Machines (SVM) [2], Hidden Markov Models (HMM) [7], Feed Forward Neural Networks [8], Convolutional Neural Networks [9], and Transfer Learning [3]. We propose using a Capsule Network [5] to improve OCR performance. In this thesis, we take up this problem to make recognition more robust across various types of documents and fonts. We also aim to avoid erroneous predictions when characters are segmented incorrectly. This retains most of the important information in the document, which can be used later by downstream pipeline processes. Our approach keeps manual correction of the OCR output to a minimum. The complete numeric value matters most: even a single erroneous character (digit) forces manual editors to retype the entire value, so correctly predicting the complete block of a numeric value is very important for us.
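
The abstract's point about numeric values can be illustrated with a small sketch that contrasts character-level accuracy with block-level (exact-match) accuracy. This is not the evaluation code from the thesis: the function names, the sample strings, and the simple position-by-position comparison (no alignment or edit distance) are assumptions made only for illustration.

```python
def char_accuracy(predicted, expected):
    """Fraction of character positions that match, summed over all blocks.

    Assumption: characters are compared position by position; a length
    mismatch counts the extra or missing positions as errors.
    """
    total = 0
    correct = 0
    for pred, exp in zip(predicted, expected):
        total += max(len(pred), len(exp))
        correct += sum(p == e for p, e in zip(pred, exp))
    return correct / total if total else 0.0


def block_accuracy(predicted, expected):
    """Fraction of blocks recognized with no error at all."""
    if not expected:
        return 0.0
    exact = sum(pred == exp for pred, exp in zip(predicted, expected))
    return exact / len(expected)


# Hypothetical OCR output for three numeric fields (illustrative data only).
expected = ["1024", "3.14", "2019"]
predicted = ["1024", "3.74", "2019"]

char_acc = char_accuracy(predicted, expected)
block_acc = block_accuracy(predicted, expected)
print(f"character accuracy: {char_acc:.3f}")
print(f"block accuracy:     {block_acc:.3f}")
```

In this made-up sample, one wrong digit out of twelve characters leaves character accuracy at 11/12, while block accuracy falls to 2/3, because the whole value "3.14" would have to be retyped.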

Keywords

Optical character recognition, pre-processing, segmentation, feature extraction

Citation

Prajapati, Pratik Kamlesh (2019). Optical character recognition (OCR) feature extraction and classification. Dhirubhai Ambani Institute of Information and Communication Technology, 47p. (Acc.No: T00763)

URI

http://ir.daiict.ac.in/handle/123456789/828

Collections

M Tech Dissertations
