Optical Character Recognition

  • Optical Character Recognition (OCR) is a technology that converts scanned documents or images containing text into editable text. It requires hardware to scan the document and software to recognize the text in the image and convert it into an editable file, such as a text file or Word file. The software is called OCR software. OCR is now widely used to digitize books and other printed documents so that they are easily searchable.

    On November 10, 2009, Intel released an e-book reader for the visually impaired, called Intel Reader, which uses optical character recognition.

  • How OCR software works

    The technology that OCR software uses to recognize text in images is quite complex. To understand the process, consider the following description of how ABBYY FineReader works.

    • "First, the program analyzes the structure of document image. It divides the page into elements such as blocks of texts, tables, images, etc. The lines are divided into words and then - into characters. Once the characters have been singled out, the program compares them with a set of pattern images. It advances numerous hypotheses about what this character is. Basing on these hypotheses the program analyzes different variants of breaking of lines into words and words into characters. After processing huge number of such probabilistic hypotheses, the program finally takes the decision, presenting you the recognized text." (Source: http://finereader.abbyy.com/about_OCR/whatis_ocr)
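
    The two core steps in the description above, singling out characters and comparing them with pattern images, can be illustrated with a toy sketch. This is not ABBYY FineReader's actual algorithm: the 3x3 templates, the pixel-agreement score, and the names TEMPLATES, LINE, segment_columns, score, hypotheses and recognize are all assumptions made up for illustration. Real OCR engines use far richer models and also weigh alternative ways of splitting lines into words and characters.

```python
# Toy illustration of "segment, then match against pattern images".
# Everything here (templates, scoring, names) is hypothetical, not a real OCR API.

# Tiny 3x3 "pattern images": '#' is ink, '.' is background.
TEMPLATES = {
    "L": ["#..", "#..", "###"],
    "T": ["###", ".#.", ".#."],
    "O": ["###", "#.#", "###"],
}

# A scanned "line" containing T and O; the O has one flipped pixel (noise).
LINE = [
    "###.###",
    ".#..#.#",
    ".#..##.",
]


def segment_columns(rows):
    """Split a line image into glyphs wherever a fully blank column occurs."""
    width = len(rows[0])
    segments, start = [], None
    for x in range(width + 1):
        # Treat the position past the right edge as blank to close the last glyph.
        blank = x == width or all(row[x] == "." for row in rows)
        if not blank and start is None:
            start = x
        elif blank and start is not None:
            segments.append([row[start:x] for row in rows])
            start = None
    return segments


def score(glyph, template):
    """Fraction of pixels that agree; 0.0 if the shapes differ in size."""
    if len(glyph) != len(template) or len(glyph[0]) != len(template[0]):
        return 0.0
    total = len(glyph) * len(glyph[0])
    same = sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )
    return same / total


def hypotheses(glyph):
    """Rank every template as a candidate reading, best score first."""
    ranked = [(char, score(glyph, tpl)) for char, tpl in TEMPLATES.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def recognize(rows):
    """Pick the top hypothesis for each segmented glyph."""
    return "".join(hypotheses(glyph)[0][0] for glyph in segment_columns(rows))


if __name__ == "__main__":
    for glyph in segment_columns(LINE):
        print(hypotheses(glyph))
    print(recognize(LINE))
```

    In this sketch the noisy second glyph still reads as "O", because that template agrees with it on 8 of its 9 pixels while the other two agree on only 5. Keeping a ranked list of hypotheses, rather than a single guess, is what would let a fuller system revisit a character when the surrounding word makes another reading more plausible.
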
  • Popular OCR Software

    Commercial:

    1. ABBYY FineReader
    2. OmniPage
    3. Readiris

    Free:

    1. Simple OCR
    2. TopOCR
    3. FreeOCR

    Free Online:

    1. ocrterminal.com
    2. free-ocr.com
    3. onlineocr.net

    Open Source OCR Projects:

    1. Tesseract
    2. GOCR
