Abstract: |
Optical Character Recognition (OCR) is software that converts printed text and images into a digitized form that a machine can manipulate. Unlike the human brain, which recognizes text and characters in an image with ease, machines are not intelligent enough to perceive the information an image contains. Research on OCR has proceeded in a multitude of directions over the past years, and different types of OCR systems have emerged as a result. This paper presents an overview of the various OCR techniques. OCR is not an atomic process but comprises several phases: acquisition, preprocessing, segmentation, feature extraction, classification, and post-processing. Each of these phases is discussed in detail. As future work, an efficient OCR system can be developed using a combination of these techniques. Such a system can be used in practical applications such as number-plate recognition, smart libraries, and various other real-time applications. Despite the significant amount of research on OCR, character recognition for languages such as Arabic, Sindhi, and Urdu remains an open challenge.
|