Guide to OCR for Indic Scripts

Document Recognition and Retrieval

Govindaraju, Venu, Setlur, Srirangaraj (Ranga) (Eds.)


Available Formats:

eBook (PDF and EPUB; digitally watermarked, no DRM): ISBN 978-1-84800-330-9


Hardcover: ISBN 978-1-84800-329-3


Softcover: ISBN 978-1-4471-2518-1

  • First comprehensive book on OCR for Indic scripts

Optical Character Recognition (OCR) is a key enabling technology for creating indexed digital library content, and it is especially valuable for Indic scripts, for which very little digital access has existed.

Indic scripts, most of which descend from the ancient Brahmi script and are prevalent across the Indian subcontinent, present OCR challenges that differ from those faced with Latin and Oriental scripts. Properly utilized, however, OCR will help make Indic digital archives practically accessible to researchers and lay users alike by creating searchable indexes and machine-readable text repositories.

This unique guide/reference is the very first comprehensive book on the subject of OCR for Indic scripts, providing an overview of the state-of-the-art research in this field as well as other issues related to facilitating query and retrieval of Indic documents from digital libraries. All major research groups working in this area are represented in this book, which is divided into sections on recognition of Indic scripts and retrieval of Indic documents.

Topics and features:

  • Contains contributions from the leading researchers in the field
  • Discusses data set creation for OCR development
  • Describes OCR systems that cover eight different scripts: Bangla, Devanagari, Gurmukhi, Gujarati, Kannada, Malayalam, Tamil, and Urdu (Perso-Arabic)
  • Explores the challenges of Indic script handwriting recognition in the online domain
  • Examines the development of handwriting-based text input systems
  • Describes ongoing work to increase access to Indian cultural heritage materials
  • Provides a section on the enhancement of text and images obtained from historical Indic palm leaf manuscripts
  • Investigates different techniques for word spotting in Indic scripts
  • Reviews mono-lingual and cross-lingual information retrieval in Indic languages

This is an excellent reference for researchers and graduate students studying OCR technology and methodologies. This volume will contribute to opening up the rich Indian cultural heritage embodied in millions of ancient and contemporary documents spanning topics such as science, literature, medicine, astronomy, mathematics and philosophy.

Venu Govindaraju, FIEEE, FIAPR, is a Distinguished Professor of Computer Science and Engineering at the University at Buffalo. He has over 20 years of research experience in pattern recognition, information retrieval and biometrics. His seminal work on handwriting recognition was at the core of the first handwritten address interpretation system used by the U.S. Postal Service.

Srirangaraj Setlur, SMIEEE, is a Principal Research Scientist at the University at Buffalo. He has over 15 years of research experience in pattern recognition, including NSF-sponsored work on multilingual OCR technologies for digital libraries and other applications. His work on postal automation has led to technology adopted by the U.S. Postal Service and by Royal Mail in the U.K.

Content Level » Research

Keywords » Digital Libraries - Document Retrieval - Handwriting Recognition - Indic Scripts - OCR - Text Recognition

Table of contents

Part I: Recognition of Indic Scripts

  • Building Data Sets for Indian Language OCR Research (C. V. Jawahar, Anand Kumar, A. Phaneendra and K. J. Jinesh)
  • On OCR of major Indian scripts: Bangla and Devanagari (B. B. Chaudhari)
  • A Complete Machine Printed Gurmukhi OCR System (Gurpreet Singh Lehal)
  • Progress in Gujarati Document Processing and Character Recognition (Jignesh Dholakia, Atul Negi and S. Rama Mohan)
  • Design of a bilingual Kannada-English OCR (R. S. Umesh, P. B. Pati and A. G. Ramakrishnan)
  • Recognition of Malayalam Documents (N. V. Neeba, Anoop Namboodiri, C. V. Jawahar and P. J. Narayanan)
  • A Complete OCR System for Tamil Magazine Documents (K. H. Aparna and V. S. Chakravarthy)
  • Experiments on Urdu Text Recognition (Omar Mukhtar, Srirangaraj Setlur and Venu Govindaraju)
  • The BBN Byblos Hindi OCR System (Prem Natarajan, Ehry MacRostie, and Michael Decerbo)
  • Generalization of Hindi OCR using Adaptive Segmentation and Font Files (Mudit Agrawal, Huanfeng Ma and David Doermann)
  • Online Handwriting Recognition for Indic Scripts (A. Bharath and Sriganesh Madhvanath)

Part II: Retrieval of Indic Documents

  • Enhancing Access to Primary Cultural Heritage Materials of India (Peter M. Scharf and Malcolm Hyman)
  • Digital Image Enhancement of Indic Historical Manuscripts (Zhixin Shi, Srirangaraj Setlur and Venu Govindaraju)
  • GFG based Compression and Retrieval of Document Images in Indian Scripts (Gaurav Harit, Shantanu Chaudhary and Ritu Garg)
  • Word spotting for Indic documents to facilitate retrieval (Anurag Bhardwaj, Srirangaraj Setlur, Venu Govindaraju)
  • Indian Language Information Retrieval (Prasenjit Majumder and Mandar Mitra)
