22 February 2011

Join and work on Tesseract OCR for Indian languages (scanned images to text)

--------- Forwarded message ----------
From: M.N.S.Rao <mnsrao AT gmail.com>
Date: Tue, Feb 22, 2011 at 11:26 AM

Hello group members,

Kannada does not have an OCR program to convert scanned images into editable text, and this lack forces people to retype the full text whenever they need to edit it. The advantages of an OCR program need not be detailed for s/w people, but its necessity and its usefulness in enriching the language, by putting a large quantity of literature on the web in text format, cannot be overstated.

Having said that, I want to point out that there is a ray of hope: Tesseract (http://code.google.com/p/tesseract-ocr/) is a free s/w OCR program that can be trained for any language. This feature should be exploited, and those who are knowledgeable in s/w and love their language can render a service to Kannada.

There is a very small group doing this work bit by bit. This group would like to invite more people to join.

Thanks,

MNS Rao

 

By Narendra Sisodiya
