[ANN] PyTesser: Optical Character Recognition (v0.0.1)

Michael J.T. O'Kelly mokelly at MIT.EDU
Wed May 16 05:52:05 CEST 2007


PyTesser version 0.0.1 is available at http://code.google.com/p/pytesser/

What is PyTesser?
=================
PyTesser is an Optical Character Recognition module for Python. It takes 
as input an image or image file containing text and outputs a string.

PyTesser uses the Tesseract OCR engine (an open source project at 
Google). It converts the image to a format Tesseract accepts and then 
calls the Tesseract executable as an external process. A Windows 
executable is provided along with the Python scripts. The scripts should 
work on other operating systems as well.
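
The convert-then-call approach described above can be sketched roughly as 
follows. This is an illustration only, not PyTesser's actual source: the 
names ocr_file and ocr_image, the scratch file names, the ".tif" default and 
the injectable "run" parameter are all assumptions made for the example. It 
relies only on Tesseract's command-line form "tesseract <image> <outputbase>", 
which writes its result to <outputbase>.txt, and on the image object having a 
PIL-style save(path) method.

```python
import os
import shutil
import subprocess
import tempfile


def ocr_file(image_path, tesseract_cmd="tesseract", run=subprocess.check_call):
    """Run the Tesseract executable on image_path and return the text.

    tesseract_cmd and run are hypothetical parameters for this sketch;
    run defaults to subprocess.check_call and can be swapped for testing.
    """
    workdir = tempfile.mkdtemp()
    output_base = os.path.join(workdir, "out")
    try:
        # Command-line form: tesseract <image> <outputbase>
        # Tesseract writes the recognized text to <outputbase>.txt.
        run([tesseract_cmd, image_path, output_base])
        with open(output_base + ".txt") as handle:
            return handle.read()
    finally:
        # Remove the scratch directory whether or not OCR succeeded.
        shutil.rmtree(workdir, ignore_errors=True)


def ocr_image(image, suffix=".tif", **kwargs):
    """Save an image object to a scratch file, then OCR that file.

    Assumes image has a PIL-style save(path) method that picks the file
    format from the extension, and that the installed Tesseract build can
    read the format named by suffix.
    """
    workdir = tempfile.mkdtemp()
    image_path = os.path.join(workdir, "scratch" + suffix)
    try:
        image.save(image_path)
        return ocr_file(image_path, **kwargs)
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

With a Tesseract executable on the PATH, ocr_file('fnord.tif') would run it 
on that file, and ocr_image(image) would do the same for an already opened 
PIL image. Keeping the executable name and the runner as parameters is just 
a convenience for pointing at a different binary or testing without one.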

Usage
=====
 >>> from pytesser import *
 >>> image = Image.open('fnord.tif')  # Open image object using PIL
 >>> print image_to_string(image)     # Run tesseract executable on image
fnord
 >>> print image_file_to_string('fnord.tif')
fnord

License
=======
PyTesser is released under the Apache License 2.0.


===========
Michael J.T. O'Kelly
<mokelly at mit.edu>
http://mjtokelly.blogspot.com/



<P><A HREF="http://code.google.com/p/pytesser/">PyTesser v0.0.1</A> - 
PyTesser is an Optical Character Recognition module for Python using the 
Tesseract OCR engine.  (15-May-07)

