[Tutor] extracting information (images and text) from a PDF and creating a database from it

Shashwat Anand anand.shashwat at gmail.com
Tue Dec 29 08:33:29 CET 2009

I need to build a database from some PDFs. I need to extract the logos as well as
the information (i.e. name, address) beneath each logo and store it in the
database. A logo can be text as well as a picture, as shown in two
screenshots of one of the sample PDF files.
Would converting to HTML be a good option? Later on I need to apply some image
processing too. What would be the ideal approach?
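
To make the database half of the task concrete, here is a minimal sketch, assuming SQLite through Python's standard sqlite3 module. The table name, the column names, and the helper functions create_db and add_entry are hypothetical, chosen only for illustration. Extracting the text and image bytes from the PDF is not shown; that step would come from a separate PDF library.

```python
import sqlite3


def create_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS entries (
               id         INTEGER PRIMARY KEY,
               name       TEXT NOT NULL,
               address    TEXT,
               logo_text  TEXT,   -- used when the logo is plain text
               logo_image BLOB    -- used when the logo is a picture
           )"""
    )
    return conn


def add_entry(conn, name, address, logo_text=None, logo_image=None):
    """Insert one extracted record and return its row id."""
    cur = conn.execute(
        "INSERT INTO entries (name, address, logo_text, logo_image) "
        "VALUES (?, ?, ?, ?)",
        (name, address, logo_text, logo_image),
    )
    conn.commit()
    return cur.lastrowid
```

Since a logo can be either text or a picture, the sketch keeps one column for each and leaves the unused one NULL. Storing the raw image bytes as a BLOB keeps them available for later image processing.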