[BangPypers] Help in reading the pdf file

Anand Balachandran Pillai abpillai at gmail.com
Sat Apr 4 08:03:19 CEST 2009


On Fri, Apr 3, 2009 at 8:20 PM, Sridhar Ratnakumar
<sridhar.ratna at gmail.com> wrote:

> On 3/26/09 3:29 PM, M Kumar wrote:
>
>> I need to read a PDF file and extract data from it. Is there anyone who
>> can guide me?
>>
> pyPdf?
>
>  http://pybrary.net/pyPdf/


To add my $0.02: I have used both pyPdf and PDFMiner in an open source
project that measures the accessibility of PDF documents. I initially
wrote the library using PDFMiner, but found that it failed to read
documents more often than pyPdf did, especially large ones. So I rewrote
the library using pyPdf, and the experience was better. I also noted that
pyPdf works better than PDFMiner on encrypted documents.

pyPdf is not perfect; I ran into a few issues when reading certain
encrypted documents. However, if your PDF files are mostly
non-encrypted, I would suggest pyPdf as a better choice than PDFMiner.
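
For the original question, here is a rough sketch of how reading text with
pyPdf might look. It assumes the pyPdf 1.x PdfFileReader interface
(isEncrypted, decrypt, getNumPages, getPage, extractText). The function
names extract_pages and read_pdf are made up for illustration, and trying
an empty password on encrypted files is only a guess at a reasonable
default. How good the extracted text is depends a lot on the document.

```python
def extract_pages(reader, password=""):
    """Return the text of each page as a list, or None if decryption fails.

    `reader` is assumed to behave like pyPdf's PdfFileReader.
    """
    if reader.isEncrypted:
        # decrypt() returns 0 when the password does not match.
        if reader.decrypt(password) == 0:
            return None
    pages = []
    for i in range(reader.getNumPages()):
        pages.append(reader.getPage(i).extractText())
    return pages


def read_pdf(path, password=""):
    # Imported here so extract_pages() can be tried without pyPdf installed.
    from pyPdf import PdfFileReader
    # Keep the file open while pages are read; the reader loads them lazily.
    with open(path, "rb") as stream:
        return extract_pages(PdfFileReader(stream), password)
```

Something like read_pdf("report.pdf") would then give one string per page
("report.pdf" is just a placeholder name), or None when the file is
encrypted and the password does not open it.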



>
> There is also reportlab toolkit
>
>  http://www.reportlab.org/rl_toolkit.html



-- 
-Anand