Copy Protected PDFs and PIL

Brett Bowman bnbowman at gmail.com
Fri Nov 12 15:00:55 EST 2010


To answer various questions:

MRAB -
I've tried worker threads, and it kills the thread only and not the program
as a whole.  I could use that as a work-around, but I would prefer something
more direct, in case other problems arise.

Steve Holden -
A traceback sounds like a great idea, but I don't know how to go about it,
or know what is involved.  Could you suggest a tutorial I could follow?

Emile van Sebille -
A try/except block was the first thing I tried, and it still dies with a
fatal error, even if I use a generic except.
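
For reference, a minimal sketch of the try/except-plus-traceback pattern,
using the standard traceback module. render_page and try_render are
made-up names, and render_page is only a stand-in that raises an ordinary
Python exception; it is not a gfx call.

```python
import traceback


def render_page(path):
    # Hypothetical stand-in for the real rendering calls.
    # It fails with an ordinary Python exception.
    raise RuntimeError("cannot render %s" % path)


def try_render(path):
    """Return None on success, or the formatted traceback text on failure."""
    try:
        render_page(path)
    except Exception:
        # format_exc() returns the traceback of the exception being handled.
        return traceback.format_exc()
    return None
```

This only reports errors that reach Python as exceptions. If the library
ends the process itself, the except clause never runs, which would match
what I am seeing.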

Robert Kern -
Ah, whoops, good catch.  I meant to say gfx and swftools.  I'm using PIL to
modify the images once I get a PNG from swftools, and I misspoke.
The code in question is:

        import gfx
        print "1"
        doc = gfx.open("pdf", MY_FILE)
        print "2"
        a_page = doc.getPage(1)
        print "3"
        g_img = gfx.ImageList()
        print "4"
        g_img.startpage(a_page.width,a_page.height)
        print "5"
        a_page.render(g_img)
        print "6"
        g_img.endpage()
        print "7"
        g_img.save(TEMP_PNG)

which prints the following:

        1
        2
        3
        4
        5
        FATAL PDF disallows copying
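
One possible way to skip such files is to run the render in a separate
Python process, so that only the child dies. This assumes the fatal error
terminates the whole interpreter from C code, which would explain why
try/except never sees it. A minimal sketch using the standard subprocess
module follows. run_isolated, render_or_skip and CRASHING_SNIPPET are
made-up names, and the snippet only simulates the crash instead of calling
gfx.

```python
import subprocess
import sys


def run_isolated(snippet, *args):
    """Run a Python snippet in a separate interpreter.

    Returns (exit_code, combined stdout/stderr text).
    """
    cmd = [sys.executable, "-c", snippet] + list(args)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return proc.returncode, output.decode("utf-8", "replace")


# Stand-in for the rendering code (an assumption, not real gfx behaviour):
# it prints a message and then ends the interpreter abruptly.
CRASHING_SNIPPET = (
    "import os, sys\n"
    "sys.stdout.write('FATAL PDF disallows copying\\n')\n"
    "sys.stdout.flush()\n"
    "os._exit(1)\n"
)


def render_or_skip(snippet, pdf_path):
    """Return True if the child exited cleanly, False if the file should be skipped."""
    code, _output = run_isolated(snippet, pdf_path)
    return code == 0
```

In real use the snippet would hold the gfx calls above and read the PDF
path from sys.argv[1]; a nonzero exit code would then mark the file as one
to skip.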

Any help or suggestions would be appreciated.

/b/

On Thu, Nov 11, 2010 at 12:28 PM, Brett Bowman <bnbowman at gmail.com> wrote:

> I'm trying to parse some basic details and a thumbnail from ~12,000 PDFs
> for my company, but a few hundred of them are copy protected.  To make
> matters worse, I can't seem to trap the error it causes: whenever it happens
> PIL throws a "FATAL PDF disallows copying" message and dies.  An automated
> way to snap a picture of the PDFs would be ideal, but I'd settle for a way
> to skip over them without crashing my program.
>
> Any tips?
>
> Brett Bowman
> Bioinformatics Associate
> Cibus LLC
>


More information about the Python-list mailing list