[Tutor] Converting to PDF

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Wed Jan 7 20:40:00 EST 2004



On Wed, 7 Jan 2004, "Héctor Villafuerte D." wrote:

> Hi all,
> I'm trying to generate PDFs with this little script:
>
> -----SCRIPT-------------------------
> import fileinput, os, Image
>
> def get_calls(dir, file):
>     im = Image.open(dir + file)
>     im.save(dir + '_' + file, 'PDF')
>
> if __name__ == '__main__':
>     dir = 'c:\\tmp\\scan\\'
>     for x in os.listdir(dir):
>         if x.split('.')[1] == 'jpg':
>             print x
>             get_calls(dir, x)


Hi Hector,


I looked into the PdfImagePlugin.py source from version 1.1.4 of the
Python Imaging Library,

    http://www.pythonware.com/products/pil/


And you appear to have run into a bug in PIL!  The code in question in
PIL/PdfImagePlugin.py is:

###
    if filter == "/ASCIIHexDecode":
        ImageFile._save(im, op, [("hex", (0,0)+im.size, 0, None)])
    elif filter == "/DCTDecode":
        ImageFile._save(im, op, [("jpeg", (0,0)+im.size, 0, im.mode)])
    elif filter == "/FlateDecode":
        ImageFile._save(im, op, [("zip", (0,0)+im.size, 0, im.mode)])
    elif filter == "/RunLengthDecode":
        ImageFile._save(im, op, [("packbits", (0,0)+im.size, 0, im.mode)])
    else:
        raise ValueError, "unsupported PDF filter"
###


That block is within the _save() function in PdfImagePlugin.py.  But
there's only one block of code that assigns to this 'filter' variable:

###
    if im.mode == "1":
        filter = "/ASCIIHexDecode"
        config = "/DeviceGray", "/ImageB", 1
    elif im.mode == "L":
        filter = "/DctDecode"
        # params = "<< /Predictor 15 /Columns %d >>" % (width-2)
        config = "/DeviceGray", "/ImageB", 8
    elif im.mode == "P":
        filter = "/ASCIIHexDecode"
        config = "/Indexed", "/ImageI", 8
    elif im.mode == "RGB":
        filter = "/DCTDecode"
        config = "/DeviceRGB", "/ImageC", 8
    elif im.mode == "CMYK":
        filter = "/DCTDecode"
        config = "/DeviceRGB", "/ImageC", 8
    else:
        raise ValueError, "illegal mode"
###


'filter' here has three possible values:

    ['/ASCIIHexDecode',
     '/DctDecode',
     '/DCTDecode']


Notice the case difference here between '/DctDecode' and '/DCTDecode'.
Python is case sensitive.  This is a bad sign.  *grin*


The block that we're running into problems with, the one that throws the
exception, checks for:

    ["/ASCIIHexDecode",
     "/DCTDecode",
     "/FlateDecode",
     "/RunLengthDecode"]


It doesn't handle "/DctDecode"!  Furthermore, there's no way 'filter' can
be '/FlateDecode' or '/RunLengthDecode', so there's some dead code here
too.
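
The mismatch can be checked mechanically by comparing the two lists as
sets.  This is only a sketch: the variable names are mine, and the values
are copied by hand from the two blocks quoted above, not read out of PIL
itself.

```python
# Values the mode-selection block can assign to 'filter'
# (copied by hand from the PdfImagePlugin.py excerpt above).
assigned = set(['/ASCIIHexDecode', '/DctDecode', '/DCTDecode'])

# Values the saving block actually checks for.
handled = set(['/ASCIIHexDecode', '/DCTDecode',
               '/FlateDecode', '/RunLengthDecode'])

# Assigned but never handled: these fall through to the ValueError.
unhandled = sorted(assigned - handled)

# Handled but never assigned: these branches can never run.
unreachable = sorted(handled - assigned)
```

Here 'unhandled' contains only the misspelled '/DctDecode', and
'unreachable' contains the filters that no image mode ever selects.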

There's definitely a bug here.  Send a holler out to the Python Imaging
Library folks.  *grin*

    http://mail.python.org/mailman/listinfo/image-sig

and get them to fix it so no one else runs into this.



In the meantime, my best guess right now to fix the bug is to modify
PdfImagePlugin.py and switch over the '/DctDecode' string to '/DCTDecode'
and see if that clears up the problem.  Unfortunately, I can't test this
hypothesis without a sample JPEG image.
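
If editing the installed PdfImagePlugin.py isn't convenient, a possible
stopgap follows from the mode table above: only mode "L" (grayscale)
images get the misspelled filter, while "RGB" images get '/DCTDecode',
which is handled.  The helper below is hypothetical (the name is mine) and
just as untested as the fix above; it assumes the misspelling is the only
problem, and it relies on the image object's convert() method.

```python
def avoid_gray_filter_bug(im):
    """Return an RGB copy of a mode "L" image; return other images unchanged.

    Assumption: 'im' is a PIL image, or any object with a 'mode'
    attribute and a convert(mode) method.
    """
    if im.mode == "L":
        # RGB images take the '/DCTDecode' branch, which _save() handles.
        return im.convert("RGB")
    return im
```

In get_calls() this would be called on the opened image just before
im.save(...).  The resulting PDF stores color data for a gray image, so
treat it as a temporary workaround and not as a fix.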





By the way: you may want to use os.path.join() to join together the directory
name with the file name.  The code in get_calls():

> def get_calls(dir, file):
>     im = Image.open(dir + file)
>     im.save(dir + '_' + file, 'PDF')

may not get the correct name of the directory, since the path separator
might be missing.

I can see that you have hardcoded the path separator embedded in the 'dir'
directory name within your main program, but if you are using get_calls()
from another program, then there's no guarantee that the directory name
ends with the path separator character.
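
Here is a minimal sketch of the path handling with os.path.join().  The
function name pdf_paths() is made up for illustration; it only builds the
two file names and leaves the Image.open() and im.save() calls to the
caller.

```python
import os

def pdf_paths(directory, filename):
    """Return (source, destination) paths for one image file.

    os.path.join() inserts the path separator only when it is missing,
    so 'directory' may be given with or without a trailing separator.
    """
    source = os.path.join(directory, filename)
    destination = os.path.join(directory, '_' + filename)
    return source, destination
```

With this, pdf_paths('scan', 'a.jpg') and pdf_paths('scan' + os.sep,
'a.jpg') return the same pair of names.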



Hope this helps!



