[Image-SIG] Corruption writing PNG data

Eric Soroos eric at soroos.net
Tue Aug 20 07:16:52 CEST 2013


Hi Ben.

(mirrored to the list, who knows when it'll get delivered)

I'm a maintainer of Pillow, an actively developed successor to PIL
that has some bug fixes and new features.
I've taken a quick look, and the bug appears to be in Pillow as well.

It appears that once the image gets over about 23 megapixels, at
8285x2780, the save fails. It works at 8284x2780, and also if the
subselected region is panned. I've traced it down to the calls to zlib:
Pillow/PIL calls it to compress the PNG-formatted scanlines, and
somewhere in there (but not at the end of the image) the compressed
data stream is getting corrupted.

However, there is a small workaround that appears to help. If a 
compress_level is passed into the save call, requesting anything from 0 
(no compression) to 9 (max), it saves properly. Passing in -1 (the 
default compression level, equivalent to 6) triggers the failure.

I've been testing it with the following.
(If ImageFile.MAXBLOCK is big enough, all of the image data lands in one
IDAT block, and we can test decompressing it with the plain zlib
decompression call. That part is commented out here.)


import zlib

import numpy
from PIL import Image, ImageFile

npz = numpy.load("data.npz")
imagedata = npz['arr_0']
palette = npz['arr_1']

Image.DEBUG = 1
#ImageFile.MAXBLOCK = 512*1024

print(imagedata.shape)

def test(p):
     i1 = imagedata[0:2780,0:p]
     im = Image.fromarray(i1, 'P')
     im.putpalette(palette)
     print (im)
     im.save('tmp.png',compress_level=9 )
     im2 = Image.open('tmp.png')
     print (im2)
     print ("Verify: %s" %im2.verify())
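     try:
         # verify() invalidates the image object, so reopen before load()
         im2 = Image.open('tmp.png')
         # Bare reference, has no effect; truncated-image loading stays off
         ImageFile.LOAD_TRUNCATED_IMAGES
         im2.load()
         print ("%s success" %p)
     except Exception:
         print ("FAIL: %s" %p)
         #raise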

     #with open('tmp.png','rb') as f:
     #    f.seek(821)#

     #    s = zlib.decompress(f.read(385128))
     #    print ("successful decompress")


test(8284)
test(8285)
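
The commented-out zlib check above depends on hard-coded offsets (821
and 385128) that only fit one particular output file. A rough sketch of
a more general check, using only the standard library: walk the PNG
chunks, join every IDAT payload, and hand the result to zlib.decompress.
The function names here are made up for illustration, make_chunk exists
only so the sketch can build a tiny test file, and chunk CRCs are not
verified.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (chunk_type, payload) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length


def inflate_idat(data):
    """Join all IDAT payloads and decompress them as a single zlib stream."""
    stream = b"".join(p for t, p in iter_chunks(data) if t == b"IDAT")
    return zlib.decompress(stream)


def make_chunk(ctype, payload):
    """Hypothetical helper: serialize one PNG chunk with its CRC."""
    crc = zlib.crc32(ctype + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)
```

Reading tmp.png in binary mode and passing the bytes to inflate_idat
should raise zlib.error on a corrupt stream, whatever ImageFile.MAXBLOCK
is set to, since a stream split across several IDAT chunks is rejoined
first.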

The error I'm getting: "zlib.error: Error -3 while decompressing data: 
invalid distances set" would indicate that the compressed data stream is 
either corrupt or was compressed with different settings, perhaps a 
larger compression window.

I'm hoping to find the actual bug here, rather than a vague workaround.

eric


On 07/26/2013 04:55 AM, Ben Taylor wrote:
> Hi all
>
> I've got some data that should be fine (came from a known-good netCDF
> file) but causes PIL to write an invalid image if it is used to save in
> PNG format (it's happy writing GIF format).
>
> The same code writes PNGs quite happily 99% of the time, but just
> occasionally we generate a netCDF that causes this bug - we don't know
> why. We're not sure if the problem is actually in PIL or if it might be
> in libpng. Anyone mind taking a look please?
>
> Test data at ftp://rsg.pml.ac.uk/rsg/benj/pil_problem/:
> data.npz - Numpy npz data with data that causes the issue
> working_data.npz - npz file from another source that works fine
> test_harness.py - Run this with the two test files to demonstrate the
> problem - the PNG generated from data.npz is corrupt.
>
> Test harness code below (just to demonstrate it's nothing complicated).
>
> TIA
> Ben
>
> #!/usr/bin/env python
>
> import numpy
> import Image
>
> def test(infile, prefix):
>
>     npz = numpy.load(infile)
>     imagedata = npz['arr_0']
>     palette = npz['arr_1']
>
>     out_image = Image.fromarray(imagedata, 'P')
>     out_image.putpalette(palette)
>
>     out_image.save(prefix + ".gif") # ok
>     out_image.save(prefix + ".png") # bust
> # end function
>
> test("data.npz", "broken")
> test("working_data.npz", "working")
>


