[Image-SIG] PIL, Python and memory issues.... HELP!

Fredrik Lundh fredrik@pythonware.com
Wed, 21 Apr 1999 12:15:59 +0200


Kevin wrote:
> Where are the memory bottlenecks likely to be:  in Python itself,
> or in PIL?  I'm finding that you basically need to be able to hold the
> entire image in RAM, plus OS, etc.  It doesn't seem to be effectively
> using the virtual memory on my Win98 or NT test machines, or there's
> an arbitrary memory limit somewhere in the code.

PIL copies all image data into an in-memory image, except
under a few special circumstances (more on those later).
it allocates one byte per pixel for "1", "L", and "P" images,
and four bytes per pixel for all other modes.
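
to get a feeling for the numbers, here is a rough sketch of that rule. the helper name image_memory is made up for this example and is not part of PIL, and the figure ignores any per-image bookkeeping overhead:

```python
# modes stored with one byte per pixel, per the rule above;
# every other mode is assumed to take four bytes per pixel
ONE_BYTE_MODES = ("1", "L", "P")

def image_memory(width, height, mode):
    """Rough size in bytes of the pixel storage for one image."""
    if mode in ONE_BYTE_MODES:
        bytes_per_pixel = 1
    else:
        bytes_per_pixel = 4
    return width * height * bytes_per_pixel

print(image_memory(512, 512, "RGB"))      # a small RGB image
print(image_memory(10000, 10000, "RGB"))  # a large scan
```

note that an "RGB" image costs four bytes per pixel in memory under this rule, even though the file stores three.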

so I suspect the problem is on the operating system side
(in my experience, most operating systems tend to give up
when a single process attempts to grow much larger than
the physical memory -- and to thrash heavily long before
that...)

> Anyone have suggestions for using Python/PIL with such large images,
> short of parsing the files pixel for pixel (or line by line)?  I'd hate to
> have to start over again in C, because Python is so convenient for this
> type of thing (if not amazingly fast).

well, here's a trick that might work for you.  try this:

>>> import Image
>>> i = Image.open(<your file>)
>>> i.size
(512, 512)
>>> i.tile
[('raw', (0, 0, 512, 512), 128, ('RGB', 0, 1))]

the tile attribute contains a list of "tile descriptors", which
are used to load the image from file.

-- the first descriptor item should be "raw" -- if it isn't, the file
is compressed, and can most likely not be read in pieces.

-- the second item is the tile extent (a rectangle)

-- the third item is the offset from the start of the file to the
data for that tile.

-- the final item is a tuple of arguments to the decoder (for the
"raw" decoder: the raw mode, the stride, and the orientation).
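
the first check above can be sketched as a small helper. the name readable_in_pieces is invented for this example, and it only assumes that each descriptor is a sequence with the decoder name first, as in the listing above:

```python
def readable_in_pieces(tile):
    """True if every tile descriptor uses the "raw" decoder."""
    # an empty tile list means there is nothing left to load from file
    return len(tile) > 0 and all(d[0] == "raw" for d in tile)

tile = [("raw", (0, 0, 512, 512), 128, ("RGB", 0, 1))]
decoder, extent, offset, args = tile[0]  # the four descriptor items
print(readable_in_pieces(tile), decoder, extent, offset, args)
```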

the interesting thing is that you can change these parameters
just after you've opened the file.  for example, to read only the
first 128 lines of this file, do as follows:

>>> i.size = (512, 128)
>>> i.tile = [('raw', (0, 0, 512, 128), 128, ('RGB', 0, 1))]

>>> i.load()

(the call to load explicitly reads the data from disk)

to read the following 128 lines, you must open the file
again and also change the offset, so that it skips the
128-byte header plus the first 128 lines (3*512 bytes each):

>>> i = Image.open(...)
>>> i.size = (512, 128)
>>> i.tile = [('raw', (0, 0, 512, 128), 128 + 128*(3*512), ('RGB', 0, 1))]

and so on.  writing a small loop to do this shouldn't
be that difficult.
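
one possible sketch of the bookkeeping for such a loop follows. the helper strip_tiles is made up for this example (it is not a PIL function); it only computes the size and tile values, and assumes uncompressed rows stored top to bottom with no padding between lines:

```python
def strip_tiles(width, height, offset, rawmode, bytes_per_pixel, strip_height):
    """Yield (size, tile) pairs, one per horizontal strip of a raw image.

    offset is the file position of the first pixel row (128 in the
    example above); bytes_per_pixel refers to the file, not to memory.
    """
    bytes_per_line = width * bytes_per_pixel
    for top in range(0, height, strip_height):
        lines = min(strip_height, height - top)  # last strip may be shorter
        size = (width, lines)
        tile = [("raw", (0, 0, width, lines),
                 offset + top * bytes_per_line,
                 (rawmode, 0, 1))]
        yield size, tile

for size, tile in strip_tiles(512, 512, 128, "RGB", 3, 128):
    # for each strip: open the file again, assign these values to
    # i.size and i.tile as shown above, call i.load(), then process it
    print(size, tile)
```

the second strip comes out with offset 128 + 128*(3*512) = 196736, which matches the descriptor written by hand above.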

...

David wrote:
> I also wonder if maybe a hack to PIL which used mmap() would be a
> solution which would make it VM-friendly.  If so, it might be easier
> than to recode it in C (and possible would speed up PIL for 'smaller'
> images as well. 

PIL already contains such a hack, which is enabled under
these circumstances:

    -- you're on win95 or winNT (haven't had time
    to finish the unix implementation yet, sorry).

    -- you're opening an image using the "raw"
    decoder.

    -- the image has an internal format which is
    compatible with PIL's internal pixel layout for
    that mode.

or in other words, the image must be "L", "P", "RGBX",
"RGBA", or "CMYK", and be stored in an uncompressed
format.
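
as a rough restatement of those three conditions (this is not the check PIL itself performs; the function name and the use of sys.platform are assumptions made for this example):

```python
import sys

# modes listed above as compatible with the internal pixel layout
MAPPABLE_MODES = ("L", "P", "RGBX", "RGBA", "CMYK")

def may_be_mapped(decoder, mode, platform=sys.platform):
    """True if all three conditions described above hold."""
    return (platform == "win32"       # windows only, for now
            and decoder == "raw"      # uncompressed data
            and mode in MAPPABLE_MODES)

print(may_be_mapped("raw", "RGBA", "win32"))
print(may_be_mapped("raw", "RGB", "win32"))
```

a plain "RGB" file does not qualify, since it stores three bytes per pixel while the in-memory image uses four.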

...

finally, I should mention that future versions of PIL (post
1.0) will include much better support for huge images,
among other things.  in that process, we're moving to a
more Ghostscript-like licensing strategy.  more on this
later.

Cheers /F