[Image-SIG] PIL and 16-bit image types -- an offer of help
zpincus at stanford.edu
Thu Nov 30 03:51:50 CET 2006
I am offering to implement fixes in PIL to make its handling of 16-
bit image types less vexing. Are the PIL maintainers interested in
such help? I have a few questions to ask, and then I could get to
work if some maintainer would give me a preliminary go-ahead.
THE FULL STORY:
I'm trying to use PIL for a scientific imaging library.
Unfortunately, many of the images I need to deal with are 16-bit
grayscale image files, an image mode that PIL does not deal well
with. Thus I will need to either write work-arounds in my code to fix
the problems, or fix PIL itself. I would prefer to do the latter.
I previously sent a patch to this list to allow for proper reading of
16-bit TIFF files; however, this patch is very simple-minded, and I
think that with a few more changes, PIL could support 16-bit
grayscale files in a much better way. However, there's a major
philosophical issue that needs to be addressed first!
Specifically, does mode 'I;16' mean '16-bit little-endian integers'
or '16-bit native integers'? In practice the code uses the former
definition; however, Imaging.h declares the latter to be the case.
The real issue is that all other multi-byte types like 'I' and 'F'
are stored as native byte order in memory, regardless of how they are
read in. Thus, both users and developers are abstracted from the
question of byte ordering. (Except that developers still need to care
about endian-ness at serialization time.)
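To make the byte-order question concrete, here is a minimal sketch using only Python's standard struct module (not PIL itself). The sample bytes are made up for illustration; the point is that the same two bytes decode to different values depending on whether they are read as little-endian, big-endian, or native order.

```python
import struct
import sys

# Two bytes as they might appear in a 16-bit grayscale file (made-up sample).
raw = b"\x01\x02"

little = struct.unpack("<H", raw)[0]  # explicit little-endian read
big = struct.unpack(">H", raw)[0]     # explicit big-endian read
native = struct.unpack("=H", raw)[0]  # whatever order this machine uses

print(hex(little), hex(big), hex(native), sys.byteorder)
```

On a little-endian machine the native value matches the little-endian one, and on a big-endian machine it matches the big-endian one, which is why a mode whose meaning is "native" and a mode whose meaning is "little-endian" only coincide on some hardware.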
However, 16-bit image types are not so insulated, which is what makes
them so vexing in PIL. This doubles the work of adding
16-bit compatibility to any image function, because you have to write
the compatibility code twice, once for each endian-ness. Also, writing a
function to deal with a particular byte ordering which may or may not
be native is both error-prone and inefficient.
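The alternative to writing everything twice can be sketched as follows: swap bytes once at load time, and let every later operation assume native order. This uses the standard array module rather than PIL, and the helper names to_native_u16 and invert are hypothetical, chosen only for this illustration.

```python
import array
import sys

def to_native_u16(data, file_byteorder):
    """Decode packed unsigned 16-bit samples into native-order values.

    file_byteorder is "little" or "big" and describes the bytes in `data`.
    """
    pixels = array.array("H")
    if pixels.itemsize != 2:
        raise RuntimeError("this sketch assumes a 2-byte unsigned short")
    pixels.frombytes(data)
    # Swap only when the file order differs from the machine's order.
    if file_byteorder != sys.byteorder:
        pixels.byteswap()
    return pixels

def invert(pixels):
    """An example operation written once, with no byte-order logic at all."""
    return array.array("H", (0xFFFF - p for p in pixels))

le_pixels = to_native_u16(b"\x01\x02\xff\x00", "little")
be_pixels = to_native_u16(b"\x01\x02\xff\x00", "big")
print(list(le_pixels), list(be_pixels), list(invert(le_pixels)))
```

With this arrangement only the loader knows about endian-ness; a function like invert needs a single implementation.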
The real problem is just one of nomenclature. Pack.c and Unpack.c
make a distinction between 'raw modes' like 'I;32' (which implicitly
means 32-bit little-endian) and normal user-level 'modes' like
'I' (which means 32-bit native-endian). However, the use of 'I;16' as
a user-level image mode has clouded the issue, because even at the user
level it means 'little endian'. These subtle differences in meaning
cause a lot of the 16-bit manipulation bugs that I've seen in PIL so
far.
I think that Imaging.h is correct in that 'I;16' ought to be treated
as native byte order when it is used as an image mode (just like 'I'
is). However, as a raw mode, 'I;16' needs to mean 'little-endian'
just as 'I;32' means 'little endian'. This change wouldn't be too
hard to make, and it would be (mostly) backwards compatible.
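The mode versus raw mode split described above can be sketched with two lookup tables. This is an illustration in plain Python, not PIL's actual Pack.c/Unpack.c tables: the dictionaries and the helpers unpack_pixel and store_pixel are hypothetical, and signedness is simplified.

```python
import struct

# Raw modes name a serialized layout, so each one has a fixed byte order.
RAW_MODE_FORMATS = {
    "I;16": "<H",   # 16-bit little-endian
    "I;16B": ">H",  # 16-bit big-endian
    "I;32": "<I",   # 32-bit little-endian
}

# Image modes describe in-memory storage, which is always native order.
MODE_FORMATS = {
    "I": "=i",      # 32-bit native-order integer
}

def unpack_pixel(raw_mode, data):
    """Decode one serialized pixel according to its raw mode."""
    return struct.unpack(RAW_MODE_FORMATS[raw_mode], data)[0]

def store_pixel(mode, value):
    """Encode one pixel the way an in-memory image of `mode` would hold it."""
    return struct.pack(MODE_FORMATS[mode], value)

value = unpack_pixel("I;16", b"\x34\x12")
print(hex(value), store_pixel("I", value))
```

Keeping the two tables disjoint is the point: a name appears either as a raw mode with a fixed order or as an image mode with native order, never as both.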
However, having one name for two different entities (a 'mode' with
native order and a 'raw mode' with little-endian order) is likely to
be very confusing and the source of future bugs. It seems like the
better solution would be to add a new '16-bit unsigned native byte
order' image type to PIL -- maybe 'S' for 'short' -- and reserve
'I;16' and 'I;16B' strictly for raw modes. The only problem would be
that this would break some older code that relied on these modes.
Is anyone interested in discussing these options (and several other
bugs in PIL's image packing and unpacking that I've discovered in
looking at the code)? I'm happy to take this on, since these are
changes I need to make anyway for my project and I'd rather see them
in the PIL trunk than in my own fork.
Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine