Hello, what's the current status of numpy support for loading bit arrays?
I'm currently unable to correctly load black and white (1-bit) TIFF images. Code example follows:
    from PIL import Image
    import numpy
    from matplotlib import pyplot

    img = Image.open('oi-00.tiff')
    a = numpy.array(img)
^ does not work for 1-bit TIFF images
The PIL source shows that it incorrectly uses typestr == '|b1'. I tried changing this to '|t1', but I get:
TypeError: data type "|t1" not understood
My goal is to make the above code work for black and white TIFF images the same way it works for grayscale images. Any help?
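For reference, a minimal sketch of how numpy itself treats the two type strings mentioned above. It only exercises numpy.dtype, not PIL, and the variable names are made up for illustration. As far as I can tell, '|b1' means a boolean stored in one full byte per element, while the 't' (bit field) code is not accepted by numpy at all:

```python
import numpy as np

# '|b1' is numpy's boolean type: one BYTE per element, not one bit.
bool_dtype = np.dtype('|b1')
print(bool_dtype, bool_dtype.itemsize)

# The bit-field code 't' is rejected by numpy, which matches the
# TypeError quoted in the message above.
try:
    np.dtype('|t1')
    bitfield_supported = True
except TypeError as exc:
    bitfield_supported = False
    print(exc)
```

So swapping the type string alone probably cannot work; the packed bits would have to be expanded to one byte per pixel before numpy can use them.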
Any help with this problem?
Does using pyplot.imread work?
Paul, yes, imread() worked for reading the black and white TIFF. The situation has improved, but now there seems to be a problem with the color map. Example code:
    #!/usr/bin/env python3
    import numpy
    from matplotlib import pyplot, cm

    img = pyplot.imread('oi-00.tiff')
    pyplot.imshow(img)
    pyplot.colorbar()
    pyplot.show()
The code can open both 1-bit and 8-bit images, but only the 8-bit image is drawn with the colormap colors. The 1-bit image is drawn in plain black and white.
The questions:
1) Should Image.open() behave like pyplot.imread()? Is this a bug in PIL?
2) Why isn't the colormap working with black and white images?
What kind of array is "img"? What is its dtype and shape?
plt.imshow() will use matplotlib's default colormap if the given array is just 2D. But if it is 3D (a 2D array of RGB[A] channels), then it will forego the colormap and use the channel values directly as the colors. It knows nothing of any colormap contained in the TIFF.
Ben Root
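A small sketch of what the 2D versus 3D rule above means in practice. If imread() hands back an RGB[A] array for an image that is really black and white, keeping a single channel gives a 2D array that imshow() will colormap. The helper name as_2d is hypothetical, and the code assumes the R, G and B channels are identical, which should hold for a black and white image but is not checked here:

```python
import numpy as np

def as_2d(img: np.ndarray) -> np.ndarray:
    """Return a 2D array so that imshow() applies a colormap to it."""
    if img.ndim == 3:
        # Assumption: R, G and B are equal, so the first channel
        # carries all the information; alpha is discarded.
        return img[..., 0]
    return img

# Synthetic stand-in for a 1-bit image that was loaded as RGBA:
# top half white, bottom half black, fully opaque.
rgba = np.zeros((256, 256, 4), dtype=np.uint8)
rgba[..., 3] = 255
rgba[:128, :, :3] = 255

flat = as_2d(rgba)
print(rgba.shape, flat.shape)
```

Passing flat instead of the RGBA array to pyplot.imshow(), followed by pyplot.colorbar(), should then draw the image with the default colormap, the same way the grayscale image is drawn.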
For 1 bit images, the resulting array has shape (256, 256, 4). For grayscale images, the shape is (256, 256). So the image seems to have been loaded as a color image.
In any case, I think the result is unexpected. PIL loads garbage from memory when loading black and white images, because it reports the wrong buffer size. matplotlib loads the black and white image correctly, but stores it in a 3D array.
What behavior is unexpected? For the (256, 256) images, matplotlib applies its default colormap to the grayscale data (jet in v1.5 and earlier, viridis from v2.0 on). The numpy array as loaded from PIL will never carry any additional information that came from the TIFF.
As for PIL, it will return an RGB[A] array if there is colormap data in the TIFF. If there is no colormap specified in the TIFF, it'll give you a simple 2D array. Now, maybe you'd like it to always return an RGB[A] array, but without a colormap in the TIFF, it makes sense to return the data as-is. This makes sense for people treating the TIFF as a data format rather than a visualization data format.
Ben Root
I agree with everything, but the unexpected behaviors I mean are something else.
The TIFF images have no colormap. The colormap that I'm referring to is the GUI colormap, used by matplotlib to draw the image (imshow parameter cmap).
The problematic image format is the black and white 1-bit TIFF format. It is a bit array format: all bits are packed in sequence, eight pixels per byte.
PIL passes this kind of image to numpy through the __array_interface__ getter, which returns an array description with shape = (256, 256), type string "|b1" and an 8192-byte data buffer (256 * 256 * 1 bit). This description is invalid: "|b1" means one byte per element, so numpy reads 65536 bytes from memory and runs past the end of the buffer (even though it does not crash). This is unexpected #1.
pyplot.imread(), when loading an 8-bit grayscale image, creates an array of shape (256, 256). When loading a 1-bit black and white image, it creates an array of shape (256, 256, 4) by first converting the black and white image to RGBA. This difference between grayscale and black and white is unexpected #2.
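To make the size mismatch concrete, here is a rough sketch of expanding packed 1-bit data into the one-byte-per-pixel layout that "|b1" actually describes. It uses only numpy; the function name unpack_1bit is made up, and it assumes MSB-first bit order with each row padded to a whole byte. It also ignores whether a set bit means black or white, which depends on the file:

```python
import numpy as np

def unpack_1bit(packed: bytes, height: int, width: int) -> np.ndarray:
    """Expand packed 1-bit pixel rows into a (height, width) bool array."""
    row_bytes = (width + 7) // 8  # assumption: rows padded to whole bytes
    if len(packed) != height * row_bytes:
        raise ValueError("buffer size does not match a packed 1-bit image")
    rows = np.frombuffer(packed, dtype=np.uint8).reshape(height, row_bytes)
    # unpackbits is MSB-first by default: bit 7 of each byte comes first.
    bits = np.unpackbits(rows, axis=1)
    # Drop any padding bits at the end of each row.
    return bits[:, :width].astype(bool)

# A 256 x 256 test pattern of alternating set and clear pixels.
packed = bytes([0b10101010]) * (256 * 256 // 8)
image = unpack_1bit(packed, 256, 256)
print(len(packed), image.shape, image.nbytes)
```

The packed buffer holds 8192 bytes while the expanded boolean array needs 65536, which is exactly the gap between what PIL provides and what numpy tries to read.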
