fromimage/imread segfaults on my images
Hi all,

I'm using SciPy to process some sparse binary images with a resolution of about 5000x5000 pixels. I try to load images like this:

    im = Image.open('/tmp/foo.pbm')
    print "Image loaded, size is %s and mode is %s." % (im.size, im.mode)
    arr = fromimage(im)

or

    arr = imread('/tmp/foo.pbm')

In either case, I get a segfault. The segfault is definitely in the SciPy code, rather than the PIL code, since PIL's Image.open runs fine.

I ran strace, and there's an attempt to mmap() a region as large as the image, one byte per pixel, immediately before the segfault:

    mmap(NULL, 21463040, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f278b5be000

The funny thing is... the imread() implementation in pylab *does* work, but it converts all images to RGB mode and reverses them vertically for no apparent reason. So I'd rather not use it.

Is this a known bug in SciPy? Any workarounds/fixes? I don't know any other good, reliable way to get a big image into an array :-(

Dan
On Mon, 08 Mar 2010 15:09:37 +0000, Daniel Lenski wrote:
I ran strace, and there's an attempt to mmap() a region as large as the image, one byte per pixel, immediately before the segfault:
mmap(NULL, 21463040, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f278b5be000
Just to clarify... this system is running Linux 2.6, amd64 build, and has 4 GB of RAM. So there's no reason why a 21 MB mmap would normally fail.

Dan
Daniel Lenski wrote:
I'm using SciPy to process some sparse binary images with a resolution of about 5000x5000 pixels. I try to load images like this:
    im = Image.open('/tmp/foo.pbm')
    print "Image loaded, size is %s and mode is %s." % (im.size, im.mode)
    arr = fromimage(im)
or
arr = imread('/tmp/foo.pbm')
where do fromimage and imread come from? "namespaces are one honking great idea".
In either case, I get a segfault. The segfault is definitely in the SciPy code, rather than the PIL code, since PIL's Image.open runs fine.
don't be so sure -- I think PIL uses lazy loading, so it doesn't actually read all the data with the open call, but rather when you try to do something with it.

I'd try making a few calls on the PIL image, and make sure it is what you expect. If it is, the easiest way to get it into a numpy array is:

    np.asarray(pil_image)

do you have a smaller image in the same format you can experiment with? That might make it easier to figure out.
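A minimal sketch of the check suggested here, assuming a current Pillow install (imported as `PIL`) and NumPy. The helper name `pil_to_array` is made up for illustration, and converting 1-bit images to 8-bit "L" mode before the array conversion is only an assumption about what avoids the bit-packed code path, not a confirmed fix for the segfault in this thread:

```python
import numpy as np
from PIL import Image


def pil_to_array(im):
    """Force PIL to decode the image, then hand it to NumPy.

    Hypothetical helper, not part of PIL or SciPy.
    """
    # Image.open() only reads the header; load() forces the pixel data
    # to be decoded, so any PIL-side failure shows up here.
    im.load()
    if im.mode == "1":
        # Assumption: go through 8-bit greyscale (values 0 or 255)
        # instead of converting the 1-bit image directly.
        im = im.convert("L")
    return np.asarray(im)
```

Typical use would be `arr = pil_to_array(Image.open('/tmp/foo.pbm'))`. If `im.load()` itself fails, the problem is on the PIL side rather than in the array conversion.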
I don't know any other good, reliable way to get a big image into an array :-(
if you really can't get PIL to work, pbm looks really simple:

http://netpbm.sourceforge.net/doc/pbm.html

read the header, then read the data with np.fromfile(), then convert to a uint8 array with np.unpackbits().

You could also take a look at MPL's imread() and see what it does -- I don't think MPL requires PIL, though I could be wrong.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
Chris.Barker@noaa.gov
On Mon, 08 Mar 2010 11:58:08 -0800, Christopher Barker wrote:
Daniel Lenski wrote:
I'm using SciPy to process some sparse binary images with a resolution of about 5000x5000 pixels. I try to load images like this:
    im = Image.open('/tmp/foo.pbm')
    print "Image loaded, size is %s and mode is %s." % (im.size, im.mode)
    arr = fromimage(im)
or
arr = imread('/tmp/foo.pbm')
where do fromimage and imread come from? "namespaces are one honking great idea".
Sorry! That'd be:

    from scipy.misc import fromimage, imread
don't be so sure -- I think PIL uses lazy loading, so it doesn't actually read all the data with the open call, but rather when you try to do something with it.
I'd try making a few calls on the PIL image, and make sure it is what you expect. If it is, the easiest way to get it into a numpy array is:
True, PIL lazy-loads the image. However, I've checked this, and I'm able to manipulate the image via PIL with no problems.
np.asarray(pil_image)
Yeah, I looked at the scipy.misc.fromimage code and this is all it does, basically.
do you have a smaller image in the same format you can experiment with? That might make it easier to figure out.
Yes, smaller images in the same format work fine!
I don't know any other good, reliable way to get a big image into an array :-(
if you really can't get PIL to work, pbm looks really simple:
http://netpbm.sourceforge.net/doc/pbm.html
read the header, then read the data with np.fromfile(), then convert to a uint8 array with np.unpackbits().
That's what I've ended up doing...
You could also take a look at MPL's imread() and see what it does -- I don't think MPL requires PIL, though I could be wrong.
MPL does require PIL for formats other than PNG.

Dan
On Mon, 08 Mar 2010 11:58:08 -0800, Christopher Barker wrote:
Daniel Lenski wrote:
I'm using SciPy to process some sparse binary images with a resolution of about 5000x5000 pixels. I try to load images like this:
    im = Image.open('/tmp/foo.pbm')
    print "Image loaded, size is %s and mode is %s." % (im.size, im.mode)
    arr = fromimage(im)
or
arr = imread('/tmp/foo.pbm')
where do fromimage and imread come from? "namespaces are one honking great idea".
Whoops, sorry. Those are from scipy.misc.
In either case, I get a segfault. The segfault is definitely in the SciPy code, rather than the PIL code, since PIL's Image.open runs fine.
don't be so sure -- I think PIL uses lazy loading, so it doesn't actually read all the data with the open call, but rather when you try to do something with it.
I'd try making a few calls on the PIL image, and make sure it is what you expect. If it is, the easiest way to get it into a numpy array is:
Everything works fine as long as I stick with PIL calls only... I can even dump the entire contents of the image to a string with the image's tostring() method. No problems there.
np.asarray(pil_image)
I looked at the code for scipy.misc.fromimage, and this is basically all it does. This is where the segfault occurs.
do you have a smaller image in the same format you can experiment with? That might make it easier to figure out.
Smaller images work fine. I haven't figured out exactly where the cutoff is, but 500x500 works fine, for instance.
I don't know any other good, reliable way to get a big image into an array :-(
if you really can't get PIL to work, pbm looks really simple:
http://netpbm.sourceforge.net/doc/pbm.html
read the header, then read the data with np.fromfile(), then convert to a uint8 array with np.unpackbits().
That's what I've ended up doing... rolled my own loadpbm() function.
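Dan's loadpbm() was not posted to the list. The following is only a sketch of the header-then-unpackbits approach described above, assuming the raw (P4) variant of PBM as documented at the netpbm link; `load_pbm_sketch` and `_read_token` are hypothetical names, and plain-text (P1) files are not handled:

```python
import numpy as np


def _read_token(f):
    """Read one whitespace-delimited header token, skipping '#' comments."""
    token = b""
    while True:
        c = f.read(1)
        if not c:
            raise ValueError("unexpected end of file in PBM header")
        if c == b"#":
            f.readline()  # discard the rest of the comment line
            continue
        if c.isspace():
            if token:
                return token  # the terminating whitespace byte is consumed
            continue
        token += c


def load_pbm_sketch(path):
    """Load a raw (P4) PBM file into a (height, width) uint8 array of 0/1."""
    with open(path, "rb") as f:
        if _read_token(f) != b"P4":
            raise ValueError("only raw (P4) PBM is handled in this sketch")
        width = int(_read_token(f))
        height = int(_read_token(f))
        # Each row is padded to a whole number of bytes.
        row_bytes = (width + 7) // 8
        packed = np.fromfile(f, dtype=np.uint8, count=row_bytes * height)
    if packed.size != row_bytes * height:
        raise ValueError("PBM raster is shorter than the header claims")
    # unpackbits is most-significant-bit first, matching the PBM layout.
    bits = np.unpackbits(packed.reshape(height, row_bytes), axis=1)
    return bits[:, :width]  # drop the padding bits at the end of each row
```

Note that in PBM a 1 bit means black, so the result may need inverting depending on what the downstream code expects.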
You could also take a look at MPL's imread() and see what it does -- I don't think MPL requires PIL, though I could be wrong.
MPL requires PIL to load and save images in formats /other than/ PNG.

Dan
Daniel Lenski wrote:
Everything works fine as long as I stick with PIL calls only... I can even dump the entire contents of the image to a string with the image's tostring() method. No problems there.
did you try passing that to numpy.fromstring? And what did it dump? single-bit images can be tricky.
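One way this suggestion might look today, assuming a current Pillow (where the dump method is spelled `tobytes()`) and using `np.frombuffer`, NumPy's replacement for binary-mode `np.fromstring`. For mode "1" images the dumped bytes are bit-packed, with each row padded to a whole byte, so they have to be unpacked before use; `bilevel_to_array` is a hypothetical helper name:

```python
import numpy as np
from PIL import Image


def bilevel_to_array(im):
    """Convert a mode "1" PIL image to a (height, width) uint8 array of 0/1."""
    if im.mode != "1":
        raise ValueError("this sketch only handles 1-bit (mode '1') images")
    width, height = im.size
    row_bytes = (width + 7) // 8  # rows are padded to whole bytes
    packed = np.frombuffer(im.tobytes(), dtype=np.uint8)
    # Eight pixels per byte, leftmost pixel in the most significant bit.
    bits = np.unpackbits(packed.reshape(height, row_bytes), axis=1)
    return bits[:, :width]  # drop the per-row padding bits
```

Comparing `len(im.tobytes())` against `width * height` is a quick way to see whether the dump is packed (about one eighth of the pixel count) or one byte per pixel.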
np.asarray(pil_image)
I looked at the code for scipy.misc.fromimage, and this is basically all it does. This is where the segfault occurs.
I don't think that code path has been highly tested -- and maybe never for 1-bit images -- it eventually will be deprecated in favor of the new buffer protocol anyway.
do you have a smaller image in the same format you can experiment with? That might make it easier to figure out.
Smaller images work fine. I haven't figured out exactly where the cutoff is, but 500x500 works fine, for instance.
I'd be surprised if size is the issue here -- your images just aren't that big (as you pointed out, the MPL functions worked even though they were converting to RGB). I'm just guessing that your problem had something to do with the bits->bytes conversion.
http://netpbm.sourceforge.net/doc/pbm.html
read the header, then read the data with np.fromfile(), then convert to a uint8 array with np.unpackbits().
That's what I've ended up doing... rolled my own loadpbm() function.
Then I guess you're done! have fun, sorry this was harder than it should have been.

-Chris