[SciPy-User] ndimage filters overflow for default image format (uint8)

Tue Aug 4 05:43:39 EDT 2009

Hi Tony,
> I've been struggling to get ndimage filters to give the expected  
> output. It turns out that my images were getting imported as unsigned  
> integers (uint8) using both chaco's ImageData importer and PIL (with  
> an additional call to numpy.asarray). As a result, edge filters--- 
> which often return negative values---were returning overflowed arrays.  
>   
As you may know, grayscale images are usually stored as uint8 to keep the
memory footprint low.
So if you intend to perform calculations on them, you need to be
careful.

There are two competing goals:
- getting the best precision you can
- lowering the memory footprint as much as you can
> Here's a simple example using the correlate1d function (which is used  
> by the edge filters):
>
>  >>> im = np.array([1, 2, 3, 4, 4, 3, 2, 1], dtype=np.uint8)
>  >>> weights = np.array([-1, 0, 1])
>  >>> print ndimage.correlate1d(im, weights)
> [  1   2   2   1 255 254 254 255]
>   
So, if you need precision first and are not concerned about memory, you
can do something like:
>>> im = np.array([1, 2, 3, 4, 4, 3, 2, 1], dtype=np.uint8)
>>> weights = np.array([-1, 0, 1])
>>> z = ndimage.correlate1d(np.asarray(im, dtype=np.int32), weights)
>>> print z
[ 1  2  2  1 -1 -2 -2 -1]

and 

>>> print z.dtype
int32

I chose int32, but I could just as well have chosen double, float,
or even int16, as long as the type is able to store the result of the filter.
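
As a rough sketch of the two ways to get a signed result: either widen the
input before filtering, or keep the uint8 input and request the result type
through the `output` argument of correlate1d. This assumes a SciPy version in
which `output` accepts a dtype; int16 is picked here only because it is the
smallest type that holds the result.

```python
import numpy as np
from scipy import ndimage

im = np.array([1, 2, 3, 4, 4, 3, 2, 1], dtype=np.uint8)
weights = np.array([-1, 0, 1])

# Option 1: widen the input first; the result inherits the wider dtype.
z_cast = ndimage.correlate1d(im.astype(np.int16), weights)

# Option 2: leave the input as uint8 and ask for a signed result
# through the `output` argument (assumed here to accept a dtype).
z_out = ndimage.correlate1d(im, weights, output=np.int16)

print(z_cast.dtype, z_cast)
print(z_out.dtype, z_out)
```

Option 2 avoids making a widened copy of the whole input image, which may
matter for large images.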

Now, if you are more concerned about memory footprint than precision,
you can reconsider your example:

 >>> im = np.array([1, 2, 3, 4, 4, 3, 2, 1], dtype=np.uint8)
 >>> weights = np.array([-1, 0, 1])
 >>> z = ndimage.correlate1d(im, weights)
 >>> print z
[  1   2   2   1 255 254 254 255]

 >>> z.dtype = np.int8
 >>> print z
[ 1  2  2  1 -1 -2 -2 -1]

Note that changing the dtype doesn't reallocate or copy the data; it is just
another way to view the same bytes.
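
A small sketch of that point, using ndarray.view instead of assigning to
z.dtype (view is a swapped-in alternative that leaves z itself untouched).
The wrapped values are typed in by hand so the snippet needs only NumPy.

```python
import numpy as np

# The wrapped uint8 result from the example above, typed in by hand.
z = np.array([1, 2, 2, 1, 255, 254, 254, 255], dtype=np.uint8)

# view() reinterprets the same bytes as signed 8-bit integers.
z_signed = z.view(np.int8)

print(z_signed)
print(np.shares_memory(z, z_signed))  # both arrays use one buffer

# Writing through one array is visible through the other.
z_signed[0] = -1
print(z[0])
```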

OK, one may argue that changing the dtype is not enough, and that's right.
With these weights the filtered data can range from
-255 up to 255, but int8 can only represent integers from -128 to 127. So what then?

 >>> im = np.array([1, 2, 3, 4, 4, 3, 2, 1], dtype=np.uint8)
 >>> weights = np.array([-.5, 0, .5])
 >>> z = ndimage.correlate1d(im, weights)
 >>> z.dtype = np.int8
 >>> print 2*z
[ 0  2  2  0  0 -2 -2  0]

So now the filtered data can be stored as int8, but you lose half the precision (because the filter coefficients were divided by two). Most
 applications won't suffer from this limited precision.
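
To make the trade-off concrete, here is a sketch that uses plain NumPy
differences instead of correlate1d and only looks at interior points. The
pixel row is a made-up worst case, and truncation toward zero is an assumption
about how the halved result gets rounded.

```python
import numpy as np

# Hypothetical worst-case pixel row, chosen for illustration.
im = np.array([0, 10, 255, 7, 0], dtype=np.uint8)

wide = im.astype(np.int16)
# Same as weights [-1, 0, 1] at the interior points.
full = wide[2:] - wide[:-2]
# Same as weights [-.5, 0, .5], truncated toward zero to fit int8.
half = np.trunc(full / 2.0).astype(np.int8)

print(full)                       # needs more than 8 bits
print(half)                       # fits in int8
print(2 * half.astype(np.int16))  # odd values have lost their last bit
```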

In my opinion, as a programmer you have to know the range of the filtered data. Then you just have to decide whether precision matters or not.
You can't expect correlate1d to guess the output range you intend. Just a thought: are you processing VGA images in real time,
or 10 Mpx satellite images?
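
One way to work out that range ahead of time is to take the worst case over
the kernel weights. The helpers output_range and smallest_signed_dtype below
are hypothetical names written for this sketch; they are not part of NumPy or
SciPy.

```python
import numpy as np

def output_range(weights, in_min=0, in_max=255):
    """Worst-case (lowest, highest) output for inputs in [in_min, in_max]."""
    w = np.asarray(weights, dtype=float)
    pos = w[w > 0].sum()  # total positive weight
    neg = w[w < 0].sum()  # total negative weight (a negative number)
    lo = pos * in_min + neg * in_max
    hi = pos * in_max + neg * in_min
    return lo, hi

def smallest_signed_dtype(lo, hi):
    """Smallest signed integer dtype covering [lo, hi], else float64."""
    for dt in (np.int8, np.int16, np.int32, np.int64):
        info = np.iinfo(dt)
        if info.min <= lo and hi <= info.max:
            return dt
    return np.float64

lo, hi = output_range([-1, 0, 1])
print(lo, hi)
print(smallest_signed_dtype(lo, hi).__name__)
```

For uint8 input and the weights [-1, 0, 1] this gives the range -255 to 255,
so int16 is the smallest signed type that holds every possible result.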

Best regards,

Gilles Rochefort

> This result is not a huge surprise *if* you expect the output of the  
> filter to be the same dtype as your input image, but this wasn't  
> apparent to me until I really dug into the problem. (I'd expect the  
> same dtype for filters that don't return values outside the range of  
> input values, e.g. a median filter).
>
> I realize guessing the "best" output format is difficult (b/c of  
> memory considerations and variable input formats), but it seems like  
> the current behavior would cause problems for a lot of people. Maybe  
> there could be a dtype parameter to specify the output dtype. For  
> filters that can return values outside the input range (e.g < 0, >  
> 255), the default dtype could be int32. This of course assumes the  
> input is uint8, but maybe that's a good assumption since that seems to  
> be the default for PIL. Or there's probably a smarter way of choosing  
> the output format.
>
> I realize my suggestion could cause more problems than it's worth. If  
> so, maybe just adding some very prominent warnings in the filter  
> docstrings would do the trick.
>
> Best,
> -Tony
>
> PS. Sorry if this post shows up twice, I've been having trouble  
> posting with my other email address.