[Numpy-discussion] Overloading numpy's ufuncs for better type coercion?

Christopher Barker Chris.Barker at noaa.gov
Wed Jul 22 13:54:04 EDT 2009


Hans Meine wrote:
> In [3]: numpy.add(a, a, numpy.empty((1, ), dtype = numpy.uint32))
> Out[3]: array([144], dtype=uint32)

yes, it sure would be nice to fix this...

> one will often end up with uint8 arrays which cannot be passed 
> into many algorithms without an explicit conversion.  However, is this really 
> a bad problem?  For example, the conversion would typically have to be 
> performed only once (after loading), no?  Then, why not simplify things 
> further by adding a dtype= parameter to importImage()?  This could even 
> default to float32 then.

For VIGRA specifically, this sounds like a fine way to go. However, for 
the broader numpy case:

I want to add two uint8 arrays, and put the results into an int32 array 
(for instance). As pointed out, this doesn't work:

In [9]: a1 = np.array((200,), dtype= np.uint8)
In [10]: a2 = np.array((250,), dtype= np.uint8)

In [11]: a1 + a2
Out[11]: array([194], dtype=uint8)

As pointed out by others -- this is the "right" behavior - we really 
don't want upcasting without asking for it.

However, as above:
In [15]: np.add(a1, a2, np.empty(a1.shape, dtype = np.int32))
Out[15]: array([194])

I am asking for upcasting, so it's too bad I don't get it. The solution 
is to upcast ahead of time:

In [17]: np.add(a1.astype(np.int32), a2, np.empty(a1.shape, dtype=np.int32))
Out[17]: array([450])

or simply:

In [18]: a1.astype(np.int32) +a2
Out[18]: array([450])
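
For anyone who wants to reproduce this outside an interactive session, here is a minimal sketch of the three cases above as a plain script. The variable names wrapped, out and upcast are just illustrative, and I am assuming a reasonably current numpy; the values in the comments are the ones from the session above.

```python
import numpy as np

a1 = np.array((200,), dtype=np.uint8)
a2 = np.array((250,), dtype=np.uint8)

# Plain addition stays in uint8, so 450 wraps around modulo 256.
wrapped = a1 + a2            # array([194], dtype=uint8)

# Supplying an int32 output array does not change the computation:
# the sum is still done in uint8 and only then stored into `out`.
out = np.empty(a1.shape, dtype=np.int32)
np.add(a1, a2, out)          # out now holds 194, not 450

# Upcasting one operand first makes the addition itself run in int32.
upcast = a1.astype(np.int32) + a2   # array([450], dtype=int32)

print(wrapped, out, upcast)
```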


Easy enough. The one downside is that an extra temporary in the larger 
type is needed (but only one). That could be an issue when working with 
large arrays, which I suspect is the case when these issues come up -- 
no one uses the np.add() notation unless they are trying to manage memory 
more carefully.

Another way to write this is:

In [19]: a3 = a1.astype(np.int32)

In [20]: a3 += a2

In [21]: a3
Out[21]: array([450])

which I think avoids the extra temporary.
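
A sketch of the same idea in np.add() notation, for the memory-conscious case. The first variant writes the sum back into the already-upcast array. The second assumes a numpy version whose ufuncs accept a dtype= keyword (I have not checked which release introduced it), and the name result is just illustrative. As far as I understand, neither variant needs a second full-size int32 array.

```python
import numpy as np

a1 = np.array((200,), dtype=np.uint8)
a2 = np.array((250,), dtype=np.uint8)

# Variant 1: upcast once, then accumulate in place (same as a3 += a2).
a3 = a1.astype(np.int32)
np.add(a3, a2, out=a3)       # a3 is now array([450], dtype=int32)

# Variant 2 (assumes ufuncs accept dtype=): request that the addition
# be carried out in int32 and write straight into a preallocated array.
result = np.empty(a1.shape, dtype=np.int32)
np.add(a1, a2, out=result, dtype=np.int32)   # result holds 450

print(a3, result)
```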


This is a pretty special case, and there are ways to accomplish what is 
needed. I suspect that's why no one has "fixed" this yet.

-Chris





-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


