Numeric/numarray compatibility issue

Following a bug report concerning ScientificPython with numarray, I noticed an incompatibility between Numeric and numarray, and I am wondering if this is intentional.
In Numeric, the result of a comparison operation is an integer array. In numarray, it is a Bool array. Bool arrays seem to behave like Int8 arrays when arithmetic operations are applied. The net result is that
    print n.add.reduce(n.greater(n.arange(128), -1))
yields -128, which is not what I would expect.
I can see two logically coherent points of view:
1) The Numeric view: comparisons yield integer arrays, which may be used freely in arithmetic.
2) The "logician's" view: comparisons yield arrays of boolean values, on which no arithmetic is allowed at all, only logical operations.
The first approach is a lot more pragmatic, because there are a lot of useful idioms that use the result of comparisons in arithmetic, whereas an array of boolean values cannot be used for much else than logical operations.
And now for my pragmatic question: can anyone come up with a solution that will work under both Numeric and numarray, won't introduce a speed penalty under Numeric, and won't leave the impression that the programmer had had too many beers? There is the quick hack
    print n.add.reduce(1*n.greater(n.arange(128), -1))
but it doesn't satisfy the last two criteria.
Konrad.
--
Konrad Hinsen
Laboratoire Leon Brillouin, CEA Saclay,
91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: khinsen@cea.fr
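
The -128 is what one would expect if the sum is accumulated in a signed 8-bit integer: 128 true values add up to 128, which lies outside the range -128..127 and wraps around. A minimal sketch in plain Python, assuming that this is the mechanism at work; it uses the standard ctypes module to emulate an 8-bit accumulator, it is not numarray's code, and the helper name reduce_add_int8 is made up for the illustration:

```python
import ctypes

def reduce_add_int8(values):
    """Sum values with an accumulator truncated to a signed 8-bit integer."""
    total = 0
    for v in values:
        # c_int8 keeps only the low 8 bits, as a fixed-width accumulator would
        total = ctypes.c_int8(total + int(v)).value
    return total

# Stand-in for the comparison result of arange(128) > -1: 128 true values
flags = [i > -1 for i in range(128)]

print(reduce_add_int8(flags))  # -128: the 8-bit accumulator wrapped around
print(sum(flags))              # 128: Python integers do not overflow
```

With 127 true values the emulated sum would still be correct, so the problem only shows up once the count exceeds the Int8 range.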

On Mar 3, 2005, at 12:31 PM, konrad.hinsen@laposte.net wrote:
> Following a bug report concerning ScientificPython with numarray, I noticed an incompatibility between Numeric and numarray, and I am wondering if this is intentional.
> In Numeric, the result of a comparison operation is an integer array. In numarray, it is a Bool array. Bool arrays seem to behave like Int8 arrays when arithmetic operations are applied. The net result is that
>     print n.add.reduce(n.greater(n.arange(128), -1))
> yields -128, which is not what I would expect.
> I can see two logically coherent points of view:
> 1) The Numeric view: comparisons yield integer arrays, which may be used freely in arithmetic.
> 2) The "logician's" view: comparisons yield arrays of boolean values, on which no arithmetic is allowed at all, only logical operations.
> The first approach is a lot more pragmatic, because there are a lot of useful idioms that use the result of comparisons in arithmetic, whereas an array of boolean values cannot be used for much else than logical operations.
> And now for my pragmatic question: can anyone come up with a solution that will work under both Numeric and numarray, won't introduce a speed penalty under Numeric, and won't leave the impression that the programmer had had too many beers? There is the quick hack
>     print n.add.reduce(1*n.greater(n.arange(128), -1))
> but it doesn't satisfy the last two criteria.
First of all, isn't the current behavior a little similar to Python in that Python Booleans aren't pure either (for backward compatibility purposes)?
I think this has come up in the past, and I thought that one possible solution was to automatically coerce all integer reductions and accumulations to Int32 to avoid overflow issues. That had been discussed before and apparently many preferred avoiding automatic promotion (the reductions allow specifying a new type for the reduction, but I don't believe that helps your specific example for code that works for both). Using .astype(Int32) should work for both, right? (or is that too much of a speed hit?)
But it is a fair question to ask if arithmetic operations should be allowed on booleans without explicit casts.
Perry

Perry Greenfield wrote:
> On Mar 3, 2005, at 12:31 PM, konrad.hinsen@laposte.net wrote:
>>     print n.add.reduce(n.greater(n.arange(128), -1))
>> yields -128, which is not what I would expect.
> I think this has come up in the past,
It has. I think I commented on it some time back, and the consensus was that, as Perry suggested, using .astype(Int32) is the best fix. I think the fact that arithmetic is allowed on booleans without casts is an oversight; standard Python 2.3 allows you to do True+False. Fortran would never let you do .TRUE.+.FALSE. :-)
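
For reference, True+False is accepted in Python because bool is a subclass of int. A quick check in plain Python, which does not touch Numeric or numarray and only illustrates the scalar behaviour mentioned above:

```python
# bool is a subclass of int, so True and False take part in integer arithmetic
print(issubclass(bool, int))     # True
print(True + False)              # 1
print(sum([True, True, False]))  # 2
```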

On 04.03.2005, at 16:44, Perry Greenfield wrote:
> First of all, isn't the current behavior a little similar to Python in that Python Booleans aren't pure either (for backward compatibility purposes)?
Possibly, but the use of boolean scalars and boolean arrays is very different, so that's not necessarily the model to follow.
> apparently many preferred avoiding automatic promotion (the reductions allow specifying a new type for the reduction, but I don't believe that helps your specific example for code that works for both). Using .astype(Int32)
Right, because it doesn't work with Numeric.
> should work for both, right? (or is that too much of a speed hit?) But it is a
Yes, but it costs both time and memory. I am more worried about the memory, since this is one of the few operations that I do mostly with big arrays. Under Numeric, this doubles memory use, costs time, and makes no difference for the result. I am not sure that numarray compatibility is worth that much for me (OK, there is a dose of laziness in that argument as well).
> fair question to ask if arithmetic operations should be allowed on booleans without explicit casts.
What is actually the difference between Bool and Int8?
On 04.03.2005, at 18:27, Stephen Walton wrote:
> It has. I think I commented on it some time back, and the consensus was that, as Perry suggested, using .astype(Int32) is the best fix. I think the fact that arithmetic is allowed on booleans without casts is an oversight; standard Python 2.3 allows you to do True+False. Fortran would never let you do .TRUE.+.FALSE. :-)
I am in fact not convinced that adding booleans to Python was a very good idea, for exactly that reason: they try to be both booleans and compatible with integers.
Konrad.

On Mar 4, 2005, at 1:24 PM, konrad.hinsen@laposte.net wrote:
> On 04.03.2005, at 16:44, Perry Greenfield wrote:
>> First of all, isn't the current behavior a little similar to Python in that Python Booleans aren't pure either (for backward compatibility purposes)?
> Possibly, but the use of boolean scalars and boolean arrays is very different, so that's not necessarily the model to follow.
No, but the fact that some people know that arithmetic can be done with Python Booleans may lead them to think the same should be possible with Boolean arrays (not that that should be the sole criterion).
>> for both). Using .astype(Int32)
> Right, because it doesn't work with Numeric.
>> should work for both, right? (or is that too much of a speed hit?) But it is a
> Yes, but it costs both time and memory. I am more worried about the memory, since this is one of the few operations that I do mostly with big arrays. Under Numeric, this doubles memory use, costs time, and makes no difference for the result. I am not sure that numarray compatibility is worth that much for me (OK, there is a dose of laziness in that argument as well).
Hmmm, I'm a little confused here. If the overflow issue is what you are worried about, then use of Int8 for boolean results would still be a problem here. Since Numeric is already likely generating Int32 from logical ufuncs (Int actually), the use of astype(Int) is little different than many of the temporaries that Numeric creates in expressions. I find it hard to believe that this is a make or break issue for Numeric users since it typically generates more temporaries than does numarray.
>> fair question to ask if arithmetic operations should be allowed on booleans without explicit casts.
> What is actually the difference between Bool and Int8?
I'm not sure I remember all the differences (Todd can add to this if he remembers better). Booleans are treated differently as array indices than Int8 arrays are. The machinery of generating Boolean results is different in that it forces results to be either 0 or 1. In other words, Boolean arrays should only have 0 or 1 values in those bytes (not that it isn't possible for someone to break this in C code or through undiscovered bugs). Ufuncs that generate different values, such as arithmetic operators, result in a different type.
Perry

On Fri, 2005-03-04 at 13:50, Perry Greenfield wrote:
> On Mar 4, 2005, at 1:24 PM, konrad.hinsen@laposte.net wrote:
>> What is actually the difference between Bool and Int8?
> I'm not sure I remember all the differences (Todd can add to this if he remembers better). Booleans are treated differently as array indices than Int8 arrays are. The machinery of generating Boolean results is different in that it forces results to be either 0 or 1.
Conversions to Bool, logical operations, and (implicitly) comparisons constrain values to 0 or 1.
> In other words, Boolean arrays should only have 0 or 1 values in those bytes (not that it isn't possible for someone to break this in C code or through undiscovered bugs). Ufuncs that generate different values, such as arithmetic operators, result in a different type.
More general arithmetic appears to have unconstrained results.

On 04.03.2005, at 19:50, Perry Greenfield wrote:
> Hmmm, I'm a little confused here. If the overflow issue is what you are worried about, then use of Int8 for boolean results would still be a problem
Yes. The question about the difference was just out of curiosity.
> here. Since Numeric is already likely generating Int32 from logical ufuncs (Int actually), the use of astype(Int) is little different than many of the temporaries that Numeric creates in expressions. I find it hard to believe
It's the same, but it's one more. It would be the only one in some of my large-array code, as I have carefully used the three-argument forms of the binary operators to avoid intermediate results. I can't do that for comparisons between float arrays.
After some consideration, I think the best solution is a special "sum integer array" function in my Numeric/numarray adaptor module (the one that chooses which module to import). The numarray version can then use the type specifier in the reduction.
Konrad.
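
The adaptor-module idea could look roughly like the sketch below. Neither Numeric nor numarray is assumed to be installed, so NumPy (the successor to both packages) stands in for the imported module; the function name sum_integer_array and the dtype keyword are NumPy-side assumptions, and the corresponding numarray keyword may be spelled differently:

```python
import numpy as np  # stand-in for the module the adaptor would import

def sum_integer_array(a):
    """Sum an integer or boolean array using a 32-bit accumulator."""
    # The accumulator type is passed to the reduction itself, so the
    # caller does not have to request an astype() copy of the array.
    return int(np.add.reduce(a, dtype=np.int32))

mask = np.greater(np.arange(128), -1)  # boolean array, all True
print(sum_integer_array(mask))         # 128
```

Under Numeric, where comparisons already return integer arrays, the adaptor's version of this function could simply call add.reduce without any conversion, which would keep the common case free of extra cost.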

On Fri, 2005-03-04 at 12:18, konrad.hinsen@laposte.net wrote:
> On 04.03.2005, at 19:50, Perry Greenfield wrote:
> After some consideration, I think the best solution is a special "sum integer array" function in my Numeric/numarray adaptor module (the one that chooses which module to import). The numarray version can then use the type specifier in the reduction.
That ufunc.reduce takes an optional type specifier was news to me. Neither the manual nor the on-line help mentions it.
ralf
participants (5)
- konrad.hinsen@laposte.net
- Perry Greenfield
- Ralf Juengling
- Stephen Walton
- Todd Miller