Sun, 18 Jul 2010 15:57:47 -0600, Charles R Harris wrote:
On Sun, Jul 18, 2010 at 3:36 PM, Pauli Virtanen <pav@iki.fi> wrote: [clip]
I suggest the following, aping the way the real nan works:
- (z, nan), (nan, z), (nan, nan), where z is any fp value, are all equivalent representations of "cnan", as far as comparisons, sort order, etc are concerned.
- The ordering between (z, nan), (nan, z), (nan, nan) is undefined. This means e.g. that maximum([cnan_1, cnan_2]) can return either cnan_1 or cnan_2 if both are some cnans.
The sort and cmp order was defined in 1.4.0, see the release notes. (z,z), (z, nan), (nan, z), (nan, nan) are in correct order and there are tests to enforce this. Sort and searchsorted need to work together.
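
For illustration only, the order described above can be mimicked in plain Python with a sort key. This is a rough sketch, not NumPy's implementation: `sort_key` and `search_sorted` are hypothetical names, and the sketch assumes that values without nans sort lexically and that each nan group sorts after the previous one.

```python
import bisect
import math


def sort_key(z):
    """Hypothetical key giving the order (z,z) < (z,nan) < (nan,z) < (nan,nan)."""
    re_nan = math.isnan(z.real)
    im_nan = math.isnan(z.imag)
    # Replace nan parts by 0.0 so that tuple comparison never touches a nan.
    re = 0.0 if re_nan else z.real
    im = 0.0 if im_nan else z.imag
    # The nan flags come first, so each nan group sorts after the previous one;
    # within a group the remaining non-nan parts are compared lexically.
    return (re_nan, im_nan, re, im)


def search_sorted(sorted_values, x):
    """Hypothetical searchsorted counterpart that uses the same key as the sort."""
    keys = [sort_key(z) for z in sorted_values]
    return bisect.bisect_left(keys, sort_key(x))


nan = float("nan")
values = [complex(nan, nan), complex(1, nan), complex(nan, 2),
          complex(0, 1), complex(1, 0)]
ordered = sorted(values, key=sort_key)
position = search_sorted(ordered, complex(1, nan))
```

Because the sort and the search share one key, they cannot disagree about where a nan-containing value belongs, which is the "work together" requirement.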
Ok, now we're diving into an obscure corner that hopefully many people don't care about :)

There are several issues here:

1) We should not use lexical order in comparison operations, since this contradicts real-valued nan arithmetic. Currently (and in 1.4) we do some weird sort of mixture, which seems inconsistent.

2) maximum/minimum should propagate nans, fmax/fmin should not.

3) sort/searchsorted, and amax/argmax need to play together.

4) As long as 1)-3) are valid, I don't think anybody cares what exactly we mean by a "complex nan", as long as np.isnan("complex nan") == True. The fact that there happen to be several different representations of a complex nan should not be important.

***

1)

Unless we want to define

(complex(nan, 0) > complex(0, 0)) == True

we cannot strictly follow the lexical order in comparisons. And if we define it like this, we contradict real-valued nan arithmetic, which IMHO is quite bad.

Here, it would make sense to me to lump all the different complex nans into a single "cnan", as far as the arithmetic comparison operations are concerned. Then,

z OP cnan == False

for all comparison operations.

In 1.4.1 we have

>>> import numpy as np
>>> np.__version__
'1.4.1'
>>> x = np.complex64(complex(np.nan, 1))
>>> y = np.complex64(complex(0, 1))
>>> x >= y
False
>>> x < y
False
>>> x = np.complex64(complex(1, np.nan))
>>> y = np.complex64(complex(0, 1))
>>> x >= y
True
>>> x < y
False

which seems an obscure mix of real-valued nan arithmetic and lexical ordering. I don't think it's the correct choice.

Of course, the practical importance of this decision approaches zero, but it would be nice to be consistent.

***

2)

For maximum/amax, strict lexical order contradicts nan propagation:

maximum(1+nan*j, 2+0j) == 2+0j ???

I don't see why we should follow the lexical order when both arguments are nans. The implementation will be faster if we don't.

Also, this way argmax (which should be nan-propagating) can stop looking once it finds the first nan, and it does not need to care whether a "greater" nan appears later in the array.

***

3)

For sort/searchsorted we have a technical reason to do something more, and there the strict lexical order seems the correct decision.

For `argmax` it was possible to be compatible with `amax` when lumping cnans in maximum: just return the first cnan.

***

4)

As far as np.isnan is concerned,

>>> np.isnan(complex(0, nan))
True
>>> np.isnan(complex(nan, 0))
True
>>> np.isnan(complex(nan, nan))
True

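To make the lumping concrete, here is a minimal plain-Python sketch of what points 1) and 2) could look like. None of this is NumPy API: `is_cnan`, `cnan_less`, `cnan_maximum` and `cnan_argmax` are hypothetical names, and the sketch assumes lexical order for values that contain no nan.

```python
import math


def is_cnan(z):
    """True if either part is nan; all such values count as one 'cnan'."""
    return math.isnan(z.real) or math.isnan(z.imag)


def cnan_less(a, b):
    """a < b with lumped cnans: False if either operand is a cnan."""
    if is_cnan(a) or is_cnan(b):
        return False
    # Assumed lexical order for values without nans.
    return (a.real, a.imag) < (b.real, b.imag)


def cnan_maximum(a, b):
    """Nan-propagating maximum: return whichever cnan is seen first."""
    if is_cnan(a):
        return a
    if is_cnan(b):
        return b
    return b if cnan_less(a, b) else a


def cnan_argmax(values):
    """Index of the maximum; stops at the first cnan without looking further."""
    if not values:
        raise ValueError("cnan_argmax of an empty sequence")
    best = 0
    for i, z in enumerate(values):
        if is_cnan(z):
            return i  # no need to search for a "greater" nan
        if cnan_less(values[best], z):
            best = i
    return best
```

With these definitions `cnan_maximum(complex(1, nan), complex(2, 0))` returns the cnan rather than `2+0j`, and `cnan_argmax` can exit early, which is the performance argument from 2).
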
So I think nobody should care which complex nan a function such as maximum or amax returns.

We can of course give up some performance to look for the "greatest" nan in these cases, but I do not think that it would be very worthwhile.

--
Pauli Virtanen