On Fri, Nov 19, 2010 at 8:42 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
On Fri, Nov 19, 2010 at 7:19 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Fri, Nov 19, 2010 at 1:50 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>
>> On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman <kwgoodman@gmail.com>
>> wrote:
>> > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen <pav@iki.fi> wrote:
>> >> Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman wrote:
>> >> [clip]
>> >>> My guess is that having separate underlying functions for each dtype,
>> >>> ndim, and axis would be a nightmare for a large project like Numpy.
>> >>> But
>> >>> manageable for a focused project like nanny.
>> >>
>> >> Might be easier to migrate the nan* functions to using Ufuncs.
>> >>
>> >> Unless I'm missing something,
>> >>
>> >>        np.nanmax -> np.fmax.reduce
>> >>        np.nanmin -> np.fmin.reduce
>> >>
>> >> For `nansum`, we'd need to add a ufunc `nanadd`, and for
>> >> `nanargmax/min`, we'd need `argfmin`/`argfmax`.
>> >
>> > How about that! I wasn't aware of fmax/fmin. Yes, I'd like a nanadd,
>> > please.
>> >
>> >>> arr = np.random.rand(1000, 1000)
>> >>> arr[arr > 0.5] = np.nan
>> >>> np.nanmax(arr)
>> >   0.49999625409581072
>> >>> np.fmax.reduce(arr, axis=None)
>> > <snip>
>> > TypeError: an integer is required
>> >>> np.fmax.reduce(np.fmax.reduce(arr, axis=0), axis=0)
>> >   0.49999625409581072
>> >
>> >>> timeit np.fmax.reduce(np.fmax.reduce(arr, axis=0), axis=0)
>> > 100 loops, best of 3: 12.7 ms per loop
>> >>> timeit np.nanmax(arr)
>> > 10 loops, best of 3: 39.6 ms per loop
>> >
>> >>> timeit np.nanmax(arr, axis=0)
>> > 10 loops, best of 3: 46.5 ms per loop
>> >>> timeit np.fmax.reduce(arr, axis=0)
>> > 100 loops, best of 3: 12.7 ms per loop
>>
>> Cython is faster than np.fmax.reduce.
>>
>> I wrote a Cython version of np.nanmax, called nanmax below. (It only
>> handles the 2d, float64, axis=None case, but since the array is large
>> I don't think that explains the time difference.)
>>
>> Note that fmax.reduce is slower than np.nanmax when there are no NaNs:
>>
>> >> arr = np.random.rand(1000, 1000)
>> >> timeit np.nanmax(arr)
>> 100 loops, best of 3: 5.82 ms per loop
>> >> timeit np.fmax.reduce(np.fmax.reduce(arr))
>> 100 loops, best of 3: 9.14 ms per loop
>> >> timeit nanmax(arr)
>> 1000 loops, best of 3: 1.17 ms per loop
>>
>> >> arr[arr > 0.5] = np.nan
>>
>> >> timeit np.nanmax(arr)
>> 10 loops, best of 3: 45.5 ms per loop
>> >> timeit np.fmax.reduce(np.fmax.reduce(arr))
>> 100 loops, best of 3: 12.7 ms per loop
>> >> timeit nanmax(arr)
>> 1000 loops, best of 3: 1.17 ms per loop
>
> There seem to be some odd hardware/compiler dependencies. I get quite a
> different pattern of times:
>
> In [1]: arr = np.random.rand(1000, 1000)
>
> In [2]: timeit np.nanmax(arr)
> 100 loops, best of 3: 10.4 ms per loop
>
> In [3]: timeit np.fmax.reduce(arr.flat)
> 100 loops, best of 3: 2.09 ms per loop
>
> In [4]: arr[arr > 0.5] = np.nan
>
> In [5]: timeit np.nanmax(arr)
> 100 loops, best of 3: 12.9 ms per loop
>
> In [6]: timeit np.fmax.reduce(arr.flat)
> 100 loops, best of 3: 7.09 ms per loop
>
>
> I've tweaked fmax with the reduce loop option, but the nanmax times don't
> look like yours at all. I'm also a bit surprised that you don't see any
> difference in times when the array contains a lot of NaNs.
> I'm running on an AMD Phenom, gcc 4.4.5.

Ubuntu 10.04 64-bit, numpy 1.4.1.

Difference in which times? nanny.nanmax with and without NaNs? The
code doesn't explicitly check for NaNs (though it does track whether
everything was NaN). It basically loops through the data and does:

allnan = 1
ai = a[i,j]
if ai > amax:
    amax = ai
    allnan = 0

Since any comparison with a NaN is False, the branch never fires for NaN
entries, so they get skipped without an explicit isnan check.

I should make a benchmark suite.
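
Something quick along these lines would probably do as a start (just a
sketch; the bench helper here is a throwaway for illustration, not part
of nanny):

import timeit

def bench(stmt, setup, repeat=3, number=10):
    # best-of-repeat per-call time in ms, like IPython's timeit magic
    best = min(timeit.repeat(stmt, setup, repeat=repeat, number=number))
    return 1e3 * best / number

setup = """
import numpy as np
np.random.seed(1)  # fixed seed so runs are comparable
arr = np.random.rand(1000, 1000)
arr[arr > 0.5] = np.nan
"""

for stmt in ('np.nanmax(arr)',
             'np.fmax.reduce(np.fmax.reduce(arr))'):
    print('%-40s %7.2f ms' % (stmt, bench(stmt, setup)))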

This doesn't look right:

@cython.boundscheck(False)
@cython.wraparound(False)
def nanmax_2d_float64_axisNone(np.ndarray[np.float64_t, ndim=2] a):
    "nanmax of 2d numpy array with dtype=np.float64 along axis=None."
    cdef Py_ssize_t i, j
    cdef int arow = a.shape[0], acol = a.shape[1], allnan = 1
    cdef np.float64_t amax = 0, aij
    for i in range(arow):
        for j in range(acol):
            aij = a[i,j]
            if aij == aij:       # only True when aij is not NaN
                amax += aij      # <-- sums instead of comparing
                allnan = 0
    if allnan == 0:
        return np.float64(amax)
    else:
        return NAN
It's doing a sum, not a comparison.
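
Presumably it was meant to be a comparison, something like this
(untested sketch; note that amax should also start at -inf rather than
0, or an all-negative array would come out wrong):

@cython.boundscheck(False)
@cython.wraparound(False)
def nanmax_2d_float64_axisNone(np.ndarray[np.float64_t, ndim=2] a):
    "nanmax of 2d numpy array with dtype=np.float64 along axis=None."
    cdef Py_ssize_t i, j
    cdef int arow = a.shape[0], acol = a.shape[1], allnan = 1
    cdef np.float64_t amax = -np.inf, aij
    for i in range(arow):
        for j in range(acol):
            aij = a[i,j]
            if aij >= amax:     # always False when aij is NaN
                amax = aij
                allnan = 0
    if allnan == 0:
        return np.float64(amax)
    else:
        return NAN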

Chuck