[Numpy-discussion] [ANN] Nanny, faster NaN functions
Wes McKinney
wesmckinn at gmail.com
Sat Nov 20 18:54:32 EST 2010
On Sat, Nov 20, 2010 at 6:39 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Fri, Nov 19, 2010 at 7:42 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
>> I should make a benchmark suite.
>
>>> ny.benchit(verbose=False)
> Nanny performance benchmark
> Nanny 0.0.1dev
> Numpy 1.4.1
> Speed is numpy time divided by nanny time
> NaN means all NaNs
> Speed   Test                  Shape        dtype    NaN?
>  6.6770 nansum(a, axis=-1)    (500,500)    int64
>  4.6612 nansum(a, axis=-1)    (10000,)     float64
>  9.0351 nansum(a, axis=-1)    (500,500)    int32
>  3.0746 nansum(a, axis=-1)    (500,500)    float64
> 11.5740 nansum(a, axis=-1)    (10000,)     int32
>  6.4484 nansum(a, axis=-1)    (10000,)     int64
> 51.3917 nansum(a, axis=-1)    (500,500)    float64  NaN
> 13.8692 nansum(a, axis=-1)    (10000,)     float64  NaN
>  6.5327 nanmax(a, axis=-1)    (500,500)    int64
>  8.8222 nanmax(a, axis=-1)    (10000,)     float64
>  0.2059 nanmax(a, axis=-1)    (500,500)    int32
>  6.9262 nanmax(a, axis=-1)    (500,500)    float64
>  5.0688 nanmax(a, axis=-1)    (10000,)     int32
>  6.5605 nanmax(a, axis=-1)    (10000,)     int64
> 48.4850 nanmax(a, axis=-1)    (500,500)    float64  NaN
> 14.6289 nanmax(a, axis=-1)    (10000,)     float64  NaN
>
> You can also use the makefile to run the benchmark: make bench
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
Keith (and others),
What would you think about creating a library of mostly Cython-based
"domain specific functions"? Stuff like rolling statistical
moments, nan* functions like the ones you have here, and so on --
NumPy-array-only functions that don't necessarily belong in NumPy or
SciPy (but could be included down the road). You were already talking
about this on the statsmodels mailing list with respect to larry. I
spent a lot of time writing a bunch of these for pandas over the last
couple of years, and I would have relatively few qualms about moving
them outside of pandas and introducing a dependency. You could do the
same for larry -- then we'd all be relying on the same well-vetted and
tested codebase.
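For concreteness, here's a rough pure-NumPy sketch of the kind of
functions I mean (illustrative only -- the fast versions would be typed
Cython loops, and these function names are made up, not Nanny's or
pandas' actual API):

```python
import numpy as np

def nansum_1d(a):
    # Sum only the non-NaN entries, treating NaNs as zero.
    # This mirrors what np.nansum computes, minus the Cython
    # specialization per dtype that makes Nanny fast.
    a = np.asarray(a, dtype=float)
    return a[~np.isnan(a)].sum()

def rolling_mean(a, window):
    # Rolling mean via a cumulative sum: one pass over the data,
    # a common pure-NumPy trick before reaching for Cython.
    c = np.cumsum(np.insert(np.asarray(a, dtype=float), 0, 0.0))
    return (c[window:] - c[:-window]) / window
```

A shared library would let larry, pandas, and anyone else call one
well-tested implementation of each of these instead of maintaining
near-duplicate copies.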
- Wes