[Numpy-discussion] np.mean and np.std performances
Davide Lasagna
lasagnadavide at gmail.com
Sun Apr 18 08:16:00 EDT 2010
Hi all,
I noticed some performance problems with the np.mean and np.std functions.
Here is the console output in ipython:
# make some test data
>>>: a = np.arange(80*64, dtype=np.float64).reshape(80, 64)
>>>: c = np.tile( a, [10000, 1, 1])
>>>: timeit np.mean(c, axis=0)
1 loops, best of 3: 2.09 s per loop
But using reduce is much faster:
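from functools import reduce  # builtin in Python 2, needs this import in Python 3
def mean_reduce(c):
    # add the slices along the first axis one at a time, then divide by their count
    return reduce(lambda som, array: som+array, c) / c.shape[0]
>>>: timeit mean_reduce(c)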
1 loops, best of 3: 355 ms per loop
The same applies to np.std():
# slightly smaller c matrix (otherwise too much memory is used)
>>>: c = np.tile( a, [7000, 1, 1])
>>>: timeit np.std(c, axis=0)
1 loops, best of 3: 3.73 s per loop
With the reduce version:
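def std_reduce(c):
    # note: the mean is subtracted in place, so c itself is modified
    c -= mean_reduce(c)
    # start from 0.0 so that the first slice gets squared like all the others
    return np.sqrt( reduce(lambda som, array: som + array**2, c, 0.0 ) /
                    c.shape[0] )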
>>>: timeit std_reduce(c)
1 loops, best of 3: 1.18 s per loop
For the std function, also look at the memory usage while np.std is
running.
The functions I gave here can easily be modified to accept an axis option
and whatever else is needed.
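As a rough sketch of what the axis option could look like (this is only an illustration, not a tuned implementation): the axis keyword and the use of np.moveaxis are my assumptions here, and np.moveaxis needs a reasonably recent NumPy. This variant of std_reduce also leaves its input untouched, at the cost of a temporary for each slice.

```python
from functools import reduce

import numpy as np


def mean_reduce(c, axis=0):
    """Mean along `axis`, accumulating one slice at a time."""
    # Bring the requested axis to the front so iterating yields its slices.
    slices = np.moveaxis(np.asarray(c), axis, 0)
    return reduce(lambda som, array: som + array, slices) / slices.shape[0]


def std_reduce(c, axis=0):
    """Population standard deviation (ddof=0) along `axis`; `c` is not modified."""
    slices = np.moveaxis(np.asarray(c), axis, 0)
    mean = mean_reduce(slices, axis=0)
    # Start from a zero array so the first slice is squared like the others.
    sq_sum = reduce(lambda som, array: som + (array - mean) ** 2,
                    slices, np.zeros_like(mean))
    return np.sqrt(sq_sum / slices.shape[0])
```

On small inputs the results should agree with np.mean(c, axis=axis) and np.std(c, axis=axis) up to floating-point rounding; I have not timed this version.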
Is there any drawback to using them? Why are np.mean and np.std so slow?
I'm sure I'm missing something.
Cheers
Davide