[Numpy-discussion] NEP for faster ufuncs
Francesc Alted
faltet at pytables.org
Wed Dec 22 13:41:10 EST 2010
On Wednesday 22 December 2010 18:21:28 Mark Wiebe wrote:
> On Wed, Dec 22, 2010 at 9:07 AM, Francesc Alted <faltet at pytables.org> wrote:
> > On Wednesday 22 December 2010 17:25:13 Mark Wiebe wrote:
> > > Can you print out your np.__version__, and try running the tests?
> > > If newiter didn't build for some reason, its tests should be
> > > throwing a bunch of exceptions.
$ PYTHONPATH=numpy python -c "import numpy; numpy.test()"
Running unit tests for numpy
NumPy version 2.0.0.dev-147f817
NumPy is installed in /tmp/numpy/numpy
Python version 2.6.1 (r261:67515, Feb 3 2009, 17:34:37) [GCC 4.3.2
[gcc-4_3-branch revision 141291]]
nose version 0.11.0
[clip]
Warning: divide by zero encountered in log
Warning: divide by zero encountered in log
[clip]
Ran 3094 tests in 16.771s
OK (KNOWNFAIL=4, SKIP=1)
IPython seems to work well too:
>>> np.__version__
'2.0.0.dev-147f817'
>>> timeit 3*a+b-(a/c)
10 loops, best of 3: 67.5 ms per loop
However, when trying your luf function:
>>> cpaste
[the luf code here]
--
>>> timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c)
[clip]
AttributeError: 'module' object has no attribute 'newiter'
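A quick way to narrow this down is to check which NumPy module the session actually imported and whether it exposes the iterator at all. The following is only a sketch: the helper name `find_iterator` is hypothetical, and `nditer` is included because that is the name the iterator carries in released NumPy versions.

```python
import numpy as np

def find_iterator(module=np):
    """Return the name of the first iterator attribute found, or None."""
    # 'newiter' is the name used in this thread; 'nditer' is the name
    # used in released NumPy versions.
    for name in ("newiter", "nditer"):
        if hasattr(module, name):
            return name
    return None

# The file path shows which installation was picked up, which matters
# when PYTHONPATH and the interactive shell may disagree.
print(np.__version__, np.__file__, find_iterator())
```

If this prints `None`, the imported build does not provide the iterator under either name.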
> The reason I think it might help is that with 'luf' it's
> calculating the expression on smaller sized arrays, which possibly
> just got buffered. If the memory allocator for the temporaries keeps
> giving back the same addresses, all this will be in one of the
> caches very close to the CPU. Unless this cache is still too slow to
> feed the SSE instructions, there should be a speed benefit. The
> ufunc inner loops could also use the SSE prefetch instructions based
> on the stride to give some strong hints about where the next memory
> bytes to use will be.
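The idea in the quoted paragraph, evaluating the expression on small pieces so the temporaries stay in cache, can be illustrated in plain NumPy. This is not the luf code and does not use the new iterator; `blocked_eval` and the block size of 4096 are assumptions made purely for illustration, and it only handles 1-D arrays of equal length.

```python
import numpy as np

def blocked_eval(func, *arrays, block=4096):
    """Apply func to same-length 1-D arrays one block at a time."""
    n = arrays[0].shape[0]
    if n == 0:
        return func(*arrays)
    out = None
    for start in range(0, n, block):
        stop = min(start + block, n)
        # Slices are views, so only the temporaries created inside
        # func are new allocations, and they are at most `block` long.
        piece = func(*(x[start:stop] for x in arrays))
        if out is None:
            # Take the output dtype from the first evaluated block.
            out = np.empty(n, dtype=piece.dtype)
        out[start:stop] = piece
    return out

a = np.random.rand(100000)
b = np.random.rand(100000)
c = np.random.rand(100000) + 1.0  # keep the divisor away from zero
result = blocked_eval(lambda a, b, c: 3*a + b - (a/c), a, b, c)
```

Whether this is faster than the single-pass expression depends on the block size, the cache sizes and the allocator, so it should be timed rather than assumed.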
Ah, okay. However, Numexpr is not meant to accelerate calculations with
small operands. I suppose that this is where your new iterator makes
more sense: accelerating operations where some of the operands are small
(i.e. fit in cache) and have to be broadcast to match the
dimensionality of the others.
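To make the broadcasting case concrete, here is a small sketch with a 2-D operand and a 1-D operand that has to be stretched along the first axis. The array names and shapes are made up for illustration; a real case would have `big` far larger than the cache and `small` comfortably inside it.

```python
import numpy as np

big = np.arange(12.0).reshape(3, 4)      # stands in for the large operand
small = np.array([1.0, 2.0, 4.0, 8.0])   # small operand, shape (4,)

# small is broadcast along the first axis to match big's shape (3, 4).
result = 3*big + small - (big/small)

# broadcast_to exposes the stretched operand as a view without copying:
# the stride along the added axis is 0, so every row reuses the same
# four values in memory.
view = np.broadcast_to(small, big.shape)
print(view.shape, view.strides)
```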
--
Francesc Alted