[Numpy-discussion] numpy.vectorize performance
Nick Fotopoulos
nvf at MIT.EDU
Fri Jul 14 12:43:38 EDT 2006
On Jul 13, 2006, at 10:17 PM, Tim Hochberg wrote:
> Nick Fotopoulos wrote:
>> Dear all,
>>
>> I often make use of numpy.vectorize to make programs read more
>> like the physics equations I write on paper. numpy.vectorize is
>> basically a wrapper for numpy.frompyfunc. Reading Travis's Scipy
>> Book (mine is dated Jan 6 2005) kind of suggests to me that it
>> returns a full-fledged ufunc exactly like built-in ufuncs.
>>
>> First, is this true?
> Well according to type(), the result of frompyfunc is indeed of
> type ufunc, so I would say the answer to that is yes.
>> Second, how is the performance?
> A little timing indicates that it's not good (about 30x slower for
> computing x**2 than doing it using x*x on an array). That's not
> frompyfunc's (or vectorize's) fault though. It's calling a Python
> function at each point, so the Python function call overhead is
> going to kill you. Not to mention instantiating an actual Python
> object or objects at each point.
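For anyone who wants to reproduce both answers, here is a minimal sketch rather than the timing Tim actually ran. The array size, the repeat count, and the helper names square and time_call are arbitrary choices of mine, and the measured ratio will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np


def square(v):
    """Scalar Python function; it is called once per array element."""
    return v ** 2


# frompyfunc(func, nin, nout) wraps a Python callable as a ufunc.
# The result operates on, and returns, arrays of Python objects.
square_ufunc = np.frompyfunc(square, 1, 1)

# vectorize builds on frompyfunc and converts the output dtype.
square_vec = np.vectorize(square)

x = np.linspace(0.0, 1.0, 10_000)  # arbitrary test array


def time_call(func, repeats=20):
    """Return the best wall-clock time in seconds for one func(x) call."""
    return min(timeit.repeat(lambda: func(x), number=1, repeat=repeats))


if __name__ == "__main__":
    print(type(square_ufunc))  # the type() check from the reply above
    t_native = time_call(lambda a: a * a)
    t_vec = time_call(square_vec)
    print(f"x*x: {t_native:.2e} s, vectorize: {t_vec:.2e} s, "
          f"ratio: {t_vec / t_native:.0f}x")
```

The vectorized version still runs square() once per element, which is where the slowdown comes from.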
That's unfortunate since I tend to nest functions quite deeply and
then scipy.integrate.quad over them, which I'm sure results in a
ridiculous number of function calls. Are lambdas any different
from named functions in terms of performance?
>
>> i.e., are my functions performing approximately as fast as they
>> could be or would they still gain a great deal of speed by
>> rewriting them in C or with some other compiled Python accelerator?
>>
> Can you give examples of what these functions look like? You might
> gain a great deal of speed by rewriting them in numpy in the
> correct way. Or perhaps not, but it's probably worth showing some
> examples so we can offer suggestions or at least admit that we are
> stumped.
This is by far the slowest bit of my code. I cache the results, so
it's not too bad, but any upstream tweak can take a lot of CPU time
to propagate.
@autovectorized
def dnsratezfunc(z):
    """Take coalescence time into account."""
    def integrand(zf):
        return Pz(z,zf)*NSbirthzfunc(zf)
    return quad(integrand,delayedZ(2e5*secperyear+1,z),5)[0]

dnsratez = lambdap*dnsratezfunc(zs)
where:
# Neutron star formation rate is a delayed version of star formation rate
NSbirthzfunc = autovectorized(lambda z: SFRz(delayedZ(1e8*secperyear,z)))

def Pz(z_c,z_f):
    """Return the probability density per unit redshift of a DNS
    coalescence at z_c given a progenitor formation at z_f. """
    return P(t(z_c,z_f))*dtdz(z_c)
and there are many further nested levels of function calls. If the
act of calling a function is more expensive than actually executing
it and I value speed over readability/code reuse, I can inline Pz's
function calls and inline the unvectorized NSbirthzfunc to reduce the
calling stack a bit. Any other suggestions?
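
The cost of the extra call layers can be measured on a toy case before touching the real code. This is only a sketch: inner, middle, nested and inlined are hypothetical stand-ins with trivial arithmetic, not the actual Pz or NSbirthzfunc, and the iteration count is arbitrary. The gain on the real functions will depend on how much work each one does per call.

```python
import timeit


def inner(z):
    """Hypothetical stand-in for a cheap leaf function."""
    return 2.0 * z + 1.0


def middle(z):
    """One extra layer of function call around inner()."""
    return inner(z) * z


def nested(z):
    """Three Python-level calls per evaluation: middle, inner, inner."""
    return middle(z) + inner(z)


def inlined(z):
    """Same arithmetic as nested(), with the calls written out by hand."""
    return (2.0 * z + 1.0) * z + (2.0 * z + 1.0)


if __name__ == "__main__":
    n = 200_000  # arbitrary number of evaluations
    t_nested = timeit.timeit(lambda: nested(0.5), number=n)
    t_inlined = timeit.timeit(lambda: inlined(0.5), number=n)
    print(f"nested: {t_nested:.3f} s, inlined: {t_inlined:.3f} s")
```

If the two timings are close for the real functions, the readable nested version costs little and inlining is not worth it.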
Thanks, Tim.
Take care,
Nick