numarray speed question
grv575 at hotmail.com
Mon Aug 9 07:17:13 CEST 2004
cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote in
<qnkn015ujoh.fsf at arbutus.physics.mcmaster.ca>:
>At some point, grv575 at hotmail.com (grv575) wrote:
>> Heh. Try timing the example I gave (a += 5) using byteswapped vs.
>> byteswap(). It's fairly fast to do the byteswap. If you go the
>> interpretation way (byteswapped) then all subsequent array operations
>> are at least an order of magnitude slower (5 million element test).
>You mean something like
>a = arange(0, 5000000, type=Float64).byteswapped()
>a += 5
>a = arange(0, 5000000, type=Float64)
>a += 5
>? I get the same time for the a+=5 in each case -- and it's only twice
>as slow as operating on a non-byteswapped version. Note that numarray
>calls the ufunc add routine with non-byteswapped numbers; it takes a
>block, orders it correctly, then adds 5 to that, does the byteswap on
>the result, and stores that back. (You're not making a full copy of
>the array; just a large enough section at a time to do useful work.)
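The block-wise behavior described above can be sketched with NumPy, numarray's modern successor (assumed here as a stand-in, since numarray itself is long unmaintained; `byteswap()` and `dtype.newbyteorder()` are the corresponding NumPy calls):

```python
import numpy as np

# NumPy stand-in for the numarray example above.
a = np.arange(0, 5_000_000, dtype=np.float64)

# Same values, stored in the opposite byte order:
b = a.copy().byteswap().view(a.dtype.newbyteorder())

a += 5
b += 5  # the ufunc reorders a block, adds 5, swaps back on store

# Both arrays hold identical values despite the different storage order.
assert np.array_equal(a, b)
```

The in-place add on the byteswapped array works transparently; the cost is the extra per-block swap, not a full copy of the array.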
It must be using some sort of cache for the multiplication. It seems that on
the first run it takes 6 seconds and subsequently .05 seconds for either
version.
>Maybe what you need is a package designed for *small* arrays ( < 1000).
>Simple C wrappers; just C doubles and ints, no byteswap, non-aligned.
>Maybe a fixed number of dimensions. Probably easy to throw something
>together using Pyrex. Or, wrap blitz++ with boost::python.
I'll check out Numeric first. I'd rather have a drop-in solution (one that
will hopefully get more optimized in future releases) than hack my own
wrappers. Is it some purist mentality that's keeping numarray from dropping
to C code for the time-critical routines? Or can much of the speed issue be
attributed to the overhead of using objects throughout the library
(numarray does seem more general)?