can this be made faster?
Andreas Eisele
eisele at dfki.de
Wed Oct 25 04:53:49 EDT 2006
Recently, there were several requests and discussions on this list about
how to increment the cells of an array a that are indexed by a second
integer array b (optionally by values from a third array c), such as:
> Yes, that'd be
> a[b] += c
>
> On 10/8/06, Daniel Mahler <dmahler at gm...> wrote:
> > Is there a 'loop free' way to do this in Numeric
> >
> > for i in arange(l):
> >     a[b[i]]+=c[i]
> >
> > where l == len(b) == len(c)
> >
> > thanks
> > Daniel
>
>
or
> It is clear to me that the numpy += operator in combination with the use
> of arrays of indexes, as is explained in the Tentative Numpy Tutorial
>
> (http://www.scipy.org/Tentative_NumPy_Tutorial#head-3f4d28139e045a442f78c5218c379af64c2c8c9e),
>
> the limitation being that indexes that appear more than 1 time in the
> indexes-array will get incremented only once.
>
> Does anybody know a way to work around this?
>
> I am using this to fill up a custom nd-histogram, and obviously each bin
> should be able to get incremented more than once. Looping over the
> entire array and incrementing each bin successively takes waaay too long
> (these are pretty large arrays, like 4000x2000 items, or even larger)
I just came across a function called bincount that seems to solve both
requests.
The first use case could be written as
a += bincount(b,c)
(assuming a already has the right size; otherwise a = bincount(b,c) would
create an array with the minimal required size). The second case is even
simpler:
counts = bincount(index)
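A minimal sketch of both cases, assuming a current NumPy imported as np. The
sample arrays are made up for illustration, and the minlength argument comes
from later NumPy releases rather than from the calls shown above:

```python
import numpy as np

# Hypothetical sample data; note that index 1 appears three times in b.
a = np.zeros(5)
b = np.array([1, 1, 3, 0, 1])
c = np.array([0.5, 1.5, 2.0, 4.0, 1.0])

# Reference result from the explicit loop in the first request.
expected = a.copy()
for i in range(len(b)):
    expected[b[i]] += c[i]

# First use case: weighted increment. minlength pads the result to len(a),
# so the += also works when max(b) is smaller than len(a) - 1.
# (It still fails if b contains an index >= len(a).)
a += np.bincount(b, weights=c, minlength=len(a))
print(a)       # [4. 3. 0. 2. 0.]

# Second use case: plain counting; repeated indices are counted every time.
index = np.array([2, 0, 2, 2, 1])
counts = np.bincount(index)
print(counts)  # [1 1 3]
```

Newer NumPy releases also provide np.add.at(a, b, c), which accumulates
repeated indices in place and may be worth comparing for your case.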
On my machine, this does 20M counting operations per second, which is _much_
faster than anything that could be done in an explicit for loop.
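For the nd-histogram in the second request, one possible approach (my
assumption, not something tested on arrays of that size) is to flatten the
bin coordinates into a single index before counting. The shape and the
rows/cols arrays below are hypothetical, and np.ravel_multi_index and
minlength require a reasonably recent NumPy:

```python
import numpy as np

# Hypothetical 2-D histogram shape and per-sample bin coordinates.
shape = (3, 4)
rows = np.array([0, 2, 2, 1, 2])
cols = np.array([1, 3, 3, 0, 3])

# Map each (row, col) pair to one flat bin number (row-major order).
flat = np.ravel_multi_index((rows, cols), shape)

# Count every occurrence, pad to the full number of bins, restore the shape.
hist = np.bincount(flat, minlength=shape[0] * shape[1]).reshape(shape)

# Bin (2, 3) was hit three times and is counted three times.
print(hist[2, 3])  # 3
```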
Hope this helps,
Andreas
More information about the NumPy-Discussion mailing list