Hi All,

I'm coding an iterative algorithm which needs to update a results array on
each iteration. I have several one-dimensional arrays, and currently the
following non-vectorized code gets called at the end of each iteration
(Indices, Current, and Update are all one-dimensional and the same size):

    for i in xrange(Indices.size):
        Current[Indices[i]] += Update[i]

This is running too slowly, so I am going through the usual process of
attempting to vectorize this code (Indices.size > 5000). My attempt at
vectorizing it was:

    Current[Indices] += Update

But this does not yield the same result. Here's a simple example. First
the iterative version:
    >>> Current = zeros(2, 'd')
    >>> Update = ones_like(Current)
    >>> Indices = zeros(2, 'i')
    >>> for i in xrange(Indices.size):
    ...     Current[Indices[i]] += Update[i]
    ...
    >>> Current
    array([ 2.,  0.])
And then my vectorized attempt:
    >>> Current = zeros(2, 'd')
    >>> Update = ones_like(Current)
    >>> Indices = zeros(2, 'i')
    >>> Current[Indices] += Update
    >>> Current
    array([ 1.,  0.])
As you can see, Current is different in the two cases. Any ideas how I can
recreate the behavior of the iterative process in a more numpy-friendly,
vectorized (and hopefully quicker) way? And possibly also about why my
intuitions concerning the semantics of the vectorized code are wrong?

Many thanks,

Dan.
> As you can see, Current is different in the two cases. Any ideas how I
> can recreate the behavior of the iterative process in a more
> numpy-friendly, vectorized (and hopefully quicker) way? And possibly
> also about why my intuitions concerning the semantics of the vectorized
> code are wrong?

I'm also interested to know how to do this. I know it can be done with a
C/Cython loop, but I'd like to do it using just numpy if possible.

Dan G.
On Wed, Apr 29, 2009 at 08:03, Daniel Yarlett wrote:
> As you can see, Current is different in the two cases. Any ideas how I
> can recreate the behavior of the iterative process in a more
> numpy-friendly, vectorized (and hopefully quicker) way?

Use bincount().

> And possibly also about why my intuitions concerning the semantics of
> the vectorized code are wrong?

In Python, the statement

    x[indices] += y

turns into

    xi = x.__getitem__(indices)
    tmp = xi.__iadd__(y)
    x.__setitem__(indices, tmp)

The first statement necessarily copies the data. Then the __iadd__()
method modifies the copy in-place (tmp is xi after the operation for
numpy arrays, but not necessarily for other objects). Then the final
statement assigns the result back into the original array using fancy
indexing. Since there are duplicate indices, the last entry in tmp for
each duplicate index wins. Because Python breaks up the syntax
"x[indices] += y" into those three discrete steps, information is lost on
the way, and we cannot use that syntax with the semantics of the loop.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth." -- Umberto Eco
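For reference, a sketch tying the two answers together: the failing
in-place update from the original post, then the bincount() version.
(numpy is assumed to be imported as np; the minlength argument appeared in
later numpy releases and simply pads the counts out to Current's length.)

    import numpy as np

    Current = np.zeros(2, 'd')
    Update = np.ones_like(Current)
    Indices = np.zeros(2, 'i')

    # Fancy indexing returns a copy, so the three-step expansion above
    # loses the duplicate contribution: the last write per index wins.
    Current[Indices] += Update
    print(Current)   # [ 1.  0.]

    # bincount() instead sums the weights for each index value, so
    # duplicate indices accumulate just like the original loop.
    Current[:] = 0.0
    Current += np.bincount(Indices, weights=Update, minlength=Current.size)
    print(Current)   # [ 2.  0.]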
Robert Kern wrote:
> On Wed, Apr 29, 2009 at 08:03, Daniel Yarlett wrote:
> > As you can see, Current is different in the two cases. Any ideas how I
> > can recreate the behavior of the iterative process in a more
> > numpy-friendly, vectorized (and hopefully quicker) way?
>
> Use bincount().

Neat. Is there a memory-efficient way of doing it if the indices are very
large but there aren't many of them? E.g. if the indices were
I = [10000, 20000], then bincount would create a gigantic array of 20000
elements for just two addition operations!

Dan G.
On Wed, Apr 29, 2009 at 16:19, Dan Goodman wrote:
> Robert Kern wrote:
> > On Wed, Apr 29, 2009 at 08:03, Daniel Yarlett wrote:
> > > As you can see, Current is different in the two cases. Any ideas how
> > > I can recreate the behavior of the iterative process in a more
> > > numpy-friendly, vectorized (and hopefully quicker) way?
> >
> > Use bincount().
>
> Neat. Is there a memory-efficient way of doing it if the indices are
> very large but there aren't many of them? E.g. if the indices were
> I = [10000, 20000], then bincount would create a gigantic array of 20000
> elements for just two addition operations!

indices -= indices.min()

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth." -- Umberto Eco
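For reference, a sketch of the offset trick Robert suggests, with
illustrative names that are not from the thread: shifting the indices down
by their minimum before calling bincount() means the counts array only has
to span the occupied range of indices.

    import numpy as np

    y = np.zeros(30000)
    I = np.array([10000, 20000])
    values = np.array([1.0, 2.0])

    lo = I.min()
    # bincount over the shifted indices: 10001 elements here, not 20001.
    counts = np.bincount(I - lo, weights=values)
    y[lo:lo + counts.size] += counts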
Robert Kern wrote:
> On Wed, Apr 29, 2009 at 16:19, Dan Goodman wrote:
> > Robert Kern wrote:
> > > On Wed, Apr 29, 2009 at 08:03, Daniel Yarlett wrote:
> > > > As you can see, Current is different in the two cases. Any ideas
> > > > how I can recreate the behavior of the iterative process in a more
> > > > numpy-friendly, vectorized (and hopefully quicker) way?
> > >
> > > Use bincount().
> >
> > Neat. Is there a memory-efficient way of doing it if the indices are
> > very large but there aren't many of them? E.g. if the indices were
> > I = [10000, 20000], then bincount would create a gigantic array of
> > 20000 elements for just two addition operations!
>
> indices -= indices.min()

Ah OK, but bincount is still going to make an array of 10000 elements in
that case... I came up with this trick, but I wonder whether it's
overkill. Supposing you want to do y += x[I], where x is big and the
indices in I are large but sparse (although potentially including
repeats):

    J = unique(I)
    K = digitize(I, J) - 1
    b = bincount(K)
    y += b * x[J]

Well, it seems to work, but surely there must be a nicer way?

Dan
2009/4/29 Dan Goodman:
> Robert Kern wrote:
> > On Wed, Apr 29, 2009 at 16:19, Dan Goodman wrote:
> > > Robert Kern wrote:
> > > > On Wed, Apr 29, 2009 at 08:03, Daniel Yarlett wrote:
> > > > > As you can see, Current is different in the two cases. Any ideas
> > > > > how I can recreate the behavior of the iterative process in a
> > > > > more numpy-friendly, vectorized (and hopefully quicker) way?
> > > >
> > > > Use bincount().
> > >
> > > Neat. Is there a memory-efficient way of doing it if the indices are
> > > very large but there aren't many of them? E.g. if the indices were
> > > I = [10000, 20000], then bincount would create a gigantic array of
> > > 20000 elements for just two addition operations!
> >
> > indices -= indices.min()
>
> Ah OK, but bincount is still going to make an array of 10000 elements in
> that case...
>
> I came up with this trick, but I wonder whether it's overkill:
>
> Supposing you want to do y += x[I], where x is big and the indices in I
> are large but sparse (although potentially including repeats):
>
>     J = unique(I)
>     K = digitize(I, J) - 1
>     b = bincount(K)
>     y += b * x[J]
>
> Well, it seems to work, but surely there must be a nicer way?
Well, a few lines serve to rearrange the indices array so it contains each
number only once, and then to add together (with bincount) all values with
the same index:

    ix = np.unique1d(indices)
    reverse_index = np.searchsorted(ix, indices)
    r = np.bincount(reverse_index, weights=values)

You can then do:

    y[ix] += r

All this could be done away with if bincount supported input and output
arguments. Then

    y[indices] += values

could become

    np.bincount(indices, weights=values, input=y, output=y)

implemented by the same nice tight loop one would use in C. (I'm not sure
bincount is the right name for such a general function, though.) This
question comes up fairly often on the mailing list, so it would be nice to
have a single function to point to.

Anne
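For reference, later numpy releases added essentially the primitive Anne
asks for: np.ufunc.at(). np.add.at() performs unbuffered in-place
addition, applying the operation once per index so that duplicates
accumulate exactly like the original loop. A minimal sketch on the example
from the first post:

    import numpy as np

    Current = np.zeros(2, 'd')
    Update = np.ones_like(Current)
    Indices = np.zeros(2, 'i')

    # Unbuffered in-place addition: duplicate indices each contribute,
    # matching the semantics of the original for-loop.
    np.add.at(Current, Indices, Update)
    print(Current)   # [ 2.  0.]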
participants (4):

- Anne Archibald
- Dan Goodman
- Daniel Yarlett
- Robert Kern