[Numpy-discussion] How to set array values based on a condition?

Anne Archibald peridot.faceted at gmail.com
Sun Mar 23 02:41:57 EDT 2008


On 23/03/2008, Damian Eads <eads at soe.ucsc.edu> wrote:
> Hi,
>
>  I am working on a memory-intensive experiment with very large arrays so
>  I must be careful when allocating memory. Numpy already supports a
>  number of in-place operations (+=, *=) making the task much more
>  manageable. However, it is not obvious to me how I set values based on a
>  very simple condition.
>
>  The expression
>
>    y[y<0]=-1
>
>  generates a binary index mask y<0 of the same size as the array y,
>  which is problematic when y is quite large.
>
>  I was wondering if there was anything like a set_where(A, cmp, B,
>  setval, [optional elseval]) function where cmp would be a comparison
>  operator expressed as a string.
>
>  The code below illustrates what I want to do. Admittedly, it needs to be
>  cleaned up but it's a proof of concept. Does numpy provide any functions
>  that support the functionality of the code below?

That's a good question, but I'm pretty sure it doesn't, apart from
numpy.clip(). The way I'd try to solve that problem would be with the
dreaded for loop. Don't iterate over single elements, but if you have
a gargantuan array, working in chunks of ten thousand (or whatever)
won't have too much overhead:

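block = 10000  # chunk length; tune to taste
for n in range(0, len(y), block):
    yc = y[n:n+block]  # a slice is a view, so this modifies y itself
    yc[yc<0] = -1      # the temporary mask is only block elements long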

It's a bit of a pain, but working with arrays that nearly fill RAM
*is* a pain, as I'm sure you are all too aware by now.
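
For reuse, the chunked loop above could be wrapped in a small helper. This is only a sketch: set_below, its parameters and the sample array are hypothetical names chosen for illustration, not anything NumPy provides, and it assumes the array is sliced along its first axis.

```python
import numpy as np

def set_below(y, threshold, value, block=10000):
    """Set entries of y that are < threshold to value, in place.

    Hypothetical helper, not part of NumPy. It walks along the first
    axis in chunks, so the temporary boolean mask never covers more
    than `block` rows at a time.
    """
    for start in range(0, len(y), block):
        chunk = y[start:start + block]    # basic slicing gives a view of y
        chunk[chunk < threshold] = value  # the mask covers this chunk only
    return y

y = np.array([3.0, -2.0, 0.0, -0.5, 7.0])
set_below(y, 0, -1, block=2)
print(y)  # [ 3. -1.  0. -1.  7.]
```

The chunk size trades loop overhead against peak memory: a larger block means fewer Python-level iterations but a larger temporary mask.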

You might look into numexpr; this is the sort of thing it does (though
I've never used it and can't say whether it can do this).

Anne


