[Numpy-discussion] How to set array values based on a condition?

Francesc Altet faltet at carabos.com
Sun Mar 23 09:05:28 EDT 2008


On Sunday 23 March 2008, Anne Archibald wrote:
> On 23/03/2008, Damian Eads <eads at soe.ucsc.edu> wrote:
> > Hi,
> >
> >  I am working on a memory-intensive experiment with very large
> > arrays so I must be careful when allocating memory. Numpy already
> > supports a number of in-place operations (+=, *=) making the task
> > much more manageable. However, it is not obvious to me how I can set
> > values based on a very simple condition.
> >
> >  The expression
> >
> >    y[y<0]=-1
> >
> >  generates a boolean index mask y<0 of the same size as the array
> > y, which is problematic when y is quite large.
> >
> >  I was wondering if there was anything like a set_where(A, cmp, B,
> >  setval, [optional elseval]) function where cmp would be a
> > comparison operator expressed as a string.
> >
> >  The code below illustrates what I want to do. Admittedly, it needs
> > to be cleaned up but it's a proof of concept. Does numpy provide
> > any functions that support the functionality of the code below?
>
> That's a good question, but I'm pretty sure it doesn't, apart from
> numpy.clip(). The way I'd try to solve that problem would be with the
> dreaded for loop. Don't iterate over single elements, but if you have
> a gargantuan array, working in chunks of ten thousand (or whatever)
> won't have too much overhead:
>
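> block = 100000   # chunk size; tune to the memory you can spare
> for n in range(0, len(y), block):
>     yc = y[n:n+block]   # a slice is a view, so this modifies y in place
>     yc[yc<0] = -1       # the mask is only block-sized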
>
> It's a bit of a pain, but working with arrays that nearly fill RAM
> *is* a pain, as I'm sure you are all too aware by now.
>
> You might look into numexpr; this is the sort of thing it does
> (though I've never used it and can't say whether it can do this).

Well, Numexpr is designed to minimize the number of temporaries, and it can
do what Damian wants without storing the mask in a temporary array.
However, the output will still require new space.  The usage should be
something like:

In [11]: y = numpy.random.normal(0, 10, 10)

In [12]: numexpr.evaluate('where(y<0, -1, y)')
Out[12]:
array([  7.11784295,  -1.        ,  10.92876842,  -1.        ,
         0.76092629,  -1.        ,  14.07021792,  -1.        ,
         5.67173405,  31.28631822])
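
If the new output array is also too expensive, the chunked loop Anne
suggested can be wrapped in a small helper.  The following is only a
minimal sketch using plain NumPy: the function name set_where_less and its
parameters are hypothetical (not part of NumPy or Numexpr), and it assumes
y is a 1-D array.  It is checked against the out-of-place numpy.where
result on a tiny array:

```python
import numpy as np

def set_where_less(y, threshold, value, block=100_000):
    """Set y[y < threshold] = value in place, one block at a time.

    Hypothetical helper for illustration; assumes y is a 1-D array.
    """
    for start in range(0, len(y), block):
        chunk = y[start:start + block]     # basic slice: a view, not a copy
        chunk[chunk < threshold] = value   # mask has at most `block` elements
    return y

y = np.array([7.1, -3.2, 10.9, -0.5, 0.7])
expected = np.where(y < 0, -1, y)   # out-of-place reference (allocates a new array)
set_where_less(y, 0, -1, block=2)   # tiny block size, just to exercise the loop
```

The trade-off is that the temporary mask never exceeds block elements and
no second full-size array is allocated, at the cost of a Python-level loop
over the chunks.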

HTH,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"


