[Numpy-discussion] setmember1d memory leak?

Thu Jan 25 14:20:47 EST 2007

>
>
> > For instance
> >
> > In [7]: def countmembers(a1, a2) :
> >   ...:     a = sort(a2)
> >   ...:     il = a.searchsorted(a1, side='l')
> >   ...:     ir = a.searchsorted(a1, side='r')
> >   ...:     return ir - il
> >   ...:
> >
> > In [8]: a2 = random.randint(0,10,(100,))
> >
> > In [9]: a1 = arange(11)
> >
> > In [11]: a2 = random.randint(0,5,(100,))
> >
> > In [12]: a1 = arange(10)
> >
> > In [13]: countmembers(a1,a2)
> > Out[13]: array([16, 28, 16, 25, 15,  0,  0,  0,  0,  0])
> >
> >
> > The subtraction can be replaced by != to get a boolean mask.
>
> It looks good! Isn't it faster than setmember1d for unique input arrays?
> I do not like setmember1d much (it is long unlike other functions in
> arraysetops and looks clumsy to me now and I do not understand it
> anymore...), so feel free to replace it.
>
> BTW. setmember1d gives me the same mask as countmembers for several
> non-unique inputs I tried...
>
> r.

Try this, then:
>>> countmembers(N.array([1,1]), N.array([2]))
array([0, 0])
>>> N.setmember1d(N.array([1,1]), N.array([2]))
array([ True, False], dtype=bool)

setmember1d really needs the first array to be unique. I thought about it
quite a bit and tried to understand the code (which is no small feat, and I
don't claim to have succeeded).
As far as I can tell, setmember1d gets it right only for the duplicate element
with the highest index; all other duplicates are reported as being in the
second array, regardless of whether that is actually true.
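
For comparison, here is a minimal, self-contained sketch of the countmembers
approach turned into a boolean mask, which does not care whether the first
array has duplicates. It assumes a current NumPy imported as np (the thread
uses N and bare names); member_mask is a name made up for this illustration,
not an existing NumPy function.

```python
import numpy as np

def countmembers(a1, a2):
    """For each element of a1, count how many times it occurs in a2."""
    a = np.sort(a2)
    il = a.searchsorted(a1, side='left')   # first position where a1[i] could go
    ir = a.searchsorted(a1, side='right')  # last position where a1[i] could go
    return ir - il                         # width of the run of equal values

def member_mask(a1, a2):
    """Boolean mask: True where the element of a1 occurs in a2 (hypothetical helper)."""
    return countmembers(a1, a2) != 0

# Duplicates in the first array are handled element by element.
print(countmembers(np.array([1, 1]), np.array([2])).tolist())  # [0, 0]
print(member_mask(np.array([1, 1]), np.array([2])).tolist())   # [False, False]
print(member_mask(np.array([1, 2, 2, 5]), np.array([2, 2, 3])).tolist())
# [False, True, True, False]
```

Later NumPy releases provide np.isin for this kind of membership test, and it
should agree with the mask above on 1-D inputs.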

I found it easier to state my problem in terms of unique arrays rather than
trying to figure out a general solution, but countmembers sure is nice.
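
One way to state the problem in terms of unique arrays and still get an
answer for non-unique input is to reduce both arrays with unique, solve the
membership problem there, and map the result back. The following is only a
sketch under that assumption; member_mask_via_unique is a hypothetical name,
and it relies on np.unique(..., return_inverse=True) and np.intersect1d from
a current NumPy.

```python
import numpy as np

def member_mask_via_unique(a1, a2):
    """Membership mask for 1-D a1 in a2, computed on the unique values only."""
    a1 = np.asarray(a1).ravel()
    a2 = np.asarray(a2).ravel()
    # u1 is sorted and unique; u1[inverse] reconstructs a1.
    u1, inverse = np.unique(a1, return_inverse=True)
    u2 = np.unique(a2)
    # Both inputs are unique here, which is the case the set routines expect.
    common = np.intersect1d(u1, u2)
    # Every value in common is present in the sorted u1, so searchsorted
    # returns its exact position.
    mask_u = np.zeros(u1.shape, dtype=bool)
    mask_u[np.searchsorted(u1, common)] = True
    # Broadcast the per-unique-value answer back to every duplicate.
    return mask_u[inverse]

print(member_mask_via_unique([1, 1], [2]).tolist())              # [False, False]
print(member_mask_via_unique([1, 1, 2, 3], [2, 3, 3]).tolist())  # [False, False, True, True]
```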

    Jan

---------- Forwarded message ----------
> From: Robert Cimrman <cimrman3 at ntc.zcu.cz>
> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
> Date: Thu, 25 Jan 2007 12:35:10 +0100
> Subject: Re: [Numpy-discussion] setmember1d memory leak?
> Robert Cimrman wrote:
> > Charles R Harris wrote:
> >>
> >> In [7]: def countmembers(a1, a2) :
> >>   ...:     a = sort(a2)
> >>   ...:     il = a.searchsorted(a1, side='l')
> >>   ...:     ir = a.searchsorted(a1, side='r')
> >>   ...:     return ir - il
> >>   ...:
> >> The subtraction can be replaced by != to get a boolean mask.
> >
> > It looks good! Isn't it faster than setmember1d for unique input arrays?
> > I do not like setmember1d much (it is long unlike other functions in
> > arraysetops and looks clumsy to me now and I do not understand it
> > anymore...), so feel free to replace it.
> >
> > BTW. setmember1d gives me the same mask as countmembers for several
> > non-unique inputs I tried...
>
> But still a function like 'findsorted' returning a bool mask would be
> handy - one searchsorted-like call could be saved in setmember1d.
>
> cheers,
> r.
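
A rough sketch of what such a 'findsorted' might look like, using a single
searchsorted call followed by an equality check. No function of this name
exists in NumPy as far as I know; the name, signature, and behaviour below
are assumptions made only to illustrate the idea, and the first argument
must already be sorted.

```python
import numpy as np

def findsorted(sorted_a, values):
    """Return a bool mask: True where values[i] occurs in the sorted array sorted_a."""
    sorted_a = np.asarray(sorted_a)
    values = np.asarray(values)
    if sorted_a.size == 0:
        # Nothing can be a member of an empty array.
        return np.zeros(values.shape, dtype=bool)
    # One binary search per value: index of the first element >= value.
    idx = sorted_a.searchsorted(values, side='left')
    # Values larger than everything in sorted_a land one past the end.
    in_range = idx < sorted_a.size
    # Clip so the lookup is always valid; out-of-range hits are masked out.
    candidate = sorted_a[np.minimum(idx, sorted_a.size - 1)]
    return in_range & (candidate == values)

a = np.sort(np.array([2, 2, 3]))
print(findsorted(a, np.array([1, 2, 2, 5])).tolist())  # [False, True, True, False]
```

Compared with countmembers this trades the second searchsorted call for one
indexing operation and one comparison.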
>
>
>
>
> ---------- Forwarded message ----------
> From: rex <rex at nosyntax.com>
> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
> Date: Thu, 25 Jan 2007 03:50:24 -0800
> Subject: [Numpy-discussion] Compiling Python with icc
> George Nurser <gnurser at googlemail.com> [2007-01-25 02:05]:
>
> > Perhaps compiling python itself with icc might give a useful speedup.
> > Apparently somebody managed this for python 2.3 in 2003:
> > http://mail.python.org/pipermail/c++-sig/2003-October/005824.html
>
> Hello George,
>
> I saw that post yesterday, and just got around to trying it. It works.
>
> ./configure CC=icc --prefix=/usr/local
>
> In addition to commenting out
>
> #BASECFLAGS=     -OPT:Olimit=0
>
> I added
>
> -xT -parallel
>
> to the
>
> OPT=
>
> line for my Core 2 Duo CPU. The usual make and make install worked, and
> pybench now runs in 3.15 seconds vs 4.7 seconds with Python 2.5 compiled
> with gcc. That's a 49% speed increase.
>
> http://svn.python.org/projects/external/pybench-2.0/
>
> And, if psyco is used, pybench runs in 1.6 seconds for one iteration and
> then crashes. Psyco + icc results in a roughly 3x speedup. Pybench
> needs to be updated for 1+ gigaflop systems.
>
> http://psyco.sourceforge.net/
>
> -rex
> --
> "I have always wished that my computer would be as easy to use as my
> telephone. My wish has come true. I no longer know how to use my
> telephone"
>    --Bjarne Stroustrup (originator of C++ programming language)