[Matrix-SIG] [PSA MEMBERS] Numeric Python Set function

David Ascher da@skivs.ski.org
Mon, 8 Jun 1998 10:21:32 -0700 (PDT)


On Fri, 5 Jun 1998, Paul F. Dubois wrote:

> Since it is nice to have a positive answer to a question, I want to make
> sure Zane's answer reaches the matrix-sig. Please excuse if you already saw
> it.
> 
> Is this function, or others in arrayfns, something we should move into NumPy
> proper? I think array_set does sound like an important function.

I don't remember the specifics of Zane's function.  Something like
it needs to be incorporated not just into NumPy, but into the *indexing*
(setitem) mechanism itself.  I've done some preliminary work on this, and there
are a couple of non-trivial issues -- specifically, it'd be nice to be
able to do: 

   a[a>100] = 100

as well as a more general form,

   a[b] = c

where b contains some description of the indices of a which need to get
their values from c.
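
For concreteness, the semantics of the two forms can be sketched in plain
Python on flat lists.  This is only an illustration of the intent, not
Zane's array_set and not NumPy code; the helper names set_where and set_at
are made up for this sketch.

```python
def set_where(a, mask, value):
    """Mask form, like a[a>100] = 100: overwrite a[i] wherever mask[i] is true."""
    if len(mask) != len(a):
        raise ValueError("mask must have the same length as a")
    for i, flag in enumerate(mask):
        if flag:
            a[i] = value


def set_at(a, indices, values):
    """Index form, like a[b] = c: a[indices[k]] takes its value from values[k]."""
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    for i, v in zip(indices, values):
        a[i] = v


a = [5, 250, 99, 101]
set_where(a, [x > 100 for x in a], 100)   # clip everything above 100
# a is now [5, 100, 99, 100]

set_at(a, [0, 2], [-1, -2])               # scatter two values by position
# a is now [-1, 100, -2, 100]
```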

Note that a simplistic implementation will act strangely for at least one
of these under some conditions.  The first index, a>100, corresponds (or
will, someday) to an array of 1's and 0's.  Replace the RHS of the first
example with an array, and you have an ambiguity: is the index a mask or
a list of positions?  There are ways
around this, which, I believe, localize the complexity to the array
object.  I've been playing with one way to deal with this, which is
basically to usurp the tp_call/__call__ slots, just because they were
there (and because Don Beaudry has, as we know, a twisted mind). Upon
further reflection, I think that coming up with a specialized new slot
(for arrays and arraylike instances) is the right thing to do.
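
To make the ambiguity concrete, here is a small plain-Python sketch in which
the same index b, an array of 1's and 0's, is read once as a mask and once
as a list of positions.  The helper names are hypothetical, and the mask
reading shown (successive RHS values fill the selected slots) is just one
plausible choice.

```python
def assign_as_mask(a, b, c):
    """Read b as a mask: successive values of c fill the slots where b is nonzero."""
    out = list(a)
    values = iter(c)
    for i, flag in enumerate(b):
        if flag:
            out[i] = next(values)
    return out


def assign_as_indices(a, b, c):
    """Read b as positions: out[b[k]] = c[k], with later writes winning."""
    out = list(a)
    for i, v in zip(b, c):
        out[i] = v
    return out


a = [10, 20, 30]
b = [1, 0, 1]    # the kind of array a comparison is expected to return
c = [7, 8, 9]

as_mask = assign_as_mask(a, b, c)        # [7, 20, 8]
as_indices = assign_as_indices(a, b, c)  # [8, 9, 30]
```

Same a, b and c, two different results, so the array object needs some way
to know which reading was meant.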

The nice thing about this approach is that arrays of different species can
define different ways to do the indexing (thus arrays which correspond to
logical operations on arrays would "know" that they are masks, whereas
arrays which are returned by other functions would know that they are
indices, etc). It also means that one could have a version of NumPy which
just provides the hooks for this, and various folk can propose specific
mechanisms (see the old debate on S+ vs. APL vs. other styles of indexing).
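
As a rough sketch of that idea, and nothing more: the target array hands
the assignment over to the index object, and each kind of index decides what
it means.  The class names and the _assign_into method below are invented
for illustration; they stand in for the proposed slot and are not an
existing Python or NumPy API.

```python
class Mask:
    """Result of a logical operation: knows it selects by truth value."""

    def __init__(self, flags):
        self.flags = list(flags)

    def _assign_into(self, data, value):
        for i, flag in enumerate(self.flags):
            if flag:
                data[i] = value


class Indices:
    """Result of some other function: knows it holds positions."""

    def __init__(self, positions):
        self.positions = list(positions)

    def _assign_into(self, data, value):
        for i in self.positions:
            data[i] = value


class Array:
    """Toy 1-D array that only provides the hook."""

    def __init__(self, data):
        self.data = list(data)

    def __gt__(self, threshold):
        # A comparison returns a Mask, so the result "knows" what it is.
        return Mask(x > threshold for x in self.data)

    def __setitem__(self, index, value):
        hook = getattr(index, "_assign_into", None)
        if hook is not None:
            hook(self.data, value)      # the index object defines the semantics
        else:
            self.data[index] = value    # ordinary integer index


a = Array([5, 250, 99, 101])
a[a > 100] = 100          # Mask path: a.data is now [5, 100, 99, 100]
a[Indices([0, 2])] = 0    # Indices path: a.data is now [0, 100, 0, 100]
```

The Array class itself never decides between mask and index semantics, which
is what would let different indexing proposals plug into the same hook.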

I'd hoped to post about this when I had a proof-of-concept finished. 
Sadly, that task is on my stack under the rich comparisons, which are
under a fair amount of other stuff.  It's nice to see that there is a
constituency out there, though. =)

--david

PS: Folks should also definitely remember Chris Chase's effort, which was a
    Python-only solution (hence lacked the speed I often need), but
    defined a fuller set of features (ellipses, NewAxis, etc.).  It's
    available on the findmail archives.  I don't remember the exact name
    of the file, but I can resurrect it if it's hard to find.  I vaguely
    remember some discussion where Zane was saying that he didn't
    implement all of these (which I think would be nice to have, at least
    in the final version).