fancy indexing - Python-ideas

20 Jul 2010

      [changing the subject; was: 'where' statement in Python?]

I think this is an interesting idea (whether it is worth adding is a
different question). However, I think it would be confusing that
   a[x] = (y,z)
does something entirely different depending on whether x is 1 or (1,2). If
Python *were* to add something like this, I think perhaps a different syntax
should be considered:

a[[x]] = y
y = a[[x]]

which would call __setitems__ and __getitems__ respectively. This makes it
clear that something different is going on and eliminates the ambiguity for
dicts.
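
As a rough illustration of the idea, the behaviour can be approximated in
today's Python, where a[[1, 2]] simply passes the list [1, 2] to __getitem__.
The sketch below is only an assumption about how such a hook might look: the
class name MultiKeyDict is made up, and __getitems__ and __setitems__ are not
real Python special methods. Because lists are unhashable, a list can never be
an ordinary dict key, so dispatching on it is unambiguous.

```python
class MultiKeyDict(dict):
    """Hypothetical sketch: list keys are routed to __getitems__/__setitems__."""

    def __getitem__(self, key):
        # A list can never be a real dict key (it is unhashable), so treat
        # it as "several keys at once".
        if isinstance(key, list):
            return self.__getitems__(key)
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        if isinstance(key, list):
            self.__setitems__(key, value)
        else:
            super().__setitem__(key, value)

    # The two methods below are NOT real Python special methods; they only
    # mimic the names used in the proposal.
    def __getitems__(self, keys):
        return [dict.__getitem__(self, k) for k in keys]

    def __setitems__(self, keys, values):
        values = list(values)
        if len(keys) != len(values):
            raise ValueError("need exactly one value per key")
        for k, v in zip(keys, values):
            dict.__setitem__(self, k, v)


d = MultiKeyDict()
d[(1, 2)] = "one tuple key"   # ordinary lookup: the tuple is a single key
d[[1, 2]] = ("a", "b")        # multi-key assignment: d[1] = "a", d[2] = "b"
print(d[(1, 2)])
print(d[[1, 2]])
```

With this emulation the tuple (1, 2) stays a single key, while the list
[1, 2] means "keys 1 and 2".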

--- Bruce
http://www.vroospeak.com
http://google-gruyere.appspot.com

On Tue, Jul 20, 2010 at 10:18 AM, Sturla Molden wrote:
...
So I'd rather speak of something useful instead: NumPy's "Fancy indexing".
"Fancy indexing" (NumPy jargon) will in this context mean that we allow
indexes to be an iterable, not just integers:
  mylist[(1,2,3)] == mylist[1,2,3]
  mylist[iterable] == [a(i) for i in iterable]
That is what NumPy and Matlab do, as well as Fortran 90 (and certain C++
libraries such as Blitz++). It has all the power of the "where keyword",
while being more flexible to use, and the intention is more explicit. It is
also well-tested syntax.
Thus with "fancy indexing":
  alist[iterable] == [alist[i] for i in iterable]
That is what we really need!
Note that this is not a language syntax change, it is just a change of how
__setitem__ and __getitem__ work for certain container types. NumPy already
does this, so the syntax itself is completely valid Python. And as for
"where", it is just a function.
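
To make the point concrete, here is a minimal sketch of a list subclass whose
__getitem__ and __setitem__ accept an iterable of indices. The name FancyList
is hypothetical, and the behaviour is only an approximation of what NumPy does
(no boolean masks, no multi-dimensional indexing).

```python
from collections.abc import Iterable


class FancyList(list):
    """Hypothetical list subclass that accepts an iterable of indices."""

    def __getitem__(self, index):
        # Plain integers and slices keep their usual meaning.
        if isinstance(index, (int, slice)):
            return super().__getitem__(index)
        # Any other iterable (tuple, list, range, ...) selects several items.
        if isinstance(index, Iterable):
            return FancyList(list.__getitem__(self, i) for i in index)
        # Everything else is left to list, which raises TypeError if needed.
        return super().__getitem__(index)

    def __setitem__(self, index, value):
        if isinstance(index, (int, slice)):
            super().__setitem__(index, value)
            return
        indices = list(index)
        values = list(value)
        if len(indices) != len(values):
            raise ValueError("need exactly one value per index")
        for i, v in zip(indices, values):
            list.__setitem__(self, i, v)


mylist = FancyList("abcdef")
picked = mylist[1, 2, 3]            # the index arrives as the tuple (1, 2, 3)
evens = mylist[range(0, 6, 2)]      # any iterable of integers works
mylist[0, 5] = "AF"                 # assigns mylist[0] = "A", mylist[5] = "F"
print(picked, evens, mylist)
```

No new syntax is involved: mylist[1, 2, 3] already hands the tuple (1, 2, 3)
to __getitem__, so only the container has to change.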
Andrey's proposed where keyword is a crippled tool in comparison. That is,
the real power of a list of indexers is that it can be obtained and
manipulated with any conceivable method, e.g. slicing. It also allows numpy
to have an "argsort" function, since an index list can be reused on multiple
arrays:
  idx = np.argsort(array_a)
  sorteda = array_a[idx]
  sortedb = array_b[idx]
is the same as
  tmp = sorted([(a, i) for i, a in enumerate(lista)])
  sorteda = [a for a, i in tmp]
  sortedb = [listb[i] for a, i in tmp]
Which is the more readable?
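
For plain lists, the index-reuse pattern can be sketched without NumPy. The
helpers argsort and take below are hypothetical names for this illustration
and are not standard library functions; take stands in for the fancy-indexing
step.

```python
def argsort(values):
    """Return the indices that would sort `values` (stable, ascending)."""
    return sorted(range(len(values)), key=values.__getitem__)


def take(values, indices):
    """Stand-in for fancy indexing: pick values[i] for each i in indices."""
    return [values[i] for i in indices]


list_a = [30, 10, 20]
list_b = ["c", "a", "b"]

idx = argsort(list_a)          # computed once ...
sorted_a = take(list_a, idx)   # ... and reused on both lists
sorted_b = take(list_b, idx)
print(idx, sorted_a, sorted_b)
```

The index list is an ordinary value, so it can be stored, sliced, or applied
to any number of sequences of the same length.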
Implementing a generic "where function" can be achieved with a lambda:
  idx = where(lambda x: x == 47, alist)
or a list comprehension (this would be very similar to NumPy):
  idx = where([x == 47 for x in alist])
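
A minimal sketch of such a function, supporting both call styles, might look
like this. The signature is an assumption made for illustration; it is not an
existing standard library function, and NumPy's own where behaves differently
in detail.

```python
def where(predicate_or_mask, sequence=None):
    """Return the indices at which a condition holds.

    where(pred, seq) -> indices i for which pred(seq[i]) is true
    where(mask)      -> indices i for which mask[i] is true
    """
    if sequence is None:
        # One-argument form: the argument is an iterable of truth values.
        return [i for i, flag in enumerate(predicate_or_mask) if flag]
    # Two-argument form: the first argument is a callable predicate.
    return [i for i, item in enumerate(sequence) if predicate_or_mask(item)]


alist = [47, 3, 47, 8]
idx_from_lambda = where(lambda x: x == 47, alist)
idx_from_mask = where([x == 47 for x in alist])
print(idx_from_lambda, idx_from_mask)
```

Both calls produce the same index list, which could then be passed to a
container that supports fancy indexing.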
But to begin with, I think we should bring NumPy-style "fancy indexing" to
standard container types like list, tuple, string, bytes, bytearray, array
and deque. That would just be a handful of subclasses, and I think they
should (initially) be put in a standard library module, and possibly replace
the current containers in Python 4000.
But as for a where keyword: my opinion is a big -1, if I have the right to
vote. We should rather implement a where function and overload the mentioned
container types. The where function should go in the same module.
So all in all, I am +1 for a "where module" and -1 for a "where keyword".
P.S. I'll admit that dict and set might add to some confusion, since "fancy
indexing" would be ambiguous for them.
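
The dict case can be shown with a short example; the dictionary contents are
made up for illustration. A tuple is already a valid dict key, so d[1, 2]
cannot also mean "look up key 1 and key 2".

```python
d = {1: "one", 2: "two", (1, 2): "the pair"}

# Today, d[1, 2] is the same as d[(1, 2)]: one lookup with a tuple key.
as_single_key = d[1, 2]

# Under fancy indexing it would have to mean two separate lookups instead.
as_fancy_index = [d[k] for k in (1, 2)]

print(as_single_key, as_fancy_index)
```

Both readings are meaningful for this dict, which is where the ambiguity
comes from.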
Regards,
Sturla
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
http://mail.python.org/mailman/listinfo/python-ideas
