list.index() extension

I would like to propose an extra parameter `predicate` to list.index(), like this:

    def index(self, obj, predicate=operator.eq):
        for idx, item in enumerate(self):
            if predicate(item, obj):
                return idx
        raise IndexError

My main use-case is 2to3, where a node has to locate itself in its parent's node list by identity. Instead of writing out the search manually, as is done now, it would be nice to just write `self.parent.nodes.index(self, operator.is_)`.

I can also imagine this might be useful:

    print "The first number less than 5 in this list is %s" % (my_list.index(5, operator.lt),)
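For readers who want to try the proposed semantics without changing list, here is a minimal Python 3 sketch as a free function. The name `index_where` is hypothetical, and raising ValueError (which is what list.index() raises for a missing item) is an assumption rather than part of the proposal as posted.

```python
import operator

def index_where(seq, obj, predicate=operator.eq):
    """Return the index of the first item for which predicate(item, obj) is true."""
    for idx, item in enumerate(seq):
        if predicate(item, obj):
            return idx
    # Assumption: mirror list.index(), which raises ValueError when nothing matches.
    raise ValueError("no item in the sequence satisfies the predicate")

a, b = [1], [1]      # two equal but distinct objects
nodes = [a, b]
print(index_where(nodes, b))                       # equality finds a first: 0
print(index_where(nodes, b, operator.is_))         # identity finds b itself: 1
print(index_where([9, 7, 3, 8], 5, operator.lt))   # first item < 5 is at index 2
```

The identity case is the one that matters for the 2to3 use-case: equality stops at the first equal node, which need not be the node doing the lookup.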

On Sat, Apr 4, 2009 at 5:38 PM, Benjamin Peterson <benjamin@python.org> wrote:
    print "The first number less than 5 in this list is my_list[%d]=%s" % ((idx, elt) for idx, elt in enumerate(my_list) if elt < 5).next()

Okay, it's sort of ugly and rubyish, but I think it solves your case sufficiently that we don't need to change index(). If you can come up with a more pressing reason though, I'm all ears (and fingers, evidently).

--
Cheers, Leif

On 05Apr2009 11:36, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Isn't it trivially built on top of itertools.takewhile? -- Cameron Simpson <cs@zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/
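One possible reading of that suggestion, sketched in Python 3: count the items that come before the first match. The helper name `index_via_takewhile` is hypothetical, and it takes a one-argument predicate, which is also an assumption. Unlike list.index(), it returns len(lst) instead of raising when nothing matches.

```python
from itertools import takewhile

def index_via_takewhile(lst, predicate):
    # takewhile yields the leading items that do NOT satisfy the predicate;
    # counting them gives the position of the first item that does.
    return sum(1 for _ in takewhile(lambda item: not predicate(item), lst))

print(index_via_takewhile([9, 7, 3, 8], lambda x: x < 5))   # 2
print(index_via_takewhile([9, 7, 8], lambda x: x < 5))      # 3, i.e. len(lst): no match
```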

On Sat, Apr 4, 2009 at 7:51 PM, Benjamin Peterson <benjamin@python.org> wrote:
There's `next(itertools.ifilter(lambda x: x is my_obj, some_list))`, but that returns the object and not the index as I want.
Maybe you didn't understand what I meant. Try the following code, with your favorite list and object. I promise it works.

    try:
        print "The index of the first element that 'is obj' is %d." % (idx for idx, elt in enumerate(lst) if elt is obj).next()
    except StopIteration:
        print "obj is not in lst"

--
Cheers, Leif
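In Python 3 the same generator can be driven by the built-in next() with a default value, which removes the need for the try/except. This is only a sketch: the function name `identity_index` is hypothetical, and using None as the "not found" marker is an assumption.

```python
def identity_index(lst, obj):
    # Lazily produce the indexes of items that are the very same object as obj.
    matches = (idx for idx, elt in enumerate(lst) if elt is obj)
    # next() consumes only the first match; the default is returned if there is none.
    return next(matches, None)

first, second = [1], [1]   # equal but distinct objects
lst = [first, second]
print(identity_index(lst, second))    # 1
print(identity_index(lst, [1]))       # None: an equal list, but not the same object
```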

Le Sun, 5 Apr 2009 00:39:24 +0000 (UTC), Benjamin Peterson <benjamin@python.org> s'exprima ainsi:
An issue is, imo, that (idx for idx, elt in enumerate(lst) if elt is obj) builds a generator able to yield the index of every element that satisfies the condition, while all you want is the first one (as expressed by ".next()").

On the one hand, I have very few use cases for such a "conditional find". On the other hand, it's simple, practical and consistent. Adding an optional parameter, with '==' as default, as originally proposed by Benjamin, does no harm in the common case and does not break any existing code. If it were accepted, then it should rather go in the builtin find and count methods, for I see this as a semantic extension, not a fully different feature.

+0.5

Denis
------
la vita e estrany

On Sun, Apr 5, 2009 at 4:01 AM, spir <denis.spir@free.fr> wrote:
An issue is, imo, that (idx for idx, elt in enumerate(lst) if elt is obj) builds a generator able to yield the index of every element that satisfies the condition, while all you want is the first one (as expressed by ".next()").

It appears you're right, but the pain is only felt when the item is very close to the front of the list (and note that the ~2s penalty is over a million runs). I've attached my benchmark, and the results are below.

For a quick idea of what I did, I made a simple global find(lst, obj) that has the 'is' condition hard-coded (for my own sanity). I also used the generator example I posted above. Each version was asked to find the last element of a list. Times were averaged over 5 runs for each size, and the number of calls varied inversely to the size of the list (just so my computer doesn't burn out).

"""
Running tests, each output in secs, average of 5 calls.

10 item list, 1000000 calls:
gen: 5.534338
fun: 3.620335

100 item list, 100000 calls:
gen: 2.331217
fun: 2.258001

1000 item list, 10000 calls:
gen: 1.842553
fun: 2.014453

10000 item list, 1000 calls:
gen: 1.760938
fun: 1.925302

100000 item list, 100 calls:
gen: 1.783202
fun: 1.921334

1000000 item list, 10 calls:
gen: 1.798051
fun: 1.955751
"""

--
Cheers, Leif
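The attached benchmark is not reproduced here. The following Python 3 sketch shows how a comparison of this kind could be set up with timeit. The function names, list sizes, and call counts are all assumptions chosen to run quickly, so the timings it prints will not match the figures above.

```python
import timeit

def find(lst, obj):
    # Plain loop with the identity test hard-coded.
    for idx, elt in enumerate(lst):
        if elt is obj:
            return idx
    raise ValueError("obj is not in lst")

def gen_find(lst, obj):
    # Generator expression; next() stops at the first match.
    return next(idx for idx, elt in enumerate(lst) if elt is obj)

def compare(size, calls):
    lst = [object() for _ in range(size)]
    target = lst[-1]   # worst case: the last element
    gen = timeit.timeit(lambda: gen_find(lst, target), number=calls)
    fun = timeit.timeit(lambda: find(lst, target), number=calls)
    return gen, fun

for size, calls in [(10, 10000), (1000, 100)]:
    gen, fun = compare(size, calls)
    print("%d item list, %d calls: gen %.6f  fun %.6f" % (size, calls, gen, fun))
```

Keeping size times calls roughly constant, as in the original runs, makes the fixed per-call cost of creating the generator visible on short lists.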

On Sun, 5 Apr 2009 09:36:51 am Greg Ewing wrote:
Not to me. Surely itertools is for functions which return iterators, not arbitrary functions that take an iterable argument? It certainly does look like a useful function to have, but I don't know where in the standard library it should go, given that Guido dislikes grab-bag modules of miscellaneous functions. To my mind, that makes it a candidate to become a list/tuple/str method. I like Christian's suggestion. I've often wished for a built-in way to do element testing by identity instead of equality, and by "often" I mean occasionally. So I'm +1 on the proposal. -- Steven D'Aprano

On Sat, Apr 04, 2009, Benjamin Peterson wrote:
-1 -- it complicates the documentation too much for a small feature. Your function looks just fine, and I see no reason to add a new method. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." --Brian W. Kernighan

On Sat, Apr 4, 2009 at 6:55 PM, Aahz <aahz@pythoncraft.com> wrote:
Ditto. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Benjamin Peterson schrieb:
-1. The list API is the prototype of all mutable sequence APIs (as now fixed in the corresponding abc class). It is meant to be simple and easy to understand. Adding features to it should be done very carefully, and this seems like a random one of ten similar features that could be added. There's nothing wrong with a three-line helper function, or a list subclass that has the additional method you're looking for. cheers, Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
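The list-subclass route mentioned above might look like the following Python 3 sketch. The class name `PredicateList`, the method name `index_if`, and the choice of a one-argument predicate are all hypothetical. The built-in index() is left untouched.

```python
class PredicateList(list):
    """A list with one extra lookup method; everything else is inherited."""

    def index_if(self, predicate):
        # Return the position of the first item accepted by the predicate.
        for idx, item in enumerate(self):
            if predicate(item):
                return idx
        # Assumption: behave like list.index() when nothing is found.
        raise ValueError("no item satisfies the predicate")

values = PredicateList([9, 7, 3, 8])
print(values.index_if(lambda x: x < 5))   # 2
print(values.index(7))                    # 1, the ordinary list.index() still works
```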

On Sat, Apr 4, 2009 at 11:38 PM, Benjamin Peterson <benjamin@python.org> wrote:
It would be more natural to pass a single predicate function than an argument and a binary operator, as this is more general. I wouldn't mind predicate-based index and count somewhere in the standard library.

Meanwhile, here is yet another solution (although not so efficient).

    class match:
        def __init__(self, predicate):
            self.__eq__ = predicate

    range(10,-10,-1).index(match(lambda x: x < 5))
    range(10,-10,-1).count(match(lambda x: x < 5))

Fredrik

Fredrik Johansson schrieb:
Or we could finally give in and realize the need for fully generalized list operations, which were already proposed over 10 years ago. Sadly, the idea was never followed through and implemented: <http://tinyurl.com/d8xwee>. cheers, Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

Fredrik Johansson wrote:
I find that sort of elegant, but it won't work with new style classes as written. It needs to have the __eq__ on the class instead of on the instance:
Also, the examples given before using .next() look slightly less bad when written with the 2.6/3.0 next function:

    index = next(i for i, v in enumerate(range(-10, 10)) if v is 5)

-- Carl Johnson
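A sketch of the class-level variant described above, written for Python 3. The class name `Match` and the attribute name `predicate` are hypothetical. It assumes the items' own equality returns NotImplemented for unknown types (as int does), so that the comparison falls back to the wrapper's __eq__.

```python
class Match:
    def __init__(self, predicate):
        self.predicate = predicate

    # Defined on the class, so the == machinery actually finds it.
    def __eq__(self, other):
        return self.predicate(other)

numbers = list(range(10, -10, -1))               # 10, 9, ..., -9
print(numbers.index(Match(lambda x: x < 5)))     # 6: the first value below 5 is 4
print(numbers.count(Match(lambda x: x < 5)))     # 14: the values 4 down to -9
```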

participants (12)
- Aahz
- Benjamin Peterson
- Cameron Simpson
- Carl Johnson
- Christian Heimes
- Fredrik Johansson
- Georg Brandl
- Greg Ewing
- Guido van Rossum
- Leif Walsh
- spir
- Steven D'Aprano