Alternative to `enumerate` and `range(len(sequence))`: indexes() and entries()

It's very common to see: ``` for i, x in enumerate(sequence): [...] ``` and also to see ``` for i in range(len(sequence)): [...] ``` I propose to introduce for sequences the methods `indexes()` and `entries()`. They are similar to `keys()` and `items()` for maps. I changed the names for not doing a breaking change. This way the two codes will became simply: ``` for i, x in sequence.entries(): [...] for i in sequence.indexes(): [...] ``` sequence.indexes() could return: 1. a SequenceIndexesView, similar to the views of `dict` 2. a simple range() object The same for entries(). It should be: 1. a SequenceEntriesView 2. a generator that yields the tuples (index, value) What do you think about?

On Fri, Dec 27, 2019 at 1:23 PM Marco Sulla via Python-ideas <python-ideas@python.org> wrote:
I propose to introduce for sequences the methods `indexes()` and `entries()`.
"Sequence" is a protocol, not a class. Adding a method to sequences actually means mandating that everything that calls itself a sequence now has to implement this. Far better to create it as a stand-alone function - and for that, enumerate() is usually fine (esp since it handles arbitrary iterables, not just sequences). ChrisA

On Fri, Dec 27, 2019 at 02:22:48AM -0000, Marco Sulla via Python-ideas wrote:
Yes, that is very common, except that "sequence" can be any iterable with an unpredictable length, or even an infinite length, such as a generator: def gen(): while some_condition: yield some_value and enumerate will still work.
That, on the other hand, should not be common.
I propose to introduce for sequences the methods `indexes()` and `entries()`.
Hmm, well, that will cause a lot of code duplication. Just in the built-ins, we would have to add those two methods to: - dicts - dict views (keys, values and items) - sets - frozensets - lists - tuples - strings - bytes - bytearrays (did I miss any?). Even if the methods can share an implementation (probably calling enumerate and range behind the scenes) that's still over twenty additional methods to be added, documented, tested and maintained. What benefit does this change give us? for i, x in enumerate(items): # works with every iterable for i, x in items.entries(): # only works with some sequences So we save one character when typing. We don't simplify the language, we make it more complicated: def function(iterable): if hasattr(iterable, "entries"): entries = iterable.entries() else: entries = enumerate(iterable) for i, x in entries: ... So to save one character, we add four lines. Yeah, nobody is going to do that. They'll just use the technique that works on all iterables, not the one that works on only some of them. for i in range(len(sequence)): for i in sequence.indexes(): Here we save two characters and one function call. The cost is, every sequence type has to implement the method. I don't see enough benefit to force every sequence to implement two new methods. -- Steven

Steven D'Aprano wrote:
There's no problem. Also iterators and generators can have a entries() method that returns a view, or a generator.
Yes and no: you added too much! dict views, set and frozenset are not sequences, and furthermore does not require such methods. You missed memoryview. About maps, well... wait and read ^^ And even if they are not sequences, I think also file-like objects can benefit from an entries() method: ``` with open(somepath) as f: for i, line in f.entries(): [...] ```
. Even if the methods can share an implementation (probably calling enumerate and range behind the scenes)
This can be a fast and furious implementation for a first draft, but IMHO they should return a view, like for maps.
for i, x in enumerate(items): # works with every iterable for i, x in items.entries(): # only works with some sequences
Well, enumerate does work with **almost** every iterable. It does not work with maps, you have to use keys(). You can say <<you can still use enumerate with a map>>, but for another purpose. Currently, if you want to iterate over a map items or over a sequence, iterable, generator and file entries, you already have to use two different APIs. Maybe we can simply add an alias for keys() and entries() for maps, so we have really them for all iterables. Furthermore, there's also an unexpected bonus for enumerate: enumerate could try to return entries() for not maps as first action. I don't know if enumerate is written in py or in c, but if it's written in py, this will be boost enumerate. IMHO, indexes() could be simply added to all sequences without problems. entries(), since it can be applied to not sequences too, IMHO requires another CPython type: Enumerable. Who inherits also Enumerable must implement an entries() method. This could be also an ABC class.

On 12/27/19 10:16 AM, Marco Sulla via Python-ideas wrote:
I'm not sure what you mean by this. enumerate works fine for dicts: >>> list(enumerate({'a':1, 'b':2})) [(0, 'a'), (1, 'b')] I think there's an overarching philosophy here: Python generally doesn't offer two names for the same operation. Is this expressed anywhere? For example, list has .pop(), but it doesn't have .push(). Why? Because .append() is for pushing when using the list as a stack. --Ned.

On Fri, Dec 27, 2019 at 03:16:50PM -0000, Marco Sulla via Python-ideas wrote:
Of course it's a problem. Who is going to go out and force developers of a thousand different libraries and classes to add your two methods to their classes? It's no problem for *you* because you aren't doing the work. You are asking them to do more work, for no benefit. Your two methods adds no new functionality beyond what is already supplied by enumerate.
Also iterators and generators can have a entries() method
The is no such type as "iterator" and no central place to add the method. Iterator is defined by a protocol: any object with a `__next__` method and an `__iter__` method that returns self is an iterator. It is an intentionally simple protocol. Forcing iterators to also include an extra two methods just adds unnecessary complexity to the protocol. It means that everyone who writes an iterator class will be forced to support your two methods, even if they have no need for them. And nobody has need for them, because they do nothing that enumerate doesn't already do. Your entries method is just an alternative spelling for enumerate().
Yes and no: you added too much! dict views, set and frozenset are not sequences
Doesn't matter. We can iterate over dict views and sets, and we can enumerate *any* iterable. for i, x in enumerate(obj) works perfectly well when obj is a set, a dict view, an iterator, a sequence, a dict, an open file, or any other iterable object. That's the point: enumerate works with all iterables, without exception. Your extra methods do not. Why would anybody use your methods instead of the idiom that is backwards and forwards compatible, and that works with all iterables, not just a few of them? Yes, *all* iterables, including mappings like dicts: py> for i, x in enumerate({'a': 10, 'b': 20, 'c': 30}): ... print(i, x) ... 0 a 1 b 2 c
Well, enumerate does work with **almost** every iterable. It does not work with maps, you have to use keys().
You don't have to use keys(). [...]
What benefit do you get from having the indexes be in a view?
That's incorrect. The for-loop API for target in iterable works perfectly well for all iterables, including mappings, dict views, sequences, containers, iterators, generators, files, strings and arrays. -- Steven

You want Lua. and pairs(). I have my own pairs() implementation as part of my library. It handles the following: - Sequence becomes enumerate(seq) - Mapping becomes seq.items() - Set becomes ((x,x) for x in set) - Everything else is an error. (Arguably Set should be (x, None) or (x, True) or any number of things, but I like the semantics of (x, x) personally.) But I don't care too much about it being part of python. (altho it'd be nice if you could do set[x] to get the original x out of the set, for purposes of interning and uhh "hacks".) On 2019-12-27 12:16 p.m., Marco Sulla via Python-ideas wrote:

On Fri, Dec 27, 2019 at 1:23 PM Marco Sulla via Python-ideas <python-ideas@python.org> wrote:
I propose to introduce for sequences the methods `indexes()` and `entries()`.
"Sequence" is a protocol, not a class. Adding a method to sequences actually means mandating that everything that calls itself a sequence now has to implement this. Far better to create it as a stand-alone function - and for that, enumerate() is usually fine (esp since it handles arbitrary iterables, not just sequences). ChrisA

On Fri, Dec 27, 2019 at 02:22:48AM -0000, Marco Sulla via Python-ideas wrote:
Yes, that is very common, except that "sequence" can be any iterable with an unpredictable length, or even an infinite length, such as a generator: def gen(): while some_condition: yield some_value and enumerate will still work.
That, on the other hand, should not be common.
I propose to introduce for sequences the methods `indexes()` and `entries()`.
Hmm, well, that will cause a lot of code duplication. Just in the built-ins, we would have to add those two methods to: - dicts - dict views (keys, values and items) - sets - frozensets - lists - tuples - strings - bytes - bytearrays (did I miss any?). Even if the methods can share an implementation (probably calling enumerate and range behind the scenes) that's still over twenty additional methods to be added, documented, tested and maintained. What benefit does this change give us? for i, x in enumerate(items): # works with every iterable for i, x in items.entries(): # only works with some sequences So we save one character when typing. We don't simplify the language, we make it more complicated: def function(iterable): if hasattr(iterable, "entries"): entries = iterable.entries() else: entries = enumerate(iterable) for i, x in entries: ... So to save one character, we add four lines. Yeah, nobody is going to do that. They'll just use the technique that works on all iterables, not the one that works on only some of them. for i in range(len(sequence)): for i in sequence.indexes(): Here we save two characters and one function call. The cost is, every sequence type has to implement the method. I don't see enough benefit to force every sequence to implement two new methods. -- Steven

Steven D'Aprano wrote:
There's no problem. Also iterators and generators can have a entries() method that returns a view, or a generator.
Yes and no: you added too much! dict views, set and frozenset are not sequences, and furthermore does not require such methods. You missed memoryview. About maps, well... wait and read ^^ And even if they are not sequences, I think also file-like objects can benefit from an entries() method: ``` with open(somepath) as f: for i, line in f.entries(): [...] ```
. Even if the methods can share an implementation (probably calling enumerate and range behind the scenes)
This can be a fast and furious implementation for a first draft, but IMHO they should return a view, like for maps.
for i, x in enumerate(items): # works with every iterable for i, x in items.entries(): # only works with some sequences
Well, enumerate does work with **almost** every iterable. It does not work with maps, you have to use keys(). You can say <<you can still use enumerate with a map>>, but for another purpose. Currently, if you want to iterate over a map items or over a sequence, iterable, generator and file entries, you already have to use two different APIs. Maybe we can simply add an alias for keys() and entries() for maps, so we have really them for all iterables. Furthermore, there's also an unexpected bonus for enumerate: enumerate could try to return entries() for not maps as first action. I don't know if enumerate is written in py or in c, but if it's written in py, this will be boost enumerate. IMHO, indexes() could be simply added to all sequences without problems. entries(), since it can be applied to not sequences too, IMHO requires another CPython type: Enumerable. Who inherits also Enumerable must implement an entries() method. This could be also an ABC class.

On 12/27/19 10:16 AM, Marco Sulla via Python-ideas wrote:
I'm not sure what you mean by this. enumerate works fine for dicts: >>> list(enumerate({'a':1, 'b':2})) [(0, 'a'), (1, 'b')] I think there's an overarching philosophy here: Python generally doesn't offer two names for the same operation. Is this expressed anywhere? For example, list has .pop(), but it doesn't have .push(). Why? Because .append() is for pushing when using the list as a stack. --Ned.

On Fri, Dec 27, 2019 at 03:16:50PM -0000, Marco Sulla via Python-ideas wrote:
Of course it's a problem. Who is going to go out and force developers of a thousand different libraries and classes to add your two methods to their classes? It's no problem for *you* because you aren't doing the work. You are asking them to do more work, for no benefit. Your two methods adds no new functionality beyond what is already supplied by enumerate.
Also iterators and generators can have a entries() method
The is no such type as "iterator" and no central place to add the method. Iterator is defined by a protocol: any object with a `__next__` method and an `__iter__` method that returns self is an iterator. It is an intentionally simple protocol. Forcing iterators to also include an extra two methods just adds unnecessary complexity to the protocol. It means that everyone who writes an iterator class will be forced to support your two methods, even if they have no need for them. And nobody has need for them, because they do nothing that enumerate doesn't already do. Your entries method is just an alternative spelling for enumerate().
Yes and no: you added too much! dict views, set and frozenset are not sequences
Doesn't matter. We can iterate over dict views and sets, and we can enumerate *any* iterable. for i, x in enumerate(obj) works perfectly well when obj is a set, a dict view, an iterator, a sequence, a dict, an open file, or any other iterable object. That's the point: enumerate works with all iterables, without exception. Your extra methods do not. Why would anybody use your methods instead of the idiom that is backwards and forwards compatible, and that works with all iterables, not just a few of them? Yes, *all* iterables, including mappings like dicts: py> for i, x in enumerate({'a': 10, 'b': 20, 'c': 30}): ... print(i, x) ... 0 a 1 b 2 c
Well, enumerate does work with **almost** every iterable. It does not work with maps, you have to use keys().
You don't have to use keys(). [...]
What benefit do you get from having the indexes be in a view?
That's incorrect. The for-loop API for target in iterable works perfectly well for all iterables, including mappings, dict views, sequences, containers, iterators, generators, files, strings and arrays. -- Steven

You want Lua. and pairs(). I have my own pairs() implementation as part of my library. It handles the following: - Sequence becomes enumerate(seq) - Mapping becomes seq.items() - Set becomes ((x,x) for x in set) - Everything else is an error. (Arguably Set should be (x, None) or (x, True) or any number of things, but I like the semantics of (x, x) personally.) But I don't care too much about it being part of python. (altho it'd be nice if you could do set[x] to get the original x out of the set, for purposes of interning and uhh "hacks".) On 2019-12-27 12:16 p.m., Marco Sulla via Python-ideas wrote:
participants (5)
-
Chris Angelico
-
Marco Sulla
-
Ned Batchelder
-
Soni L.
-
Steven D'Aprano