Syntax for key-value iteration over mappings
Hello,

Currently, the way to iterate over keys and values of a mapping is to call items() and iterate over the resulting view::

    for key, value in a_dict.items():
        print(key, value)

I believe that looping over all the data in a dict is a very important operation, and I find myself writing this quite often. Every time I do, it seems it's boilerplate; it looks like a workaround rather than a preferred way of doing things.

In dict comprehensions and literals, key-value pairs are separated by colons. How about allowing that in for loops as well?

    for key: value in a_dict:
        print(key, value)

I argue that to anyone familiar with dict literals, let alone dict comprehensions, the semantics of this loop should be pretty obvious. In dict comprehensions, similarity to existing syntax becomes even more clear:

    a_mapping = {1: 'one', 2: 'two'}
    inverse = {val: key for key: val in a_mapping}

I've bounced this idea off a few EuroPython sprinters, and got some questions/concerns I can answer here:

* But, the colon is supposed to start a block!

Well, it's already used in dict comprehensions/literals (though it's true that there it's always inside brackets). And in lambdas: here's code that is legal today (though not very practical):

    while lambda: True:
        break

* There's supposed to be only one obvious way to do it! We already have .items()!

I don't think this stops us from adding a new way of doing things which is more obvious than the old, and which should become the one way. After all, you don't say "for key in mapping.keys():", even though the keys() method exists.

* What exactly would it do?

There are multiple options:
- loop over .keys() and use __getitem__ each time, like the dict() constructor?
- loop over .items(), like most of the code used today?
- become a well-specified "key/value iteration protocol" with __iteritems__() and its own bytecode operation?

But here I'm asking if building this bikeshed sounds useful, rather than what paint to buy.

That said, I do have a proof of concept implementation of the second option, in case you'd like to play around with this:
Github: https://github.com/encukou/cpython/tree/keyval-iteration
patch: https://github.com/encukou/cpython/commit/b9b0d973342280f0ef52e26a4b67f326ec...
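For comparison, here is a minimal sketch of how the examples from this message are spelled in Python as it exists today. The `for key: value in ...` form is not valid in any released Python, so only the .items() spelling appears as running code; the `pairs` and `ran` names are added purely for illustration.

```python
a_mapping = {1: 'one', 2: 'two'}

# Today's spelling: iterate over the items() view and unpack each pair.
pairs = []
for key, value in a_mapping.items():
    pairs.append((key, value))

# The comprehension from the proposal, written with .items()
# instead of the proposed "key: val" target.
inverse = {val: key for key, val in a_mapping.items()}

# The "colon already appears elsewhere" example is legal today.
# The lambda object itself is truthy, so the body runs once and breaks.
ran = False
while lambda: True:
    ran = True
    break

print(pairs)
print(inverse)
```

The proposal would in effect drop the `.items()` call from the first two constructs and replace the comma in the loop target with a colon.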
I'd love that, because I find .items() quite cumbersome as well whenever I have to use it. I'd like to know if there was some reason not to introduce this in the first place.

Best,
Sven
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
Cool suggestion, but I prefer how things are. (As an aside, calling getitem each time is not efficient.)
On Mon, Jul 27, 2015 at 2:56 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
Cool suggestion, but I prefer how things are.
(As an aside, calling getitem each time is not efficient.)
It is. But, that's what dict.update() does for mappings. With a dedicated key-value iteration protocol, that could be sped up :)
On 26/07/2015 17:09, Petr Viktorin wrote:
* There's supposed to be only one obvious way to do it! We already have .items()!
I don't think this stops us from adding a new way of doing things which is more obvious than the old, and which should become the one way. After all, you don't say "for key in mapping.keys():", even though the keys() method exists.
You just might, if you modified the dictionary in the loop body, and you wanted to process the original list of keys but didn't need to remember the original values and wanted to avoid the overhead of copying the values.
* What exactly would it do?
There are multiple options:
- loop over .keys() and use __getitem__ each time, like the dict() constructor?
- loop over .items(), like most of the code used today?
- become a well-specified "key/value iteration protocol" with __iteritems__() and its own bytecode operation?
But here I'm asking if building this bikeshed sounds useful, rather than what paint to buy.
I like it! It seems so intuitive that, like Sven, I wonder why it's not already in the language. As far as I can see it doesn't introduce any ambiguities. I am thinking of code such as

    for k,j : x,(y,z), in complicated_expression:

I would guess (from a position of complete ignorance) that there would be no *insuperable* difficulty in parsing this.

I suggest (without feeling strongly about it) that optional parentheses should be allowed for stylistic reasons, i.e.

    for ( k : v ) in a_dict:

[I thought about

    for { k : v } in a_dict:

before I realised that this is currently legal, albeit (probably) nonsensical, syntax. [Python 2.7.3]]

One downside: Whatever implementation is chosen, it will not be "the one obvious way to do it". E.g. an .iteritems()-like implementation will fail if the dictionary is modified during the loop. An .items()-like implementation will be expensive on a huge dictionary.

As there are already several ways of iterating over a dictionary, I think the new construct should be semantically equivalent to one of the existing ways, so that we don't have yet another behaviour to learn. My bikeshed colour is that it be equivalent to using .items(), as I think this is least likely to trip up newbies (it won't raise an error if the dictionary is modified); YMMV.

Rob Cliffe
On Jul 27, 2015, at 03:52, Rob Cliffe <rob.cliffe@btinternet.com> wrote:
One downside: Whatever implementation is chosen, it will not be "the one obvious way to do it". E.g. an .iteritems()-like implementation will fail if the dictionary is modified during the loop. an .items()-like implementation will be expensive on a huge dictionary.
Why? The items method returns a view, an object that's backed by the dict itself. There is a bit of overhead, but it's constant, not linear on the dict size. You may be thinking of 2.7, where items creates a list of pairs. In 3.x, it's equivalent to the 2.7 viewitems, not the 2.7 items.
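The difference can be seen in a few lines. This is only a sketch for Python 3; the dict contents and the `_copy` key suffix are made up for the demonstration.

```python
d = {'a': 1, 'b': 2}

# In Python 3, items() returns a view backed by the dict, not a copied list.
view = d.items()
d['c'] = 3                  # a later change shows through the same view object
seen = sorted(view)

# Iterating the view while resizing the dict raises RuntimeError ...
error = None
try:
    for key, value in d.items():
        d[key + '_copy'] = value
except RuntimeError as exc:
    error = str(exc)

# ... whereas iterating over a snapshot (a linear-time copy) is safe.
for key, value in list(d.items()):
    d[key.upper()] = value

print(seen)
print(error)
```

So the 3.x view is cheap to create but does not tolerate resizing during the loop; taking a `list(...)` snapshot trades memory for that safety, much like the 2.7 items().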
On Sun, Jul 26, 2015 at 06:09:17PM +0200, Petr Viktorin wrote:
Hello, Currently, the way to iterate over keys and values of a mapping is to call items() and iterate over the resulting view::
for key, value in a_dict.items(): print(key, value)
I believe that looping over all the data in a dict is a very important operation, and I find myself writing this quite often. Every time I do, it seems it's boilerplate;
What part looks like boilerplate? The "for"? The "key,value"? The "in"? The "a_dict"? If none of them are boilerplate, why would ".items()" be boilerplate?
it looks like a workaround rather than a preferred way of doing things.
A work-around for what? It can't be "work-around for lack of a way to get the (key,value) pairs from a dict", because the items() method *is* the preferred way to get the (key,value) pairs from a dict, and has been since Python 1.5 or even older. I don't think that describing an explicit call to items() method as "boilerplate" or "a work-around" can be justified. If it is either, then the terms are so meaningless that they could be applied to anything at all.
In dict comprehensions and literals, key-value pairs are separated by colons. How about allowing that in for loops as well?
for key: value in a_dict: print(key, value)
A very strong -1 to this. It's ugly and unattractive. "for x:" looks like the end of a statement, not the beginning of one. Yes, as you point out, we can already write a similarly ugly statement "while lambda: None:" but nobody does, and just because existing syntax accidentally allows one ugly construct doesn't give an excuse to deliberately add an ugly construct.

It's one more special case syntax for beginners to learn. And it really is a special case: there's nothing about "for k:v in iterable" that tells you that iterable must have an items() method. You have to memorise that fact.

Being a special case, you can only use this for iterables that have an items() method. You can't do:

    for k:v in [(1, 'a'), (2, 'b')]: ...

because the list doesn't have an items() method.

In dict literals and dict comprehensions, the k:v syntax is only used to construct the dict, not to extract items from it. We have a standard way of doing sequence bindings:

    a, b = ...  # right-hand side must be a sequence of two items

and the standard way of extracting (key, value) pairs from a mapping is the items() method. If you know that a mapping has only one item, we can even write:

    [[key, value]] = mapping.items()

and sequence unpacking will do the work for us. Do you expect this to work too?

    [key:value] = mapping

This proposed syntactic sugar doesn't add any new functionality or make anything simpler. It just saves you eight keystrokes in one special case.

-- Steve
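The single-item unpacking idiom mentioned above can be tried directly. A small sketch, with illustrative dict contents:

```python
mapping = {'spam': 42}

# Nested target list: the outer brackets unpack the one-element items() view,
# the inner brackets unpack the (key, value) tuple that it yields.
[[key, value]] = mapping.items()

# With more than one item, the same statement raises ValueError.
error = None
try:
    [[k, v]] = {'a': 1, 'b': 2}.items()
except ValueError as exc:
    error = type(exc).__name__

print(key, value, error)
```

The unpacking works on any iterable of pairs, which is the point being made: nothing here depends on dict-specific syntax.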
On Mon, Jul 27, 2015 at 12:12:09PM +1000, Steven D'Aprano wrote: [...]
It's one more special case syntax for beginners to learn. And it really is a special case: there's nothing about "for k:v in iterable" that tells you that iterable must have an items() method. You have to memorise that fact.
I forgot to say, "or whatever implementation you choose for this syntax". It doesn't necessarily have to be calling the items() method, although the proof of concept given does that. The principle applies either way. -- Steve
On Mon, Jul 27, 2015 at 4:12 AM, Steven D'Aprano <steve@pearwood.info> wrote:
It's one more special case syntax for beginners to learn. And it really is a special case: there's nothing about "for k:v in iterable" that tells you that iterable must have an items() method. You have to memorise that fact.
This I think is a strong argument. What error would you get when it's the wrong type? An attribute error on .items(), or a special SyntaxError "This syntax can only be used on mappings". Both are quite incomprehensible unless you know exactly what is going on and that this is a shortcut for "for x,y in foo.items():"
On 27.07.2015 13:42, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 4:12 AM, Steven D'Aprano <steve@pearwood.info> wrote:
It's one more special case syntax for beginners to learn. And it really is a special case: there's nothing about "for k:v in iterable" that tells you that iterable must have an items() method. You have to memorise that fact.
This I think is a strong argument.
I cannot follow. There is nothing about 'await' that tells me it can only be used with coroutines. I need to memorize that fact, too.
What error would you get when it's the wrong type? An attribute error on .items(), or a special SyntaxError "This syntax can only be used on mappings".
I would like such an error, because it tells me that this is not what I wanted. The current method silently works and I get an error later. I value seeing an error as soon as possible. Btw., whether the proposed syntax is appropriate is another issue. But I would love to see an improvement in this area.
Both are quite incomprehensible unless you know exactly what is going on and that this is a shortcut for "for x,y in foo.items():"
Same goes for .items(). It took some time to internalize this special case (at least from my perspective).
On Mon, Jul 27, 2015 at 5:30 PM, Sven R. Kunze <srkunze@mail.de> wrote:
I cannot follow. There is nothing about 'await' that tells me it can only be used with coroutines. I need to memorize that fact, too.
No, because you get a syntax error when you use it incorrectly, so you don't need to memorize that. But here it works only with specific types.
I would like such an error. Because it tells me that it is not what I wanted. The current method silently works and I get an error later.
Well, that is going to be the case now as well, you can't get away from that.
Both are quite incomprehensible unless you know exactly what is going on and that this is a shortcut for "for x,y in foo.items():"
Same goes for .items(). It took some time to internalize this special case (at least from my perspective).
Sure, but now you have to learn what it is a special case of. All you did was hide that it calls .items(), so the error message "foo does not have an attribute 'items'" becomes harder to understand. You would need to change that error to something else. And it really should be, as you say, a SyntaxError, but it's a SyntaxError that can only be raised at runtime. Which I think breaks most people's understanding of what a SyntaxError is...
//Lennart
On 27.07.2015 17:45, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 5:30 PM, Sven R. Kunze <srkunze@mail.de> wrote:
I cannot follow. There is nothing about 'await' that tells me it can only be used with coroutines. I need to memorize that fact, too.
No, because you get a syntax error when you use it incorrectly, so you don't need to memorize that. But here it works only with specific types.
What's the difference?
Well, that is going to be the case now as well, you can't get away from that.
Is it? I don't think so. There are many cases where this is not the case.
Both are quite incomprehensible unless you know exactly what is going on and that this is a shortcut for "for x,y in foo.items():"
Same goes for .items(). It took some time to internalize this special case (at least from my perspective).
Sure, but now you have to learn what it is a special case of. All you did was hide that it calls .items(), so the error message "foo does not have an attribute 'items'" becomes harder to understand. You would need to change that error to something else. And it really should be, as you say, a SyntaxError, but it's a SyntaxError that can only be raised at runtime. Which I think breaks most people's understanding of what a SyntaxError is...
Nobody said it should be either. That is a tiny detail, and of course it should be a comprehensible error message.
Btw. no newbie really knows what happens if they execute the default 'for' loop. You could say as well: "don't implement 'for' loops because they hide the fact of calling 'next'".
On Mon, Jul 27, 2015 at 6:02 PM, Sven R. Kunze <srkunze@mail.de> wrote:
On 27.07.2015 17:45, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 5:30 PM, Sven R. Kunze <srkunze@mail.de> wrote:
I cannot follow. There is nothing about 'await' that tells me it can only be used with coroutines. I need to memorize that fact, too.
No, because you get a syntax error when you use it incorrectly, so you don't need to memorize that. But here it works only with specific types.
What's the difference?
Well, for one, one is a runtime error and the other is not.
Well, that is going to be the case now as well, you can't get away from that.
Is it? I don't think so. There are many cases where this is not the case.
No, there isn't. The proposed syntax will work if the variable is a mapping, but fail if it is any other type. The type will *only* be known once it's time to execute that statement. But sure, the same goes for "for x in y:" really. That only works with iterables. So maybe this isn't a problem.
On Mon, Jul 27, 2015 at 05:30:38PM +0200, Sven R. Kunze wrote:
On 27.07.2015 13:42, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 4:12 AM, Steven D'Aprano <steve@pearwood.info> wrote:
It's one more special case syntax for beginners to learn. And it really is a special case: there's nothing about "for k:v in iterable" that tells you that iterable must have an items() method. You have to memorise that fact.
This I think is a strong argument.
I cannot follow. There is nothing about 'await' that tells me it can only be used with coroutines. I need to memorize that fact, too.
Yes, and you need to memorise what "for" loops do, and "len()", etc. But if you know English, the name is an aid to memory. There's no aid to memory with a:b syntax, and googling for it will be a pain.

Not everything is important enough to be given its own syntax. That way leads past Perl and into APL. (At least APL tries to follow standard mathematical notation, rather than being a collection of arbitrary symbols.)

`await` gives us a whole lot of new functionality that was hard or impossible to do before. What does this give us that we couldn't do before? What's so special about spam.items() that it needs dedicated syntax for it?

These are not rhetorical questions. If you can answer those positively, then I'll reconsider my opposition to this. But if the only thing this syntax gains us is to avoid an explicit call to .items(), then it just adds unnecessary cruft to the language.
What error would you get when it's the wrong type? An attribute error on .items(), or a special SyntaxError "This syntax can only be used on mappings".
I would like such an error. Because it tells me that it is not what I wanted. The current method silently works and I get an error later.
I don't understand you. If you write `for k,v in spam.items()` and spam has no items method, you get an AttributeError immediately. How does it silently work? -- Steve
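The immediate failure is easy to reproduce. In this sketch `spam` is deliberately a list of pairs rather than a mapping:

```python
spam = [('key', 'value')]       # a list of pairs, not a mapping

message = None
try:
    for k, v in spam.items():   # fails before the loop body ever runs
        print(k, v)
except AttributeError as exc:
    message = str(exc)

print(message)
```

The exception is raised while evaluating `spam.items()`, so no iteration takes place at all.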
On Mon, Jul 27, 2015 at 1:42 PM, Lennart Regebro <regebro@gmail.com> wrote:
On Mon, Jul 27, 2015 at 4:12 AM, Steven D'Aprano <steve@pearwood.info> wrote:
It's one more special case syntax for beginners to learn. And it really is a special case: there's nothing about "for k:v in iterable" that tells you that iterable must have an items() method. You have to memorise that fact.
This I think is a strong argument.
What error would you get when it's the wrong type? An attribute error on .items(), or a special SyntaxError "This syntax can only be used on mappings". Both are quite incomprehensible unless you know exactly what is going on and that this is a shortcut for "for x,y in foo.items():"
I think that should be "TypeError: 'foo' object is not a mapping" – similarly to:
    >>> for x in 123:
    ...     pass
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
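One way such an error could look, emulated in today's Python. The helper name `key_value_pairs` and the use of `collections.abc.Mapping` as the type check are assumptions made for this sketch; the thread does not specify how the proof of concept decides what counts as a mapping.

```python
from collections.abc import Mapping

def key_value_pairs(obj):
    """Hypothetical stand-in for 'for k: v in obj' (not a real API)."""
    if not isinstance(obj, Mapping):
        # Mirrors the wording of "'int' object is not iterable".
        raise TypeError(f"{type(obj).__name__!r} object is not a mapping")
    return iter(obj.items())

result = list(key_value_pairs({'a': 1}))

message = None
try:
    key_value_pairs([1, 2])
except TypeError as exc:
    message = str(exc)

print(result)
print(message)
```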
On Mon, Jul 27, 2015 at 12:12 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Being a special case, you can only use this for iterables that have an items() method. You can't do:
for k:v in [(1, 'a'), (2, 'b')]: ...
because the list doesn't have an items() method.
Here's a crazy alternative: Generalize it to subsume the common use of enumerate(). Iterate over a dict thus:

    for name:obj in globals():
        # do something with the key and/or value

And iterate over a list, generator, or any other simple linear iterable thus:

    for idx:val in sys.argv:
        # do something with the arg and its position

In other words, the two-part iteration mode gives you values *and their indices*. If an object declares its own way of doing this, it provides the keys and values itself; otherwise, the default is equivalent to passing it through enumerate, so you'll get sequential numbers from zero.

I don't know that this is a *good* idea (for one thing, simple iteration is equivalent to the first part for a dict, but the second part for everything else), but it does give a plausible meaning to two-part iteration that isn't over a dictionary.

ChrisA
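A rough emulation of these semantics as an ordinary function. The name `two_part` is invented here, and treating "has a callable items()" as "declares its own way of doing this" is only one possible reading of the suggestion.

```python
def two_part(obj):
    """Hypothetical helper approximating the suggested 'a:b' iteration."""
    items = getattr(obj, 'items', None)
    if callable(items):
        # The object provides its own keys and values.
        return iter(items())
    # Default: sequential indices from zero, as with enumerate().
    return enumerate(obj)

from_dict = list(two_part({'x': 1, 'y': 2}))
from_list = list(two_part(['prog', '-v']))
from_gen = list(two_part(c for c in 'ab'))

print(from_dict)
print(from_list)
print(from_gen)
```

With this reading, `for idx, val in two_part(sys.argv)` would behave like `enumerate(sys.argv)`, and `for name, obj in two_part(globals())` like `globals().items()`.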
Here's a crazy alternative: Generalize it to subsume the common use of enumerate(). Iterate over a dict thus:
    for name:obj in globals():
        # do something with the key and/or value
And iterate over a list, generator, or any other simple linear iterable thus:
    for idx:val in sys.argv:
        # do something with the arg and its position
In other words, the two-part iteration mode gives you values *and their indices*. If an object declares its own way of doing this, it provides the keys and values itself; otherwise, the default is equivalent to passing it through enumerate, so you'll get sequential numbers from zero.
Well, it may well be crazy, but somewhere deep inside I actually quite like it. Certainly more than a special syntax that only works on dicts, and it's quite a common use case, IMO.
On Tue, Jul 28, 2015 at 01:19:48AM +1000, Chris Angelico wrote:
Here's a crazy alternative: Generalize it to subsume the common use of enumerate(). Iterate over a dict thus:
    for name:obj in globals():
        # do something with the key and/or value
And iterate over a list, generator, or any other simple linear iterable thus:
    for idx:val in sys.argv:
        # do something with the arg and its position
Yep, that's a crazy alternative alright :-)

Okay, so we start with this:

    mapping = {'key': 'value', ...}
    for name:obj in mapping:
        log(name)
        process_some_object(obj)

Then, one day, somebody passes this as mapping:

    mapping = [('key', 'value'), ...]

and the only hint that something has gone wrong is that your logs contain 0 1 2 3 ... instead of the expected names. That will be some fun debugging, I'm sure.

-- Steve
On Tue, Jul 28, 2015 at 1:39 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Okay, so we start with this:
    mapping = {'key': 'value', ...}
    for name:obj in mapping:
        log(name)
        process_some_object(obj)
Then, one day, somebody passes this as mapping:
    mapping = [('key', 'value'), ...]
and the only hint that something has gone wrong is that your logs contain 0 1 2 3 ... instead of the expected names. That will be some fun debugging, I'm sure.
Except that that transformation already wouldn't work. How do you currently do the iteration over a dictionary?

    # Boom! AttributeError.
    for name,obj in mapping.items():

    # Completely different semantics
    for name in mapping:
        obj = mapping[name]

I don't know of any iteration method that's oblivious to the difference between a dict and a list of pairs; can you offer (toy) usage examples?

ChrisA
On Tue, Jul 28, 2015 at 01:54:09AM +1000, Chris Angelico wrote:
On Tue, Jul 28, 2015 at 1:39 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Okay, so we start with this:
    mapping = {'key': 'value', ...}
    for name:obj in mapping:
        log(name)
        process_some_object(obj)
Then, one day, somebody passes this as mapping:
    mapping = [('key', 'value'), ...]
and the only hint that something has gone wrong is that your logs contain 0 1 2 3 ... instead of the expected names. That will be some fun debugging, I'm sure.
Except that that transformation already wouldn't work.
Exactly! The fact that it *doesn't work* with an explicit call to .items() is a good thing. You get an immediate error, the code doesn't silently do the wrong thing. Your suggestion silently does the wrong thing.

If you want to support iteration over both mappings and sequences of (key,value) tuples, you need to make a deliberate decision to do so. You might use a helper function:

    for name, obj in pairwise_mapping(items):
        ...

where pairwise_mapping contains the smarts to handle mappings and (key,value) tuples. And that's fine, because it is deliberate and explicit, not an accident of the syntax.

In effect, your suggestion makes the a:b syntax a "Do What I Mean" operation. It tries to guess whether you want to call expr.items() or enumerate(expr). Building DWIM into the language is probably not a good idea.
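The message names pairwise_mapping but does not define it. One possible shape for such a helper is sketched below; the body, and the choice of `collections.abc.Mapping` for the type test, are assumptions rather than anything given in the thread.

```python
from collections.abc import Mapping

def pairwise_mapping(items):
    """Yield (key, value) pairs from a mapping or an iterable of pairs."""
    if isinstance(items, Mapping):
        yield from items.items()
    else:
        for key, value in items:    # raises if an element is not a pair
            yield key, value

from_dict = list(pairwise_mapping({'key': 'value'}))
from_pairs = list(pairwise_mapping([('key', 'value')]))

print(from_dict)
print(from_pairs)
```

Both call sites get the same (name, obj) pairs, and anything that is neither a mapping nor an iterable of pairs fails with an exception instead of producing indices.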
I don't know of any iteration method that's oblivious to the difference between a dict and a list of pairs; can you offer (toy) usage examples?
You have to handle it yourself. The dict constructor, and update method, do that. dict.update's docstring says:

 |  update(...)
 |      D.update(E, **F) -> None.  Update D from dict/iterable E and F.
 |      If E has a .keys() method, does:     for k in E: D[k] = E[k]
 |      If E lacks .keys() method, does:     for (k, v) in E: D[k] = v
 |      In either case, this is followed by: for k in F: D[k] = F[k]

But notice that in your case, passing (key,value) doesn't give you the name=key, obj=value results you wanted. You get name=index, obj=(key,value) instead!

DWIM is fine and dandy when it guesses what you want correctly, but when it doesn't, it silently does the wrong thing instead of giving you an immediate exception.

-- Steve
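Both branches of that docstring, and the enumerate() outcome described above, can be checked with a short script. The `KeysOnly` class is a made-up minimal object that has keys() and __getitem__ but no items().

```python
class KeysOnly:
    """Illustrative object with keys() and __getitem__, but no items()."""

    def __init__(self, data):
        self._data = dict(data)

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        return self._data[key]

d = {}
d.update(KeysOnly({'a': 1}))    # has keys(): D[k] = E[k]
d.update([('b', 2)])            # lacks keys(): for (k, v) in E: D[k] = v
d.update(c=3)                   # keyword arguments are applied last

# The enumerate()-style fallback on a list of pairs gives index and tuple,
# not key and value.
fallback = list(enumerate([('key', 'value')]))

print(d)
print(fallback)
```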
On Tue, Jul 28, 2015 at 2:12 AM, Steven D'Aprano <steve@pearwood.info> wrote:
In effect, your suggestion makes the a:b syntax a "Do What I Mean" operation. It tries to guess whether you want to call expr.items() or enumerate(expr). Building DWIM into the language is probably not a good idea.
I see what you mean. Yes, there's no easy way to iterate over either type, but that isn't the point. What my suggestion was positing was not so much DWIM as "iterate over the keys and values of anything". A mapping type has a concept of keys and values; an indexable sequence (list, tuple, etc) uses sequential numbers as indices and its members as values. (A set might choose to iterate this way by calling its members the indices, and using a fixed True as the value every time.)

This would create a new iteration invariant. We currently have:

    for x in y:
        assert x in y

With this, we would have:

    for k:v in x:
        assert x[k] is v

And it should ideally raise an exception if this can't be done. (Which currently would be the case for sets, so my suggestion above would have to be accompanied by a set indexing definition that returns the same fixed value for anything that's in it - something like "def __getitem__(self, item): return item in self".)

Now, I'm still not saying this is a *good* idea. But I do think it's internally consistent. Note that a simple definition using enumerate() would violate the assertion, as you could use this to iterate over a non-sequence and get indices and values. I'm not sure whether it's better to promise a simple invariant (ie non-sequences should raise TypeError if used in this way), or to adopt the stance of practicality and permit this. Both make sense.

ChrisA
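One way to read the proposed invariant in today's Python is as a small helper function. This is only a sketch: iter_key_value and check_invariant are hypothetical names, and dispatching on collections.abc.Mapping and Sequence is an assumption about how "anything with keys or indices" might be decided. It takes the strict stance and raises TypeError for everything else.

```python
from collections.abc import Mapping, Sequence


def iter_key_value(obj):
    """Return an iterator of (k, v) pairs such that obj[k] is v.

    Hypothetical helper standing in for the proposed `for k:v in obj`.
    """
    if isinstance(obj, Mapping):
        return iter(obj.items())
    if isinstance(obj, Sequence):
        return enumerate(obj)
    raise TypeError(f"{type(obj).__name__} has no keys or indices")


def check_invariant(obj):
    """Return True if every yielded pair satisfies obj[k] is v."""
    return all(obj[k] is v for k, v in iter_key_value(obj))
```

With this policy a set or a generator is rejected up front, which matches the "simple invariant" option rather than the "practicality" one.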
On Tue, Jul 28, 2015 at 02:25:00AM +1000, Chris Angelico wrote:
What my suggestion was positing was not so much DWIM as "iterate over the keys and values of anything". A mapping type has a concept of keys and values; an indexable sequence (list, tuple, etc) uses sequential numbers as indices and its members as values. .............^^^^^^^
Keys and indices are not the same thing, and Python is not Lua.

While there are certain similarities between the indices of a sequence and the keys of a mapping, they really don't play the same role in any meaningful sense. Consider:

    x in mapping   # looks for a key x
    x in sequence  # looks for a value x, not an index ("key") x

Consider this too:

    mapping = {i:i**2 for i in range(1000)}
    sequence = [i**2 for i in range(1000)]

    for obj in (mapping, sequence):
        while 0 in obj.keys():
            del obj[0]
        assert 0 not in obj.keys()

The first problem is that lists don't have a keys() method. But that's okay, pretend that we've added one. Now your problems have only begun:

    len(mapping)   # returns 1000-1
    len(sequence)  # returns 0

Well that sucks. Deleting (or inserting) an item into a sequence potentially changes the "keys" of all the other items. What sort of a mapping does that?

Despite the apparent analogy of key <=> index, it's remarkably hard to think of any practical use for such a thing. I cannot think of any time I have wanted to, or might want to in the future, ducktype lists as dicts, with the indices treated as keys.

The closest I can come is to support Lua-like arrays implemented as tables (mappings). When you create an array in Lua, it's not actually an array like in Python or a linked-list like in Lisp, but a (hash) table where the keys are automatically set to 1, 2, ... n by the interpreter. But that's just sugar for convenience. Lua arrays are still tables, they merely emulate arrays. And besides, that's the opposite: treating keys as indices, not indices as keys.

Apart from practical problems such as the above, there's also a conceptual problem. Keys of a mapping are *intrinsic* properties of the mapping. But indices of a sequence are *extrinsic* properties. They aren't actually part of the sequence. Given the list [2,4,6] the "key" (actually index) 0 is not part of the list in any way. Some languages, like C and Python, treat those indices as starting from 0. Others treat them as starting from 1. Fortran and Pascal, if I remember correctly, let you index arrays over any contiguous range of integers, including negatives:

    foo = array[-20...20] of integer;

or something like that.

Conveniently the way we access keys and indices reflects this. Keys, being intrinsic to the mapping, is a method:

    mapping.keys()

while indices, being extrinsic, is a function which can be applied to any iterable, with any starting value:

    enumerate(sequence, 1)
    enumerate(mapping, -5)

[... snip proposal to treat sets {element} as {element:True} ...]
This would create a new iteration invariant. We currently have:
Why do we need this invariant? What does it gain us to be able to say

    myset[element]

and get True back, regardless of the value of element? Why not just say:

    True

We can invent any invariants we like, but if they're not useful, why add cruft to the language to support something that we aren't going to use?

-- Steve
On Tue, Jul 28, 2015 at 5:17 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Jul 28, 2015 at 02:25:00AM +1000, Chris Angelico wrote:
What my suggestion was positing was not so much DWIM as "iterate over the keys and values of anything". A mapping type has a concept of keys and values; an indexable sequence (list, tuple, etc) uses sequential numbers as indices and its members as values. .............^^^^^^^
Keys and indices are not the same thing, and Python is not Lua. ... Despite the apparent analogy of key <=> index, it's remarkably hard to think of any practical use for such a thing. I cannot think of any time I have wanted to, or might want to in the future, ducktype lists as dicts, with the indices treated as keys.
A namedtuple is completely different from a list, too. But you can iterate over both. A generator is utterly different again, and you can iterate over that the exact same way. Are you ducktyping namedtuples as lists, with their attributes in definition order? Or are all of the above simply special cases of "thing you can iterate over to get a series of values"? Many types have a concept of keys/indices and their associated values. Yes, you're right, removing an element from a list changes the indices of all those after it; but the same goes for any sort of mutation of an iterable. With a dict, adding a new key/value pair can change the order of all the others. Does the fact that a list would never dare do such a thing mean that you shouldn't iterate over dicts and lists using the same syntax? Clearly not, because we can already do precisely that. You already have to be careful of mutating the thing you're iterating over:
    >>> l=[1,2,3]
    >>> for x in l:
    ...     l.remove(x)
    ...     print(x)
    ...
    1
    3
    >>> l
    [2]
Apart from practical problems such as the above, there's also a conceptual problem. Keys of a mapping are *intrinsic* properties of the mapping. But indices of a sequence are *extrinsic* properties. They aren't actually part of the sequence. Given the list [2,4,6] the "key" (actually index) 0 is not part of the list in any way.
Not sure the significance of this; whatever the indices are, they do exist. There is a canonical index for the value 2, and it can be determined by the aptly-named index() method:
    >>> [2,4,6].index(2)
    0
If you were to iterate over that list in some way which pairs indices and values, it would give index 0 with value 2, index 1 with value 4, index 2 with value 6, and StopIteration. This is the behaviour of enumerate(), and nobody has ever complained that this is a bad way to work with list indices.
Conveniently the way we access keys and indices reflects this. Keys, being intrinsic to the mapping, is a method:
mapping.keys()
while indices, being extrinsic, is a function which can be applied to any iterable, with any starting value:
    enumerate(sequence, 1)
    enumerate(mapping, -5)
Not sure the point of this distinction, especially given that the starting value has to be 0 if the indexing into the original sequence is to work.
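The remark about the starting value can be checked directly. In this small illustration (names invented for the example), only the default start of 0 produces numbers that index back into the sequence; with any other start the numbers are labels that have to be shifted before they can be used as indices.

```python
seq = [2, 4, 6]

# With the default start, each number is a valid index for its item.
zero_based = list(enumerate(seq))
round_trips = all(seq[i] == item for i, item in zero_based)

# With start=1 the numbers are offset by one from the real indices,
# so the lookup has to subtract the start value again.
one_based = list(enumerate(seq, 1))
shifted = [(i, seq[i - 1]) for i, _ in one_based]
```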
[... snip proposal to treat sets {element} as {element:True} ...]
This would create a new iteration invariant. We currently have:
Why do we need this invariant? What does it gain us to be able to say
myset[element]
and get True back, regardless of the value of element? Why not just say:
True
We can invent any invariants we like, but if they're not useful, why add cruft to the language to support something that we aren't going to use?
The invariant is nothing to do with treating {element} as {element:True}, that was just an example of how different types could viably respond to this kind of protocol. The invariant comes from the definition of index-value iteration, which is that iterable[index] is value. ChrisA
On Tue, Jul 28, 2015 at 05:44:13AM +1000, Chris Angelico wrote:
On Tue, Jul 28, 2015 at 5:17 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Jul 28, 2015 at 02:25:00AM +1000, Chris Angelico wrote:
What my suggestion was positing was not so much DWIM as "iterate over the keys and values of anything". A mapping type has a concept of keys and values; an indexable sequence (list, tuple, etc) uses sequential numbers as indices and its members as values. .............^^^^^^^
Keys and indices are not the same thing, and Python is not Lua. ... Despite the apparent analogy of key <=> index, it's remarkably hard to think of any practical use for such a thing. I cannot think of any time I have wanted to, or might want to in the future, ducktype lists as dicts, with the indices treated as keys.
A namedtuple is completely different from a list, too. But you can iterate over both. [...]
Yes? What's your point? I fail to see how any of this is relevant to the analogy "indices of a sequence are mapping keys".

Bottom line:

Can you give a non-contrived, non-toy, practical example of where someone might want to seamlessly interchange (key,value) pairs from a mapping and (index,item) pairs from a sequence and expect to do something useful? Toy programming exercises like "print a table of key/index and value/item" aside:

    0  1.0
    1  2.0
    2  4.0
    3  16.0

That's a nice exercise for beginners, but doesn't justify new syntax.

As I point out with the example of Lua, you can get quite far with the analogy "consecutive integer mapping keys are like indices". It's the other way which is dubious: indices aren't like keys in general, and I don't think there are many, if any, use-cases for treating sequences (let alone sets, let alone arbitrary iterables) as if they were a special case of mapping.

But, you're proposing this. It shouldn't be up to me to prove that it's not useful. It should be up to you to prove that it is.
Many types have a concept of keys/indices and their associated values. Yes, you're right, removing an element from a list changes the indices of all those after it; but the same goes for any sort of mutation of an iterable. With a dict, adding a new key/value pair can change the order of all the others.
The order of a mapping is generally not part of its API. Your observation that adding an item to a dict may change the order of other items is not relevant. The point I am making is that deleting a key from a mapping doesn't change the *keys* of all the other items: del mapping[0] does not change the key 1 into 0, or key 2 into 1. But del sequence[0] does change the index of the items. Indices don't behave like keys!

If you want to unify (key,value) and (index,item) as special cases of the same kind of thing, then you need to

- justify how they can be the same when they behave so differently; and
- explain how this makes Python a better language.

-- Steve
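The difference between deleting a key and deleting an index can be seen in a few lines. This is a small illustration with made-up variable names, using four items instead of the thousand in the earlier example.

```python
mapping = {i: i ** 2 for i in range(4)}   # {0: 0, 1: 1, 2: 4, 3: 9}
sequence = [i ** 2 for i in range(4)]     # [0, 1, 4, 9]

del mapping[0]
del sequence[0]

# The dict still stores 9 under key 3; the list now stores 9 at index 2,
# because every remaining item moved down by one position.
mapping_keys = list(mapping)
value_at_3_in_mapping = mapping[3]
value_at_2_in_sequence = sequence[2]
```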
On Tue, Jul 28, 2015 at 2:05 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Bottom line:
Can you give a non-contrived, non-toy, practical example of where someone might want to seamlessly interchange (key,value) pairs from a mapping and (index,item) pairs from a sequence and expect to do something useful? Toy programming exercises like "print a table of key/index and value/item" aside:
    0  1.0
    1  2.0
    2  4.0
    3  16.0
That's a nice exercise for beginners, but doesn't justify new syntax.
The most common case where you need the keys as well as the values is when you're working with parallel structures. Here's one with lists:

    tags = ["p", "li", "div", "body"]
    weights = [50, 30, 60, 10]
    counts = [0]*4
    for idx, tag in enumerate(tags):
        if blob.find(tag) > weights[idx]:
            counts[idx] += 1

Yes, there are other ways you can structure this, but sometimes other considerations mean it's better to keep them separate and then iterate together. For read-only iteration you can of course zip() them together, but if you need to update something, that's a bit harder. In fact, that's probably a use-case as well, although I've never personally used it in real-world code:

    for idx, val in enumerate(some_list):
        if condition:
            some_list[idx] *= whatever

Now here's a dictionary-based equivalent:

    # Parallel iteration/mutation
    questions = {
        "color": "What color would you like your bikeshed to be?",
        "size": "How many bikes do you need to house?",
        "material": "Should the shed be made of metal, wood, or paper?",
        "location": "Whose backyard should we not build this in?",
    }
    defaults = {"color": "red", "size": "2",
                "material": "wood", "location": "City Hall"}
    answers = {}
    for kwd, msg in questions.items():
        response = input("%s [%s] " % (msg, defaults[kwd]))
        if response == "q": break # see, can't use a list comp here
        answers[kwd] = response or defaults[kwd]

You could think of this as a sequence (tuple or list), or as a keyword mapping. Both ways make reasonable sense, and either way, you need to know what the key/index is that you're working on.
But, you're proposing this. It shouldn't be up to me to prove that it's not useful. It should be up to you to prove that it is.
Well, I'm not pushing for this to be added to the language. I'm aiming much lower than that: merely that the idea is internally consistent, and satisfies the OP's need. I fully expect it to still be YAGNI rejected, but I believe it makes sense to ask the question, at least. ChrisA
On 7/27/2015 3:17 PM, Steven D'Aprano wrote:
On Tue, Jul 28, 2015 at 02:25:00AM +1000, Chris Angelico wrote:
What my suggestion was positing was not so much DWIM as "iterate over the keys and values of anything". A mapping type has a concept of keys and values; an indexable sequence (list, tuple, etc) uses sequential numbers as indices and its members as values. .............^^^^^^^
Keys and indices are not the same thing, and Python is not Lua.
Both sequences and dicts can be viewed and used as functions over a finite domain. This is pretty common. If one does, *then* the keys and indices serve the same role as inputs.
While there are certain similarities between the indices of a sequence and the keys of a mapping, they really don't play the same role in any meaningful sense.
If one uses lists or dicts to implement functions as sets with efficient access, then indices/keys both play the same role as inputs. In general, keys/indexes are both efficient means to retrieve objects. The access issue is precisely why we use dicts more than sets.
Consider:
    x in mapping   # looks for a key x
    x in sequence  # looks for a value x, not an index ("key") x
For a function, 'in' looks for an input/output pair, so both the above are wrong for this usage.
Consider this too:
    mapping = {i:i**2 for i in range(1000)}
    sequence = [i**2 for i in range(1000)]
Construct a function as 'set' by adding one pair at a time.
    for obj in (mapping, sequence):
        while 0 in obj.keys():
            del obj[0]
Deleting a pair from a function is a dubious operation.
assert 0 not in obj.keys()
The first problem is that lists don't have a keys() method. But that's okay, pretend that we've added one. Now your problems have only begun:
    len(mapping)   # returns 1000-1
    len(sequence)  # returns 0
Well that sucks.
Right. If deletion *is* allowed, then it must be limited to .pop() for list implementations of functions.
Despite the apparent analogy of key <=> index, it's remarkably hard to think of any practical use for such a thing. I cannot think of any time I have wanted to, or might want to in the future, ducktype lists as dicts, with the indices treated as keys.
Python already ducktypes list/dict and index/key by using the same subscript notation and corresponding special methods for both. Because of this, it is possible to write algorithms that work with both a list (or list of lists) and a dict or defaultdict or subset of either.
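As a rough illustration of that point, here is a function that relies on nothing but subscript notation, so it does not care whether each level is a list or a dict. The name get_path and the sample data are invented for this sketch.

```python
def get_path(container, path):
    """Follow a series of subscripts through nested containers.

    Hypothetical helper: it relies only on container[key], so lists,
    tuples, dicts and defaultdicts can be mixed freely.
    """
    current = container
    for step in path:
        current = current[step]
    return current


nested_lists = [[1, 2], [3, 4]]
nested_dicts = {"a": {"b": 4}}
mixed = {"rows": [{"name": "spam"}]}
```

For example, get_path(nested_lists, (1, 1)) and get_path(nested_dicts, ("a", "b")) both return 4, and get_path(mixed, ("rows", 0, "name")) walks through a dict, a list and another dict.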
Apart from practical problems such as the above, there's also a conceptual problem.
We seem to have different ideas of 'practical' versus 'conceptual'.
Keys of a mapping are *intrinsic* properties of the mapping.
To me, this is at least in part a practical implementation issue. In a function, keys are intrinsically part of pairs. For a dict in general, they are not necessarily intrinsic properties of the values. In a namespace, the names are arbitrary access tools and not necessarily unique. There is no sense in which the objects are functionally derived from the names.
But indices of a sequence are *extrinsic* properties.
Once the first item is given an index, the rest of the indexes follow. If the items are functionally derived from the indexes, then even the first index is not arbitrary.

In spite of everything above, I am pretty dubious about adding x:y as an iteration target: I have no problem with mapping.items() and enumerate(iterable).

-- Terry Jan Reedy
On Mon, Jul 27, 2015 at 05:48:21PM -0400, Terry Reedy wrote:
On 7/27/2015 3:17 PM, Steven D'Aprano wrote:
On Tue, Jul 28, 2015 at 02:25:00AM +1000, Chris Angelico wrote:
What my suggestion was positing was not so much DWIM as "iterate over the keys and values of anything". A mapping type has a concept of keys and values; an indexable sequence (list, tuple, etc) uses sequential numbers as indices and its members as values. .............^^^^^^^
Keys and indices are not the same thing, and Python is not Lua.
Both sequences and dicts can be viewed and used as functions over a finite domain. This is pretty common. If one does, *then* the keys and indices serve the same role as inputs.
If you are talking about the fact that both dict and list subscript notation spam[x] is, in some sense, equivalent to the mathematical concept of a function that maps a single argument to some value, x -> f(x), then I understand *what* you are saying, but not *why* it is relevant. In Python, neither sequences nor mappings have the same API as functions, or are considered to be the same type of object. [...]
Consider:
    x in mapping   # looks for a key x
    x in sequence  # looks for a value x, not an index ("key") x
For a function, 'in' looks for an input/output pair, so both the above are wrong for this usage.
For a function, `in` fails with TypeError:

    py> 42 in chr
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: argument of type 'builtin_function_or_method' is not iterable

I'm afraid the gist of your post and the connection between functions and mappings is too abstract for me to understand.

-- Steve
On Jul 27, 2015, at 21:17, Steven D'Aprano <steve@pearwood.info> wrote:
Keys and indices are not the same thing, and Python is not Lua.
It may be worth doing a survey of other languages to see how they handle this.

I think the most common thing is to have dict iteration yield (key, value) pairs, especially in languages with pattern matching or at least basic Python-style tuple decomposition. For example, in Swift, you write `for (key, val) in d {...}`.

Another common thing in more Java-ish languages is to yield special item objects, like C# KeyValuePair<TKey, TValue>, which you'd use as `foreach (var item in myDictionary) { spam(item.Key, item.Value); }`.

Some languages treat dictionaries as iterables of keys, like Python.

PHP does have something like this proposal: `foreach ($d as $k=>$v) {...}` vs. `foreach ($d as $k) {...}`. So does Go, although its syntax is `k, v` for a key-value pair vs. `k` for just the key (which obviously wouldn't work with Python-style tuple decomposition).

I vaguely remember Tcl having something relevant here but I can't remember what it was.

The only other language I can think of that does anything like allowing you to treat a list as a mapping from indices is JS (and its various offshoots), but their for loop is really treating everything as an object, iterating both keys and methods (since they're both the same thing), and in the case of an array you get the indices in arbitrary order, which is why documentation tells you that you probably don't want a for loop over an array. (They did add a foreach method to solve that, but it gives you key, index pairs for objects and value, index pairs for arrays.)

Anyway, if someone can think of a language that does what's being proposed here, it should be easier to find out whether users' experience with that feature is positive or negative.
On Tue, Jul 28, 2015 at 11:29 AM, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
On Jul 27, 2015, at 21:17, Steven D'Aprano <steve@pearwood.info> wrote:
Keys and indices are not the same thing, and Python is not Lua.
It may be worth doing a survey of other languages to see how they handle this.
Pike has two different forms of iteration:

    foreach (some_object, value)
    foreach (some_object; index; value)

The first form works on arrays and such - sequences. It's fundamentally the same thing as Python's existing iteration. The second form behaves the way I'm describing, and was the inspiration for it :) But Pike's iterables are a lot more restricted than Python's, so it's easier there.

ChrisA
On Mon, Jul 27, 2015 at 5:19 PM, Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Jul 27, 2015 at 12:12 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Being a special case, you can only use this for iterables that have an items() method. You can't do:
for k:v in [(1, 'a'), (2, 'b')]: ...
because the list doesn't have an items() method.
Here's a crazy alternative: Generalize it to subsume the common use of enumerate(). Iterate over a dict thus:
    for name:obj in globals():
        # do something with the key and/or value
And iterate over a list, generator, or any other simple linear iterable thus:
    for idx:val in sys.argv:
        # do something with the arg and its position
Keys and values are very different things than indices and items. Using the same syntax for retrieval from mappings and sequences is OK, but I don't see why other operations on them, and especially this one, would need to be similar. "Two-part iteration" is not the default/obvious way to loop over a list, so I don't think it should use special syntax. A method works just fine here.
On Mon, Jul 27, 2015 at 4:12 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Jul 26, 2015 at 06:09:17PM +0200, Petr Viktorin wrote:
Hello, Currently, the way to iterate over keys and values of a mapping is to call items() and iterate over the resulting view::
for key, value in a_dict.items(): print(key, value)
I believe that looping over all the data in a dict is a very important operation, and I find myself writing this quite often. Every time I do, it seems it's boilerplate;
What part looks like boilerplate? The "for"? The "key,value"? The "in"? The "a_dict"? If none of them are boilerplate, why would ".items()" be boilerplate?
Yes, the .items().

I got the courage to post here after a EuroPython talk, "Through the lens of Haskell", where we discussed that unlike other languages, where libraries can define new operators or even syntax for common operations, Python tends to standardize syntax for common operations, and ends up with a few pieces of syntax and a few common interfaces that similar objects then implement.

And so, Python ends up using punctuation for common cases, like:

    value = mapping[key]

and methods for

    value = mapping.get(key, default)

The first is the "obvious way" to do it; I can grok its meaning quickly just from the "shape" of the line. At a glance I can tell that "mapping" needs to be some container. In the second case something extra is going on, and parsing the word "get" needs a bit of extra cognitive overhead to alert me to this. I read that "mapping" needs to be an object with the "get" method.

Similarly, when I read:

    for key, value in mapping.items()

it looks like something "extra" is going on: it's a loop over tuples that contain the key and value. On the other hand, the proposed

    for key: value in mapping:

would read, to me, as looping over all data in a dict.

Of course there is a cost: new punctuation does need to be learned. Expressions like "{x: y for x in ...}" or "head, *tail = seq" or even "p = []" aren't obvious until you go through the Python 101. My assertion was that key-value looping is common enough (i.e. used in almost every nontrivial program), and the proposed syntax is close enough to similar uses of the colon (as a key-value separator), to justify every Python developer learning it.

Now I know several core devs disagree with that, which means Python will probably be better without it.

Thanks for the discussion, python-ideas!
Petr Viktorin <encukou@gmail.com> writes:
Currently, the way to iterate over keys and values of a mapping is to call items() and iterate over the resulting view::
for key, value in a_dict.items(): print(key, value)
I believe that looping over all the data in a dict is a very important operation, and I find myself writing this quite often. Every time I do, it seems it's boilerplate; it looks like a workaround rather than a preferred way of doing things.
I am sympathetic to this complaint. It does seem that mappings, for all their “obvious first choice” as a data structure, are more cumbersome to iterate through than other sequences.

I tend to write the above as::

    for (key, value) in a_dict.items():
        # ...

because it's easier to see that the items that come from the view are themselves two-item tuples which are then unpacked.
In dict comprehensions and literals, key-value pairs are separated by colons. How about allowing that in for loops as well?
for key: value in a_dict: print(key, value)
Hmm, that's a bit too easy to misread for my liking. A colon in the middle of a line, without clear parenthesis syntax nearby, looks too much like a single-line compound statement::

    if foo: bar

    while True: flonk

    for key: value in a_dict:

I would be only +0 on the above ‘for’ syntax, and would prefer that it remains a SyntaxError.

Analogous to what I described above for the tuple unpacking, how about this::

    for {key: value} in a_dict:
        # ...

That makes the correspondence with a mapping much less ambiguous, and it clearly marks the whole item which will be emitted by the iteration.

-- 
 \     “There's a certain part of the contented majority who love |
  `\     anybody who is worth a billion dollars.” —John Kenneth |
_o__)                                      Galbraith, 1992-05-23 |
Ben Finney
On Mon, Jul 27, 2015 at 6:23 AM, Ben Finney <ben+python@benfinney.id.au> wrote:
Petr Viktorin <encukou@gmail.com> writes: [...]
In dict comprehensions and literals, key-value pairs are separated by colons. How about allowing that in for loops as well?
for key: value in a_dict: print(key, value)
Hmm, that's a bit too easy to misread for my liking.
A colon in the middle of a line, without clear parenthesis syntax nearby, looks too much like a single-line compound statement::
    if foo: bar
    while True: flonk
    for key: value in a_dict:
I would be only +0 on the above ‘for’ syntax, and would prefer that it remains a SyntaxError.
Analogous to what I described above for the tuple unpacking, how about this::
    for {key: value} in a_dict:
        # ...
That makes the correspondence with a mapping much less ambiguous, and it clearly marks the whole item which will be emitted by the iteration.
On the other hand, parenthesizing it makes it look like an expression, that is, something that can be part of a larger expression. Key/value unpacking only works as a target of a "for".
2015-07-28 19:39 GMT+02:00 Petr Viktorin <encukou@gmail.com>:
On Mon, Jul 27, 2015 at 6:23 AM, Ben Finney <ben+python@benfinney.id.au> wrote:
On the other hand, parenthesizing it makes it look like an expression, that is, something that can be part of a larger expression. Key/value unpacking only works as a target of a "for".
If the proposal was accepted for "for k:v in iterable" then I suppose that "if k:v in iterable" would also be valid, meaning that for a dict, there is a pair (k, v) such that _dict[k] = v, and for a list that there is an index k such that _list[k] = v.

    for k:v in iterable:
        assert k:v in iterable
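The membership test described above can be approximated with an ordinary function. This is only a sketch of one possible reading: has_pair is a hypothetical name, and treating a failed lookup as "not present" is an assumption. Note that for lists a negative key would also match, because list subscripting accepts negative indices.

```python
def has_pair(container, key, value):
    """Emulate the hypothetical `key:value in container` test."""
    try:
        return container[key] == value
    except (LookupError, TypeError):
        # LookupError covers a missing dict key or an out-of-range index;
        # TypeError covers a key type the container cannot subscript with.
        return False
```

With this definition, has_pair({"a": 1}, "a", 1) and has_pair([10, 20], 1, 20) are both true, while a missing key or an out-of-range index gives False rather than an exception.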
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On 29 July 2015 at 16:22, Pierre Quentel <pierre.quentel@gmail.com> wrote:
If the proposal was accepted for "for k:v in iterable" then I suppose that "if k:v in iterable" would also be valid, meaning that for a dict, there is a pair (k, v) such that _dict[k] = v, and for a list that there is an index k such that _list[k] = v.
for k:v in iterable: assert k:v in iterable
This actually made me think of the quirky signatures of the dict constructor and dict.update, where it's possible to pass in either a mapping *or* an iterable of two-tuples: https://docs.python.org/3/library/stdtypes.html#dict.update

If we went with the assumption that this syntax, if added, used those semantics, then you could reliably build (assuming hashable values) a reverse lookup table as:

    reverse_lookup = {v:k for k:v in data_source}

At the moment, you have to restrict your input to mappings specifically:

    reverse_lookup = {v:k for k,v in data_source.items()}

Or an iterable of 2-tuples:

    reverse_lookup = {v:k for k,v in data_source}

Or use duck-typing:

    if hasattr(data_source, "items"):
        data_source = data_source.items()
    reverse_lookup = {v:k for k,v in data_source}

I'd still be -0 on such a proposal with dict.update iteration semantics (as much as I think it's neat, I don't think the practical benefit is there to justify it), but it does have the virtue of extracting a particular iteration pattern from an existing builtin type.

Regards,
Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
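The duck-typing variant can be wrapped in a function that runs today. This is a sketch, with reverse_lookup_table as a hypothetical name; it uses the hasattr check on "items" from the message above (dict.update itself looks for a .keys() method instead) and assumes the values are hashable.

```python
def reverse_lookup_table(data_source):
    """Build a value -> key table from a mapping or an iterable of pairs.

    Hypothetical helper using the hasattr() duck-typing check; it
    assumes every value is hashable.
    """
    if hasattr(data_source, "items"):
        data_source = data_source.items()
    return {v: k for k, v in data_source}


from_mapping = reverse_lookup_table({1: "one", 2: "two"})
from_pairs = reverse_lookup_table([(1, "one"), (2, "two")])
```

Both inputs give the same table, which is the behaviour the proposed `for k:v in data_source` form would provide without the explicit check.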
On 07/29/2015 05:10 AM, Nick Coghlan wrote:
On 29 July 2015 at 16:22, Pierre Quentel <pierre.quentel@gmail.com> wrote:
If the proposal was accepted for "for k:v in iterable" then I suppose that "if k:v in iterable" would also be valid, meaning that for a dict, there is a pair (k, v) such that _dict[k] = v, and for a list that there is an index k such that _list[k] = v.
for k:v in iterable: assert k:v in iterable
This actually made me think of the quirky signatures of the dict constructor and dict.update, where it's possible to pass in either a mapping *or* an iterable of two-tuples: https://docs.python.org/3/library/stdtypes.html#dict.update
If we went with the assumption that this syntax, if added, used those semantics, then you could reliably build (assuming hashable values) a reverse lookup table as:
reverse_lookup = {v:k for k:v in data_source}
At the moment, you have to restrict your input to mappings specifically:
reverse_lookup = {v:k for k,v in data_source.items()}
Or an iterable of 2-tuples:
reverse_lookup = {v:k for k,v in data_source}
I'm still wondering how it would work underneath and what other things it implies.

    kv = "red":6            # What would this do?
    k:v = ("red", 6)        # Would this work too?
    k, v = "red":6          # Or this?

Currently __contains__ on dictionaries only checks the keys. So can this be made to work?

    k:v in D                # Test for key:value pair.

Currently...

    D[k] == v

I wonder if k:v was to create a key_value object. This isn't that different than an operator/object that consumes other objects.

    bool = key:value(dict)  # KeyValue(key, value)(dict)

Or it could work like slice objects.

    def __contains__(self, other):
        if isinstance(other, KeyValue):
            ...
        ...

Or maybe have it as a special case on for loops only, but is that special case special enough?
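To make the slice-like idea a little more concrete, here is a sketch of how a key/value object could take part in `__contains__` today. Both `KeyValue` and `PairAwareDict` are hypothetical names invented for this illustration; nothing here is proposed syntax, and the `k:v` spelling itself is replaced by an explicit constructor call.

```python
class KeyValue:
    """Hypothetical stand-in for what `key:value` might evaluate to."""

    __slots__ = ("key", "value")

    def __init__(self, key, value):
        self.key = key
        self.value = value


class PairAwareDict(dict):
    """Hypothetical dict subclass whose `in` also understands KeyValue."""

    def __contains__(self, other):
        if isinstance(other, KeyValue):
            # Pair test: the key must exist and map to an equal value.
            return dict.__contains__(self, other.key) and self[other.key] == other.value
        # Plain key test, exactly as a normal dict does it.
        return dict.__contains__(self, other)


D = PairAwareDict(red=6)

key_present = "red" in D                    # ordinary key lookup
pair_present = KeyValue("red", 6) in D      # key and value both match
pair_mismatch = KeyValue("red", 7) in D     # key matches, value does not
existing_spelling = ("red", 6) in D.items() # what already works today
```

The last line is worth noting: the items view already supports a pair membership test, so the sketch mostly shows what an implicit version would have to do under the hood.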
Or use duck-typing:
    if hasattr(data_source, "items"):
        data_source = data_source.items()
    reverse_lookup = {v:k for k,v in data_source}
I'd still be -0 on such a proposal with dict.update iteration semantics (as much as I think it's neat, I don't think the practical benefit is there to justify it), but it does have the virtue of extracting a particular iteration pattern from an existing builtin type.
I'm -0.1, but only because I think it could create confusing cases like ... if this works, then why can't this... kind of things. Cheers, Ron
On 2015-07-29 07:22, Pierre Quentel wrote:
2015-07-28 19:39 GMT+02:00 Petr Viktorin <encukou@gmail.com>:
On Mon, Jul 27, 2015 at 6:23 AM, Ben Finney <ben+python@benfinney.id.au> wrote:
On the other hand, parenthesizing it makes it look like an expression, that is, something that can be part of a larger expression. Key/value unpacking only works as a target of a "for".
If the proposal was accepted for "for k:v in iterable" then I suppose that "if k:v in iterable" would also be valid, meaning that for a dict, there is a pair (k, v) such that _dict[k] = v, and for a list that there is an index k such that _list[k] = v.
"if k:v in iterable" is already valid syntax. It's "if k:", which is an if statement, followed by "v in iterable".
for k:v in iterable: assert k:v in iterable
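The point about "if k:v in iterable" already being valid can be checked with the standard `ast` module. This is only a small demonstration; the names `k`, `v` and `iterable` are never evaluated, since the code is parsed but not run.

```python
import ast

# Parse the statement without executing it.
tree = ast.parse("if k:v in iterable")
if_stmt = tree.body[0]
body_stmt = if_stmt.body[0]

# The statement is an `if` whose condition is just the name `k` ...
is_if = isinstance(if_stmt, ast.If)
condition_name = if_stmt.test.id

# ... and whose one-line body is the expression `v in iterable`.
body_is_membership_test = (
    isinstance(body_stmt, ast.Expr)
    and isinstance(body_stmt.value, ast.Compare)
    and isinstance(body_stmt.value.ops[0], ast.In)
)

# By contrast, the proposed loop header is a SyntaxError today.
try:
    ast.parse("for k:v in iterable: pass")
    for_parses = True
except SyntaxError:
    for_parses = False
```

So any new meaning for `k:v in iterable` inside an `if` would have to change how existing, legal code is parsed, which is not the case for the `for` form.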
Petr Viktorin writes:
Currently, the way to iterate over keys and values of a mapping is to call items() and iterate over the resulting view::
for key, value in a_dict.items(): print(key, value)
I believe that looping over all the data in a dict is a very important operation, and I find myself writing this quite often.
Sure, but the obvious syntax:

    for key, value in a_dict:

is already taken: it unpacks the key if it happens to be a tuple.

I've always idly wondered why iteration over a mapping was taken to be an iteration over keys rather than over items. Idling just a little bit faster, I wonder if this isn't a throwback to the days when sets were emulated by dictionaries with constant value (eg, None). I'm hard put to think of the last time I wanted to actually iterate over keys, doing something *other* than extracting the value.

However, given that the choice was made to iterate over keys rather than items, it doesn't bother me to put in explicit calls to .items or .values where needed.
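A short example of why the comma form is "already taken": iterating a dict yields its keys, so a two-name target unpacks each key rather than a key/value pair. The dict `points` is made up for this illustration.

```python
# A dict whose keys happen to be 2-tuples.
points = {(0, 1): "north", (1, 0): "east"}

# Iterating the dict yields keys, so each tuple key is unpacked.
unpacked_keys = [(key, value) for key, value in points]

# Iterating .items() yields (key, value) pairs instead.
unpacked_items = [(key, value) for key, value in points.items()]

# With keys that cannot be split in two, the same loop fails outright.
try:
    for key, value in {"abc": 1}:
        pass
    unpack_failed = False
except ValueError:
    unpack_failed = True
```

Here `unpacked_keys` holds the coordinates themselves, while `unpacked_items` pairs each coordinate with its label.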
In dict comprehensions and literals, key-value pairs are separated by colons. How about allowing that in for loops as well?
for key: value in a_dict: print(key, value)
This screams SyntaxError to me. Sure, I can figure out what's meant, but the cognitive burden would be large every time I saw it. More generally, YMMV but I don't see any real point in adding syntax for this. Steve
On 27 July 2015 at 20:21, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Petr Viktorin writes:
Currently, the way to iterate over keys and values of a mapping is to call items() and iterate over the resulting view::
for key, value in a_dict.items(): print(key, value)
I believe that looping over all the data in a dict is a very important operation, and I find myself writing this quite often.
Sure, but the obvious syntax:
for key, value in a_dict:
is already taken: it unpacks the key if it happens to be a tuple. I've always idly wondered why iteration over a mapping was taken to be an iteration over keys rather than over items. Idling just a little bit faster, I wonder if this isn't a throwback to the days when sets were emulated by dictionaries with constant value (eg, None). I'm hard put to think of the last time I wanted to actually iterate over keys, doing something *other* than extracting the value.
Looking up the original iterator PEP shows it was done to enforce the container invariant "for x in y: assert x in y". So that has_key() -> __contains__() change came first, and drove the subsequent selection of iterkeys() as the meaning of mapping iteration. At least, that's my reading of the dictionary iterator section in https://www.python.org/dev/peps/pep-0234/
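The container invariant mentioned here is easy to check directly. The sketch below uses a made-up dict and shows that the invariant holds for key iteration, would not hold if iterating a dict yielded items while `in` kept testing keys, and does hold for the items view taken on its own.

```python
a_dict = {"red": 6, "blue": 3}

# Iteration yields keys and `in` tests keys, so the invariant holds.
keys_invariant = all(x in a_dict for x in a_dict)

# If iteration yielded items, `in` (a key test) would reject them.
items_against_dict = all(item in a_dict for item in a_dict.items())

# The items view is itself a container that satisfies the invariant.
items_against_view = all(item in a_dict.items() for item in a_dict.items())
```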
In dict comprehensions and literals, key-value pairs are separated by colons. How about allowing that in for loops as well?
for key: value in a_dict: print(key, value)
This screams SyntaxError to me. Sure, I can figure out what's meant, but the cognitive burden would be large every time I saw it.
More generally, YMMV but I don't see any real point in adding syntax for this.
One point in favour is that many, many, years after ABCs were introduced at least in part to disambiguate the Sequence and Mapping APIs, we'd finally have a separate ducktyping protocol that was unique to mappings :)

However, overall, I have to come down in the "-1" camp as well. With dict comprehensions, the dict syntax changes the type of the object produced, and matches the syntax of normal dict displays. In this case, the colon is present without its surrounding curly braces, so the prompts to think "dictionary" aren't as strong as they are in the comprehension case. Embedding this novel iteration syntax in comprehensions would make that confusion even worse.

Since there'd still be a method call under the hood, the new syntax also wouldn't offer a performance benefit over calling the items() method explicitly.

I actually quite liked the idea on a first impression, but it doesn't appear to hold up to closer scrutiny.

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan writes:
Looking up the original iterator PEP shows it was done to enforce the container invariant "for x in y: assert x in y".
I would argue that in the context of a given mapping, <key> and [<key>, <value>] are equivalent when <value> is <map>[<key>], so that we should have the signature __contains__(self, key, value=self[key]), which is a NameError, but the intent should be obvious.

The evident suggestion that we distinguish between tuples (which can be used as keys) and items by implementing the latter as lists (which can't, at least in the case of dictionaries) seems fragile and icky, though, and unacceptable at this point since it's incompatible with the .items() iterator.
On 07/28/2015 05:46 AM, Stephen J. Turnbull wrote:
Nick Coghlan writes:
Looking up the original iterator PEP shows it was done to enforce the container invariant "for x in y: assert x in y".
I would argue that in the context of a given mapping, <key> and [<key>, <value>] are equivalent when <value> is <map>[<key>], so that we should have the signature __contains__(self, key, value=self[key]), which is a NameError, but the intent should be obvious.
If keys were actual objects, with a binding to a value, then I think they would be equivalent. But I don't think we can change dictionaries to use them.

Having a Key object might not be a bad addition on its own. It's a fundamental data object (like a Lisp pair) that may be useful for creating other data objects. It would be like a named tuple except the value is mutable. And direct comparisons are done on the immutable key, not the mutable value.

It may be possible to do...

    for key in a_set:
        assert key in a_set
        print(key.name, key.value)

And...

    for key in a_set:
        key.value = next(data)

vs

    for key in a_dict:
        a_dict[key] = next(data)

I think the set version is easier to read and understand, but the dict version is probably faster and more efficient.

If there was a syntax for defining a key...

    {name:value, ...}     Oops dict, not set. ;-)

    {name:=value, ...}    set of keys?

Cheers,
Ron
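For readers who want to try the Key-object idea, here is one possible sketch in current Python. The `Key` class is hypothetical and only follows the description above: hashing and equality use the immutable name, while the value stays mutable. The attribute names `name` and `value` come from the loops in the message; everything else is an assumption.

```python
class Key:
    """Hypothetical Key: compared and hashed by name, with a mutable value."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Key):
            return self.name == other.name
        return NotImplemented

    def __hash__(self):
        # Only the name takes part, so changing the value is safe in a set.
        return hash(self.name)


a_set = {Key("red"), Key("blue")}
data = iter([6, 3])

# Sorted by name so the assignment order is deterministic.
for key in sorted(a_set, key=lambda k: k.name):
    assert key in a_set
    key.value = next(data)

by_name = {key.name: key.value for key in a_set}
lookup_ignores_value = Key("red", 999) in a_set
```

One consequence of comparing on the name alone is that a Key with a different value still counts as "in" the set, which may or may not be the behaviour one wants from a pair test.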
participants (16)
- Andrew Barnert
- Ben Finney
- Chris Angelico
- Joonas Liik
- Lennart Regebro
- MRAB
- Neil Girdhar
- Nick Coghlan
- Petr Viktorin
- Pierre Quentel
- Rob Cliffe
- Ron Adam
- Stephen J. Turnbull
- Steven D'Aprano
- Sven R. Kunze
- Terry Reedy