[UPDATE] PEP 274, Dict Comprehensions

Attached is the latest version of PEP 274 for dictionary comprehensions, similar to list comprehensions. The first version of this PEP wasn't posted here, but several people saw the CVS checkin and sent in comments, which have now been incorporated.

Enjoy,
-Barry

-------------------- snip snip --------------------
PEP: 274
Title: Dict Comprehensions
Version: $Revision: 1.2 $
Last-Modified: $Date: 2001/10/29 18:46:59 $
Author: barry@zope.com (Barry A. Warsaw)
Status: Draft
Type: Standards Track
Created: 25-Oct-2001
Python-Version: 2.3
Post-History: 29-Oct-2001


Abstract

    PEP 202 introduces a syntactical extension to Python called the "list comprehension"[1]. This PEP proposes a similar syntactical extension called the "dictionary comprehension" or "dict comprehension" for short. You can use dict comprehensions in ways very similar to list comprehensions, except that they produce Python dictionary objects instead of list objects.


Proposed Solution

    Dict comprehensions are just like list comprehensions, except that you group the expression using curly braces instead of square braces. Also, the left part before the `for' keyword expresses both a key and a value, separated by a colon. (There is an optional part of this PEP that allows you to use a shortcut to express just the value.) The notation is specifically designed to remind you of list comprehensions as applied to dictionaries.


Rationale

    There are times when you have some data arranged as a sequence of length-2 sequences, and you want to turn that into a dictionary. In Python 2.2, the dictionary() constructor will take an optional keyword argument that indicates specifically to interpret a sequence of length-2 sequences as key/value pairs, and turn them into a dictionary.

    However, the act of turning some data into a sequence of length-2 sequences can be inconvenient or inefficient from a memory or performance standpoint. Also, for some common operations, such as turning a list of things into a set of things for quick duplicate removal or set inclusion tests, a better syntax can help code clarity.

    As with list comprehensions, an explicit for loop can always be used (and in fact was the only way to do it in earlier versions of Python). But as with list comprehensions, dict comprehensions can provide a more syntactically succinct idiom than the traditional for loop.


Examples

    >>> print {i : chr(65+i) for i in range(4)}
    {0 : 'A', 1 : 'B', 2 : 'C', 3 : 'D'}
    >>> print {k : v for k, v in someDict.iteritems()} == someDict.copy()
    1
    >>> print {x.lower() : 1 for x in list_of_email_addrs}
    {'barry@zope.com' : 1, 'barry@python.org' : 1, 'guido@python.org' : 1}
    >>> def invert(d):
    ...     return {v : k for k, v in d.iteritems()}
    ...
    >>> d = {0 : 'A', 1 : 'B', 2 : 'C', 3 : 'D'}
    >>> print invert(d)
    {'A' : 0, 'B' : 1, 'C' : 2, 'D' : 3}


Open Issues

    - There is one further shortcut we could adopt. Suppose we wanted to create a set of items, such as in the "list_of_email_addrs" example above. Here, we're simply taking the target of the for loop and turning that into the key for the dict comprehension. The assertion is that this would be a common idiom, so the shortcut below allows for an easy spelling of it, by allowing us to omit the "key :" part of the left hand clause:

        >>> print {1 for x in list_of_email_addrs}
        {'barry@zope.com' : 1, 'barry@python.org' : 1, 'guido@python.org' : 1}

      Or say we wanted to map email addresses to the MX record handling their mail:

        >>> print {mx_for_addr(x) for x in list_of_email_addrs}
        {'barry@zope.com' : 'mail.zope.com',
         'barry@python.org' : 'mail.python.org',
         'guido@python.org' : 'mail.python.org',
         }

      Questions: what about nested loops? Where does the key come from? The shortcut probably doesn't save much typing, and comes at the expense of legibility, so it's of dubious value.

    - Should nested for loops be allowed? The following example, taken from an earlier revision of this PEP, illustrates the problem:

        >>> print {k, v for k in range(4) for v in range(-4, 0, 1)}

      The intent of this example was to produce a mapping from a number to its negative, but this code doesn't work because -- as in list comprehensions -- the for loops are nested, not in parallel! So the value of this expression is actually

        {0: -1, 1: -1, 2: -1, 3: -1}

      which seems of dubious value. For symmetry with list comprehensions, perhaps this should be allowed, but it might be better to disallow this syntax.


Implementation

    The semantics of dictionary comprehensions can actually be modeled in stock Python 2.2, by passing a list comprehension to the builtin dictionary constructor:

        >>> dictionary([(i, chr(65+i)) for i in range(4)])

    This has two distinct disadvantages compared with the proposed syntax, though. First, it isn't as legible as a dict comprehension. Second, it forces the programmer to create an in-core list object first, which could be expensive.


References

    [1] PEP 202, List Comprehensions
        http://www.python.org/peps/pep-0202.html


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End:
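
For readers who want to try the Examples and Implementation sections of the PEP above, here is a minimal sketch. It is an illustration only and rests on some assumptions: it is written for a Python 3 interpreter (so `print` is a function and `.items()` stands in for `.iteritems()`), it uses the `dict` spelling rather than `dictionary`, and the address list is made-up sample data, not anything from the PEP.

```python
# Illustrative sketch only; assumes Python 3, not the 2.x syntax in the PEP.

# Key/value comprehension, as in the first PEP example.
letters = {i: chr(65 + i) for i in range(4)}

# The same mapping built from a list of length-2 tuples. This is the
# "stock" spelling from the Implementation section; it materializes
# the intermediate list before building the dict.
letters_from_pairs = dict([(i, chr(65 + i)) for i in range(4)])


def invert(d):
    """Return a new dict mapping each value of d back to its key."""
    return {v: k for k, v in d.items()}


# Hypothetical sample data standing in for list_of_email_addrs.
list_of_email_addrs = ["Barry@Zope.com", "barry@python.org", "GUIDO@python.org"]

# Explicit "key : 1" spelling used to collect lower-cased addresses.
seen = {addr.lower(): 1 for addr in list_of_email_addrs}

print(letters)
print(invert(letters))
print(sorted(seen))
```

Both spellings produce the same mapping; the difference the PEP points at is readability and the temporary list, not the result.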

Hm, I don't like this. I think it's confusing: you really have to think about whether {x for x in <whatever>} produces a dictionary whose keys are in <whatever> or one whose values are in <whatever>.
Don't you mean {k: v for ...}? The second range() also seems like you meant range(0, -4, -1).
Nested for loops can be useful when used properly:

    {(k, v): k+v for k in range(4) for v in range(4)}

Your example should have been expressed using:

    {k: v for k, v in zip(range(4), range(0, -4, -1))}

--Guido van Rossum (home page: http://www.python.org/~guido/)
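
The difference between nested and parallel loops in the reply above can be checked directly. The following is a small sketch, assuming a Python 3 interpreter; the variable names are invented for illustration, and the third expression reuses the loop bounds from the PEP's example with the `k: v` spelling substituted for `k, v`.

```python
# Illustrative sketch only; assumes Python 3.

# Nested loops: every (k, v) combination becomes its own key.
nested = {(k, v): k + v for k in range(4) for v in range(4)}

# Parallel iteration via zip(): each number is paired with its negative.
parallel = {k: v for k, v in zip(range(4), range(0, -4, -1))}

# The PEP's loop bounds with nested loops: for each k, later values of v
# overwrite earlier ones, so only the last v (-1) survives.
overwritten = {k: v for k in range(4) for v in range(-4, 0, 1)}

print(len(nested))
print(parallel)
print(overwritten)
```

With nested loops the key must capture both loop variables to keep every combination; otherwise only the last inner value remains for each key.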

[Barry]
This is implemented now, but in a different way. Suggested rewording:

    In Python 2.2, the dictionary() constructor accepts an argument that is a sequence of length-2 sequences, used as (key, value) pairs to initialize a new dictionary object.

BTW, and not meaning to hijack your PEP <wink>, should dict.update() accept such an argument too? I didn't add it because

    d.update(dictionary(such_an_argument))

seemed "almost good enough".

BTW2, are we going to rename "dictionary" to "dict" before 2.2b2? Before 2.2, "dict" was universally used on c.l.py to mean dictionary, and I'm at least +0 on adopting that for official 2.2 use.
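
The constructor behaviour and the update() workaround described above can be sketched as follows. This is an illustration under assumptions: it targets a Python 3 interpreter and uses the `dict` spelling proposed in this thread rather than `dictionary`; the sample data and variable names are invented.

```python
# Illustrative sketch only; assumes Python 3 and the "dict" spelling.

# Any sequence of length-2 sequences works: tuples and lists can be mixed.
pairs = [("a", 1), ["b", 2], ("c", 3)]

# The constructor treats each length-2 item as a (key, value) pair.
d = dict(pairs)

# The "almost good enough" workaround from the message above: wrap the
# pairs in the constructor before handing them to update().
target = {"a": 0, "z": 26}
target.update(dict(pairs))

print(d)
print(target)
```

Keys already present in the target are overwritten by the incoming pairs, and untouched keys are left alone.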

"TP" == Tim Peters <tim.one@home.com> writes:
    TP> BTW, and not meaning to hijack your PEP <wink>, should
    TP> dict.update() accept such an argument too? I didn't add it
    TP> because
    TP>     d.update(dictionary(such_an_argument))
    TP> seemed "almost good enough".

Agreed. But either way, I still think there is utility in a dict comprehension.

    TP> BTW2, are we going to rename "dictionary" to "dict" before
    TP> 2.2b2? Before 2.2, "dict" was universally used on c.l.py to
    TP> mean dictionary, and I'm at least +0 on adopting that for
    TP> official 2.2 use.

I wouldn't keep them both though. Use one or the other.

-Barry

Agreed; no need to add this to update().
Sounds good to me. Should we adopt "dict" as an alias and keep "dictionary" as the official name, or vice versa, or simply eradicate dictionary and introduce dict in 2.2b2? It's up to you to implement this, I have some other things I need to get done first. --Guido van Rossum (home page: http://www.python.org/~guido/)

[Tim]
[Guido]
s/dictionary/dict/g is my preference: TOOWTDI.
It's up to you to implement this, I have some other things I need to get done first.
That's fine -- I can't think of anything needed I don't know how to do with ease (heck, I even know where the dictionary() docstring lives <wink>), except for changing the descr tutorial accordingly on python.org.

[Tim]
s/dictionary/dict/g is my preference: TOOWTDI.
[Guido] OK, let's do that.
OK, I'll fix that one. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum <guido@python.org>:
I'd say eradicate "dictionary". There Should Only Be One Way To Spell It.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

+1 generally on dict comprehensions, but:
    >>> print {1 for x in list_of_email_addrs}
    {'barry@zope.com' : 1, 'barry@python.org' : 1, 'guido@python.org' : 1}

-1 on this bit. It's not at all clear what it should mean, and the saving over writing it out explicitly, i.e.

    {x:1 for x in list_of_email_addrs}

is vanishingly small.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

You're getting better at channeling me. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

participants (5)
- barry@zope.com
- Fred L. Drake, Jr.
- Greg Ewing
- Guido van Rossum
- Tim Peters