Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)
Hi All,

Occasionally I find myself wanting to unpack the values of a dictionary into local variables of a function. This most often occurs when marshalling values to/from some serialization format. For example:

    def do_stuff_from_json(json_dict):
        actual_dict = json.loads(json_dict)
        foo = actual_dict['foo']
        bar = actual_dict['bar']
        # Do stuff with foo and bar.

In the same spirit as allowing argument unpacking into tuples or lists, what I'd really like to be able to write is something like:

    def do_stuff_from_json(json_dict):
        # Assigns variables in the **values** of the lefthand side by doing
        # lookups of the corresponding keys in the result of the righthand
        # side expression.
        {'foo': foo, 'bar': bar} = json.loads(json_dict)

Nearly all the arguments in favor of tuple/list unpacking also apply to this construct. In particular:

1. It makes the code more self-documenting, in that the left side of the assignment looks more like the expected output of the right side.
2. The construct can be implemented more efficiently by the interpreter by using a dictionary analog of the UNPACK_SEQUENCE opcode (e.g. UNPACK_MAP).

An interesting question that falls out of this idea is whether/how we should handle nested structures. I'd expect the rule to be that something like:

    {'toplevel': {'key1': key1, 'key2': key2}} = value

would desugar into something equivalent to:

    TEMP = value['toplevel']
    key1 = TEMP['key1']
    key2 = TEMP['key2']
    del TEMP

while something like:

    {'toplevel': (x, y)} = value

would desugar into something like:

    (x, y) = value['toplevel']

At the bytecode level, I'd expect this to be implemented with a new instruction, analogous to the current UNPACK_SEQUENCE, which would pop N keys and a map from the stack, and push map[key] onto the stack for each popped key. We'd then recurse through the values left on the stack, storing them as we would store the sub-lvalues if they were in a standard assignment. Thus the code for something like:

    {'name': name, 'tuple': (x, y), 'dict': {'subkey': subvalue}} = values

would translate into the following "pseudo-bytecode":

    LOAD_NAME 'values'   # Push rvalue onto the stack.
    LOAD_CONST 'dict'    # Push top-level keys onto the stack.
    LOAD_CONST 'tuple'
    LOAD_CONST 'name'
    UNPACK_MAP 3         # Unpack keys. Pops values and all keys from the stack.
                         # TOS  = values['name']
                         # TOS1 = values['tuple']
                         # TOS2 = values['dict']
    STORE_FAST name      # Terminal names are simply stored.
    UNPACK_SEQUENCE 2    # Push the two entries in values['tuple'] onto the stack.
                         # TOS  = values['tuple'][0]
                         # TOS1 = values['tuple'][1]
                         # TOS2 = values['dict']
    STORE_FAST x
    STORE_FAST y
    LOAD_CONST 'subkey'  # TOS  = 'subkey'
                         # TOS1 = values['dict']
    UNPACK_MAP 1         # TOS = values['dict']['subkey']
    STORE_FAST subvalue

I'd be curious to hear others' thoughts on whether this seems like a reasonable idea. One open question is whether non-literals should be allowed as keys (the above still works as expected if the keys are allowed to be names or expressions; the LOAD_CONSTs would turn into whatever expression or LOAD_* is necessary to put the needed value on the stack). Another question is if/how we should handle extra keys in the right-hand side of the assignment (my guess is that we shouldn't do anything special in that case).

-Scott

P.S. I attempted to post this last night, but it seems to have not gone through. Apologies for the double post if I'm mistaken about that.
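P.P.S. For comparison, the closest spelling of the flat case that I know of in today's Python goes through operator.itemgetter (just a baseline, not part of the proposal; it doesn't extend to nested structures):

    from operator import itemgetter

    actual_dict = {'foo': 1, 'bar': 2}
    foo, bar = itemgetter('foo', 'bar')(actual_dict)
    # foo == 1, bar == 2; itemgetter with several keys returns a tuple.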
On Wed, Aug 12, 2015, at 09:57, Scott Sanderson wrote:
    def do_stuff_from_json(json_dict):
        # Assigns variables in the **values** of the lefthand side by doing
        # lookups of the corresponding keys in the result of the righthand
        # side expression.
        {'foo': foo, 'bar': bar} = json.loads(json_dict)
How about:

    key = 'foo'
    key2 = 'bar'
    {key: value, key2: value2, **rest} = json.loads(json_dict)
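Spelled out with today's Python, I'd expect that to behave roughly like this (a sketch, assuming **rest captures whatever keys are left over):

    d = dict(json.loads(json_dict))  # copy, so we can consume it
    value = d.pop(key)               # d.pop('foo')
    value2 = d.pop(key2)             # d.pop('bar')
    rest = d                         # the remaining items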
From a language design standpoint I think that having non-constant keys in the unpack map makes a lot of sense. As far as implementation goes, I would imagine that using non-constant expressions for the keys should be fine. If you look at the proposed implementation, the UNPACK_MAP instruction just wants N key values on the stack; it shouldn't matter how they got there.
On Wed, Aug 12, 2015 at 11:41 AM, <random832@fastmail.us> wrote:
On Wed, Aug 12, 2015, at 09:57, Scott Sanderson wrote:
    def do_stuff_from_json(json_dict):
        # Assigns variables in the **values** of the lefthand side by doing
        # lookups of the corresponding keys in the result of the righthand
        # side expression.
        {'foo': foo, 'bar': bar} = json.loads(json_dict)
How about:
    key = 'foo'
    key2 = 'bar'
    {key: value, key2: value2, **rest} = json.loads(json_dict)
On Wed, Aug 12, 2015 at 11:46:10AM -0400, Joseph Jevnik wrote:
From a language design standpoint I think that having non-constant keys in the unpack map makes a lot of sense.
    mydict = {'spam': 1, 'eggs': 2}
    spam = 'eggs'
    eggs = 99
    {spam: spam} = mydict
    print(spam, eggs)

What gets printed? I can only guess that you want it to print

    eggs 1

rather than

    1 99

but I can't be sure. I am reasonably sure that whatever you pick, it will surprise some people. It will also play havoc with CPython's local variable optimization, since the compiler cannot tell what the name of the local will be:

    def func():
        mydict = dict(foo=1, bar=2, baz=3)
        spam = random.choice(['foo', 'bar', 'baz'])
        {spam: spam} = mydict
        # which locals exist at this point?

-- Steve
Steven D'Aprano wrote:
On Wed, Aug 12, 2015 at 11:46:10AM -0400, Joseph Jevnik wrote:
From a language design standpoint I think that having non-constant keys in the unpack map makes a lot of sense.
    mydict = {'spam': 1, 'eggs': 2}
    spam = 'eggs'
    eggs = 99
    {spam: spam} = mydict
    print(spam, eggs)
What gets printed? I can only guess that you want it to print
    eggs 1
rather than
    1 99
Why? Replacing bound variables with their literal values, {spam: spam} is equal to {'eggs': spam}, and mydict is equal to {'spam': 1, 'eggs': 2}, so the original assignment

    {spam: spam} = mydict

is equivalent to writing

    {'eggs': spam} = {'spam': 1, 'eggs': 2}

This form of desugaring roughly wants to be read as "write into the variable spam the value looked up in the {'spam': 1, 'eggs': 2} dict with the key 'eggs'", or

    spam = {'spam': 1, 'eggs': 2}['eggs']  # i.e. spam = 2

The 'variable' eggs is not touched at all in this assignment, so print(spam, eggs) prints `2 99`.
but I can't be sure. I am reasonably sure that whatever you pick, it will surprise some people. It will also play havoc with CPython's local variable optimization, since the compiler cannot tell what the name of the local will be:
    def func():
        mydict = dict(foo=1, bar=2, baz=3)
        spam = random.choice(['foo', 'bar', 'baz'])
        {spam: spam} = mydict
        # which locals exist at this point?
The 'name' of the local is spam; the value is one of 1, 2 or 3. As far as I can see,

    {'x1': y1, 'x2': y2, 'x3': y3} = z

can be translated to

    y1 = z['x1']
    y2 = z['x2']
    y3 = z['x3']

-- By ZeD
On Wed, Aug 12, 2015 at 11:57 PM, Scott Sanderson <scott.b.sanderson90@gmail.com> wrote:
    LOAD_NAME 'values'   # Push rvalue onto the stack.
    LOAD_CONST 'dict'    # Push top-level keys onto the stack.
    LOAD_CONST 'tuple'
    LOAD_CONST 'name'
    UNPACK_MAP 3         # Unpack keys. Pops values and all keys from the stack.
                         # TOS  = values['name']
                         # TOS1 = values['tuple']
                         # TOS2 = values['dict']
    STORE_FAST name      # Terminal names are simply stored.
    UNPACK_SEQUENCE 2    # Push the two entries in values['tuple'] onto the stack.
                         # TOS  = values['tuple'][0]
                         # TOS1 = values['tuple'][1]
                         # TOS2 = values['dict']
    STORE_FAST x
    STORE_FAST y
This sounds reasonable in theory; is it going to have problems with the non-orderedness of dictionaries? With sequence unpacking, it's straightforward - you evaluate things in a known order, you iterate over the thing, you assign. In this case, you might end up with some bizarre stack manipulation needed to make this work. Inside that UNPACK_MAP opcode, arbitrary code could be executed (imagine if the RHS is not a dict per se, but an object with a __getitem__ method), so it'll need to be popping some things off and pushing others on, and presumably would need to know what goes where.

Unless, of course, this doesn't "pop" and "push", but does some sort of replacement. Suppose you load the keys first, and only once those are loaded, you load the rvalue - so the rvalue is on the top of the stack. "UNPACK_MAP 3" means this:

1) Pop the top item off the stack - it is the map we're working with.
2) Reach 3 items down in the stack. Take that item, subscript our map with it, and replace that stack entry with the result.
3) Reach 2 items down, subscript, replace. Repeat till we subscript with the top of the stack.

I've no idea how plausible that is, but it'd kinda work. It would also mean you could evaluate the keys in the order that they're shown in the dict display *and* assign to them in that order, which the current proposal doesn't do (it assigns in order, but evaluates in reverse order).

Stupid, unworkable idea?

ChrisA
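P.S. Modelling that replacement scheme in plain Python, to check my own logic (a hypothetical helper; the real thing would live in the eval loop):

    def unpack_map_replace(stack, n):
        mapping = stack.pop()            # 1) pop the map off the top
        for depth in range(n, 0, -1):    # 2)/3) reach down, subscript, replace
            i = len(stack) - depth
            stack[i] = mapping[stack[i]]

    stack = ['name', 'tuple', 'dict']                          # keys, pushed first
    stack.append({'name': 'n', 'tuple': (1, 2), 'dict': {}})   # rvalue on top
    unpack_map_replace(stack, 3)
    # stack is now ['n', (1, 2), {}], ready for the STORE/UNPACK steps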
If you look carefully at the way the stack is set up, we are not iterating over the map; instead we are executing a sequence of PyObject_GetItem calls in the execution of the opcode and then pushing the results back onto the stack. The order of the results is based on the order of the keys that were on the stack.
Steven, in your example you would get `2 99`. Also, you can always tell what the name of the local is; I think the bytecode example that Scott showed was a pretty clear implementation example. Also, the idea is not that the variable has the same name as the key, but that you can pull the values out of a mapping and name them. This means that the idea of updating the locals with the dict is not really the same idea. That also forces you to use all or none of the keys instead of allowing you to take a subset.
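For example, with today's Python you can already take just the keys you care about, which locals().update can't express (a sketch):

    d = {'a': 1, 'b': 2, 'c': 3}
    a, c = (d[k] for k in ('a', 'c'))  # only 'a' and 'c' are bound; 'b' is ignored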
This sounds reasonable in theory; is it going to have problems with the non-orderedness of dictionaries?
You can make this deterministic by always iterating over the LHS keys in declaration order. Expressed another way, dictionary **literals** can be ordered, even if dictionaries themselves are not ordered at runtime. We only need to have a well-defined order when generating opcodes at compile time. I think you were getting at this same idea with your proposal as well.

Unless, of course, this doesn't "pop" and "push", but does some sort
of replacement. Suppose you load the keys first, and only once those are loaded, you load the rvalue - so the rvalue is on the top of the stack. "UNPACK_MAP 3" means this:
I think you could make this work with either stack ordering; the compiler would be generating both the UNPACK_* calls and the LOAD_* calls all at once, so it could decide to order them however the interpreter found it most convenient to work with. I think there are already opcodes that operate on the top N elements of the stack, so whether we're actually doing true pushes and pops is just an implementation detail.

I do agree that the accesses should happen in the order that the keys appear in the LHS, and I'd expect nested structures to be traversed depth-first, which would matter if the same leaf name appeared in multiple places. This would be analogous to the fact that a, a = (1, 2) results in the value of a being 2.

-Scott
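P.S. The tuple-unpacking analogy, for the record:

    >>> a, a = (1, 2)
    >>> a
    2

Under a depth-first, left-to-right traversal, a repeated leaf name in the dict form would likewise end up bound to whichever value was assigned last.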
On Wed, Aug 12, 2015 at 09:57:03AM -0400, Scott Sanderson wrote:
Hi All,
Occasionally I find myself wanting to unpack the values of a dictionary into local variables of a function. This most often occurs when marshalling values to/from some serialization format.
I think that anything that is only needed "occasionally" doesn't have a strong claim to deserve syntax.
For example:
    def do_stuff_from_json(json_dict):
        actual_dict = json.loads(json_dict)
        foo = actual_dict['foo']
        bar = actual_dict['bar']
        # Do stuff with foo and bar.
Seems reasonable and not too much of a burden to me. If I needed a lot of keys, I'd do:

    # sequence unpacking version
    spam, eggs, cheese, foo, bar, baz = [
        actual_dict[key] for key in "spam eggs cheese foo bar baz".split()
    ]
In the same spirit as allowing argument unpacking into tuples or lists, what I'd really like to be able to write is something like:
    def do_stuff_from_json(json_dict):
        # Assigns variables in the **values** of the lefthand side by doing
        # lookups of the corresponding keys in the result of the righthand
        # side expression.
        {'foo': foo, 'bar': bar} = json.loads(json_dict)
I think the sequence unpacking version above reads much better than this hypothetical dict unpacking version:

    {'foo': foo, 'bar': bar, 'baz': baz, 'spam': spam,
     'eggs': eggs, 'cheese': cheese} = json.loads(json_dict)

Both are roughly as verbose, both have a little duplication, but the sequence unpacking version requires far fewer quotation marks and other punctuation. I also think it's much more readable, and of course the big advantage of it is that it works right now; you don't have to wait two or three years to start using it in production.

If there is a downside to the sequence unpacking version, it is that it requires a temporary variable actual_dict, but that's not a real problem.

I don't think dict unpacking is needed when you have only two or three variables, and I don't think your suggested syntax is readable when you have many variables. So I would be -1 on this suggestion.

However, if you wanted to think outside the box, it's a pity that locals() is not writable. If it were, we could do:

    locals().update(json.loads(json_dict))

although of course that might update too many local names. So, just throwing it out there for discussion:

- Make locals() writable. If the compiler detects that locals() may be written to, that will have to disable the fast local variable access for that specific function.

More practical, and in the spirit of tuple unpacking:

    spam, eggs, cheese = **expression

being equivalent to:

    _tmp = expression
    spam = _tmp['spam']
    eggs = _tmp['eggs']
    cheese = _tmp['cheese']
    del _tmp

except that _tmp is never actually created/deleted. This is easier to write and simpler to read, and doesn't allow nested unpacking. (I consider that last point to be a positive feature, not a lack.)

-- Steve
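P.S. To be clear about why writable locals() would need compiler support: in current CPython, updating the dict returned by locals() inside a function simply has no effect on the function's variables:

    def f():
        locals().update({'spam': 1})
        return spam   # NameError: name 'spam' is not defined

    f()  # the update never created a local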
I think that anything that is only needed "occasionally" doesn't have a strong claim to deserve syntax.
I suppose I might have worded this more strongly. I meant "occasionally" to be interpreted as "often enough that I've been irked by not having a cleaner way to express this construct". I do appreciate the fact that syntax extensions have a real cost, and that they should be reserved for cases where the additional clarity and/or economy of expression outweighs the cost of implementation, maintenance, and (perhaps most importantly) teaching the language to others.

I think the sequence unpacking version above reads much better than
this hypothetical dict unpacking version:

    {'foo': foo, 'bar': bar, 'baz': baz, 'spam': spam,
     'eggs': eggs, 'cheese': cheese} = json.loads(json_dict)
As with many things in Python, I think that how you format this expression makes a big difference. I'd write it like this:

    {
        'foo': foo,
        'bar': bar,
        'baz': baz,
        'spam': spam,
        'eggs': eggs,
        'cheese': cheese,
    } = json.loads(json_dict)

I prefer this to your example of unpacking from a list comprehension because I think it does a better job of expressing to a reader the expected structure of the input data. It's also much easier to modify this to extract nested values, a la:

    {
        'foo': foo,
        'bar': bar,
        'baz': (baz_x, baz_y),
        'spam': spam,
        'eggs': eggs,
        'cheese': cheese,
    } = json.loads(json_dict)

However, if you wanted to think outside the box, it's a pity that locals()
is not writable. If it were, we could do:

    locals().update(json.loads(json_dict))
locals() is already writable in certain contexts, most notably in class bodies. This works fine, for example:

    In [1]: class Foo(object):
       ...:     locals().update({'foo': lambda self: 3})
       ...:

    In [2]: Foo().foo()
    Out[2]: 3

locals() is not writable, as you point out, in function bodies. However, I'm not sure that having a mutable locals is a good solution to this problem. As mentioned in the original post, I most often want to do this in contexts where I'm unpacking serialized data, in which case it's probably not a great idea to have that data trample your namespace with no restrictions.

More practical, and in the spirit of tuple unpacking:
    spam, eggs, cheese = **expression
I like how concise this syntax is. I'd be sad that it doesn't allow unpacking of nested expressions, though I think we disagree on whether that's actually an issue. A more substantial objection might be that this could only work on mapping objects with strings for keys.

- Scott
On Aug 12, 2015, at 11:44, Scott Sanderson <scoutoss@gmail.com> wrote:
I think that anything that is only needed "occasionally" doesn't have a strong claim to deserve syntax.
I suppose I might have worded this more strongly. I meant "occasionally" to be interpreted as "often enough that I've been irked by not having a cleaner way to express this construct"
Personally, I've been irked by not having a way to express generalized pattern matching more often than I've been irked by the fact that the limited pattern matching doesn't include dicts (to the point that my reasonably thorough but not seriously proposed idea for pattern matching didn't even include the obvious way to fit dicts into the system, and I didn't notice until someone else pointed it out). I don't know if that's because we're writing different code, or if I spend more time coming back to Python code with my brain still halfway in another language, or just what we find natural... Personally, I'd still rather have full pattern matching (including a protocol roughly akin to copy/pickle to let arbitrary types participate in matching), but I can see why others might find the special case more useful than the general one.

As for nested dict assignment (or nested dicts and tuples), my first reaction was that you're building something very complicated but still very limited if it can handle fully general nesting of mappings and sequences but can't handle any other kind of containment. But then I realized that the exact same thing is true of JSON, and that's turned out to be pretty useful. When I use YAML, I make lots of use of things like having datetimes as a native type, but rarely use other kinds of containers (I think I've used a multidict extension once...). So maybe my gut reaction here is wrong.
locals() is not writable, as you point out, in function bodies. However, I'm not sure that having a mutable locals is a good solution to this problem. As mentioned in the original post, I most often want to do this in contexts where I'm unpacking serialized data, in which case it's probably not a great idea to have that data trample your namespace with no restrictions.
What about exposing LocalsToFast to the language? Then, in the rare cases where you do want to mutate locals, you make it explicit--and it's also much more obvious that you have to name the variables somewhere and that you're potentially pessimizing the code.
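Something close to this is already possible as a CPython-specific hack via ctypes and the undocumented C API (a sketch only; it depends on interpreter internals, and PEP 667 reworks this area in CPython 3.13):

    import ctypes
    import sys

    def update_fast_locals(mapping):
        # Copy `mapping` into the caller's local variables (CPython-only hack).
        frame = sys._getframe(1)
        frame.f_locals.update(mapping)
        # Undocumented: write the f_locals snapshot back to the fast-locals array.
        ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame),
                                              ctypes.c_int(0))

    def demo():
        foo = None  # the name must already exist as a local for this to work
        update_fast_locals({'foo': 42})
        print(foo)  # 42, on CPython versions where the hack applies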
More practical, and in the spirit of tuple unpacking:

    spam, eggs, cheese = **expression
I like how concise this syntax is. I'd be sad that it doesn't allow unpacking of nested expressions, though I think we disagree on whether that's actually an issue. A more substantial objection might be that this could only work on mapping objects with strings for keys.
The same substantial objection applies to the existing uses of **, both in passing a dict as keyword arguments and in capturing unbound keyword arguments. So, for example, you can't really pass a dict through a function call by passing and accepting **kw, and c = dict(a, **b) doesn't really merge two dicts--and yet it's still useful for that purpose in many cases.
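For example, keyword-argument unpacking already rejects non-string keys in CPython:

    def f(**kw):
        return kw

    f(**{'a': 1})     # fine: returns {'a': 1}
    f(**{1: 'one'})   # TypeError: keywords must be strings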
On Wed, Aug 12, 2015 at 11:44:05AM -0700, Scott Sanderson wrote:
I think the sequence unpacking version above reads much better than this hypothetical dict unpacking version:

    {'foo': foo, 'bar': bar, 'baz': baz, 'spam': spam,
     'eggs': eggs, 'cheese': cheese} = json.loads(json_dict)
As with many things in Python, I think that how you format this expression makes a big difference. I'd write it like this:
    {
        'foo': foo,
        'bar': bar,
        'baz': baz,
        'spam': spam,
        'eggs': eggs,
        'cheese': cheese,
    } = json.loads(json_dict)
That's still awfully verbose, and not much of a saving over:

    foo = d['foo']
    bar = d['bar']
    baz = d['baz']

etc. You save a little bit of typing, but not that much.
I prefer this to your example of unpacking from a list comprehension because I think it does a better job of expressing to a reader the expected structure of the input data.
I don't think it does. I think the above would be incomprehensible to somebody who hasn't learned the details of this. It looks like you are creating a dict, but not assigning the dict to anything. And where do the unquoted values foo, bar, etc. come from? They look like they should come from already existing local variables: {'foo': foo} as an expression (rather than an assignment target) requires an existing foo variable (otherwise you get a NameError).

So the behaviour has to be learned; it isn't something that the reader can extrapolate from other assignment syntax. It isn't obvious what this does:

    {foo: bar} = some_dict

because there's no one-to-one correspondence between assignment target and assignment name. With sequence unpacking, the targets are obvious:

    foo, bar, baz = ...

clearly has assignment targets foo, bar and baz. What else could they be? It's easy to extrapolate it from single assignment foo = ...

But with your syntax, you have keys and values, and it isn't clear what gets used for what. The dict display form doesn't look like any other assignment target; you have to learn it as a special case. A reader who hasn't learned the rules could be forgiven for guessing any of the following rules:

(1) create a variable foo from existing variable bar
(2) create a variable foo from some_dict['bar']
(3) create a variable with the name given by the value of foo, from some_dict['bar']
(4) create a variable bar from some_dict['foo']
(5) create a variable with the name given by the value of bar, from some_dict['foo']

and others. You could make that a bit more clear by requiring the keys to be quoted, so {foo: bar} = ... would be illegal, and you have to write {'foo': 'bar'}, but that's annoying. Or we could go the other way and not quote anything:

    {foo: bar} = d

could create variable foo from d['bar']. That's not bad looking, and avoids all the quote marks, but I don't think people would guess that's the behaviour. It still doesn't look like an assignment target. And the common case is still verbose:

    {foo: foo} = ...

What if we have expressions in there?

    {foo.upper() + 's': bar} = some_dict
    {foo: bar or baz} = some_dict

I would hope both of those are syntax errors! But maybe somebody will want them. At least, some people will expect them, because that sort of thing works in dict displays. You even hint at arbitrary values below, with a tuple (baz_x, baz_y).
It's also much easier to modify this to extract nested values, ala:
    {
        'foo': foo,
        'bar': bar,
        'baz': (baz_x, baz_y),
        'spam': spam,
        'eggs': eggs,
        'cheese': cheese,
    } = json.loads(json_dict)
So baz is a tuple of d['baz_x'], d['baz_y']? Does this mean you want to allow arbitrary expressions for the values?

    {'foo': func(foo or bar.upper() + "s") + baz} = d

If so, what are the scoping rules? Which of func, foo, bar and baz are looked up from the right-hand side dict, and which are taken from the current scope? I think allowing arbitrary expressions cannot work in any reasonable manner, but special-casing tuples (baz_x, baz_y) is too much of a special case.
More practical, and in the spirit of tuple unpacking:
    spam, eggs, cheese = **expression
I like how concise this syntax is. I'd be sad that it doesn't allow unpacking of nested expressions, though I think we disagree on whether that's actually an issue. A more substantial objection might be that this could only work on mapping objects with strings for keys.
Does that mean that you expect your syntax to support non-identifier key lookups?

    {'foo': 123, 'bar': 'x y', 'baz': None} = d

will look for keys 123 (or should that be '123'?), 'x y' and None (or possibly 'None')? If so, I think you've over-generalised from a fairly straightforward use-case: unpack key:values in a mapping to variables with the same name as the keys, and YAGNI applies. For cases where the keys are not the same as the variables, or you want to use non-identifier keys, just use the good old-fashioned form:

    variable = d['some non-identifier']

-- Steve
On Thu, Aug 13, 2015 at 12:41 PM, Steven D'Aprano <steve@pearwood.info> wrote:
What if we have expressions in there?
    {foo.upper() + 's': bar} = some_dict
    {foo: bar or baz} = some_dict
I would hope both of those are syntax errors! But maybe somebody will want them. At least, some people will expect them, because that sort of thing works in dict displays. You even hint at arbitrary values below, with a tuple (baz_x, baz_y).
It's also much easier to modify this to extract nested values, ala:
    {
        'foo': foo,
        'bar': bar,
        'baz': (baz_x, baz_y),
        'spam': spam,
        'eggs': eggs,
        'cheese': cheese,
    } = json.loads(json_dict)
So baz is a tuple of d['baz_x'], d['baz_y']?
Does this mean you want to allow arbitrary expressions for the values?
    {'foo': func(foo or bar.upper() + "s") + baz} = d
If so, what are the scoping rules? Which of func, foo, bar and baz are looked up from the right-hand side dict, and which are taken from the current scope?
I think allowing arbitrary expressions cannot work in any reasonable manner, but special casing tuples (baz_x, baz_y) is too much of a special case.
baz would be a multiple assignment target. The way I understand this, the keys are ordinary expressions, and the 'values' are assignment targets, and can be nested just as sequence unpacking can:
    >>> x = [1, 2, [3, 4], 5]
    >>> a, b, (c, d), e = x
    >>> a, b, c, d, e
    (1, 2, 3, 4, 5)
So 'baz': (baz_x, baz_y) would take d['baz'] and expect it to be a sequence of length 2. Arbitrary expressions in the values would be illogical, just as they are anywhere else:
    >>> foo or bar = 1
      File "<stdin>", line 1
    SyntaxError: can't assign to operator
Arbitrary expressions in the keys would make perfect sense, although I would hope they'd be rare. Whatever a key evaluates to, that would be retrieved from the source object, and the result assigned to the corresponding target. The idea's internally consistent. I'm not convinced it's particularly useful, but it does hold water.

ChrisA
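P.S. To make that concrete, here are the same semantics emulated with an ordinary helper function (a hypothetical `lookup`, just for illustration; the real proposal wouldn't need it):

    def lookup(mapping, *keys):
        # Keys are ordinary expressions, evaluated left to right; the results
        # feed ordinary (possibly nested) assignment targets.
        return tuple(mapping[k] for k in keys)

    d = {'name': 'n', 'point': (3, 4)}
    prefix = 'po'
    name, (x, y) = lookup(d, 'name', prefix + 'int')  # arbitrary key expressions
    # name == 'n', x == 3, y == 4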
participants (8)

- Andrew Barnert
- Chris Angelico
- Joseph Jevnik
- random832@fastmail.us
- Scott Sanderson
- Scott Sanderson
- Steven D'Aprano
- Vito De Tullio