Occasionally I find myself wanting to unpack the values of a dictionary into local variables of a function. This most often occurs when marshalling values to/from some serialization format.
For example:
import json

def do_stuff_from_json(json_dict):
    actual_dict = json.loads(json_dict)
    foo = actual_dict['foo']
    bar = actual_dict['bar']
    # Do stuff with foo and bar.
In the same spirit as allowing argument unpacking into tuples or lists, what I'd really like to be able to write is something like:
def do_stuff_from_json(json_dict):
    # Assigns variables in the **values** of the lefthand side by doing lookups
    # of the corresponding keys in the result of the righthand side expression.
    {'foo': foo, 'bar': bar} = json.loads(json_dict)
Nearly all the arguments in favor of tuple/list unpacking also apply to this construct. In particular:
1. It makes the code more self-documenting, in that the left side of the expression looks more like the expected output of the right side.
2. The construct can be implemented more efficiently by the interpreter by using a dictionary analog of the UNPACK_SEQUENCE opcode (e.g. UNPACK_MAP).
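For comparison, the closest spelling I'm aware of in current Python uses operator.itemgetter, which performs several key lookups at once and returns a tuple that ordinary tuple unpacking can consume. A minimal sketch (the JSON payload is made up for illustration):

```python
import json
from operator import itemgetter


def do_stuff_from_json(json_dict):
    # itemgetter('foo', 'bar') builds a callable that looks up both keys and
    # returns the results as a tuple, in the order the keys were given.
    foo, bar = itemgetter('foo', 'bar')(json.loads(json_dict))
    return foo, bar


# Hypothetical payload, for illustration only. The extra key is ignored.
result = do_stuff_from_json('{"foo": 1, "bar": 2, "baz": 3}')
```

This works, but the keys and the names they bind to end up in two separate lists that have to be kept in sync by hand, which is the readability cost the proposed syntax is meant to avoid.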
An interesting question that falls out of this idea is whether/how we should handle nested structures. I'd expect the rule to be that something like:
{'toplevel': {'key1': key1, 'key2': key2}} = value
would desugar into something equivalent to:
TEMP = value['toplevel']
key1 = TEMP['key1']
key2 = TEMP['key2']
del TEMP
while something like:
{'toplevel': (x, y)} = value
would desugar into something like:
(x, y) = value['toplevel']
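The desugaring rules above can be emulated in today's Python with a small recursive helper. This is only a sketch of the semantics I have in mind, not an implementation of the syntax: the bind function is hypothetical, and since target names can't appear as dictionary values in current Python, it models them as strings and returns the resulting bindings as a dict.

```python
def bind(pattern, value, bindings=None):
    """Return {target_name: value} for `pattern` matched against `value`.

    Hypothetical helper, not part of any library. Patterns are modeled as:
      str   -> a terminal target name
      dict  -> a map pattern, {key: sub-pattern}
      tuple -> a sequence pattern, one sub-pattern per element
    """
    if bindings is None:
        bindings = {}
    if isinstance(pattern, str):
        # Terminal names are simply stored.
        bindings[pattern] = value
    elif isinstance(pattern, dict):
        # Map pattern: look up each key, then recurse into the sub-pattern.
        # A missing key raises KeyError; extra keys in `value` are ignored.
        for key, sub in pattern.items():
            bind(sub, value[key], bindings)
    elif isinstance(pattern, tuple):
        # Sequence pattern: behaves like ordinary tuple unpacking.
        items = list(value)
        if len(items) != len(pattern):
            raise ValueError("expected %d values, got %d"
                             % (len(pattern), len(items)))
        for sub, item in zip(pattern, items):
            bind(sub, item, bindings)
    else:
        raise TypeError("unsupported pattern: %r" % (pattern,))
    return bindings


# {'toplevel': {'key1': key1, 'key2': key2}} = value
nested = bind({'toplevel': {'key1': 'key1', 'key2': 'key2'}},
              {'toplevel': {'key1': 1, 'key2': 2}, 'ignored': 0})

# {'toplevel': (x, y)} = value
mixed = bind({'toplevel': ('x', 'y')}, {'toplevel': [10, 20]})
```

Here nested ends up as {'key1': 1, 'key2': 2} and mixed as {'x': 10, 'y': 20}. Ignoring the extra 'ignored' key is an assumption that matches my guess about how extra keys should behave.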
At the bytecode level, I'd expect this to be implemented with a new instruction, analogous to the current UNPACK_SEQUENCE, which would pop N keys and a map from the stack, and push map[key] onto the stack for each popped key. We'd then recurse through the values left on the stack, storing them as we would store the sub-lvalues if they were in a standard assignment. Thus the code for something like:
{'name': name, 'tuple': (x, y), 'dict': {'subkey': subvalue}} = values
would translate into the following "pseudo-bytecode":
LOAD_NAME 'values'     # Push rvalue onto the stack.
LOAD_CONST 'dict'      # Push top-level keys onto the stack.
LOAD_CONST 'tuple'
LOAD_CONST 'name'
UNPACK_MAP 3           # Unpack keys. Pops values and all keys from the stack.
                       # TOS = values['name']
                       # TOS1 = values['tuple']
                       # TOS2 = values['dict']
STORE_FAST name        # Terminal names are simply stored.
UNPACK_SEQUENCE 2      # Push the two entries in values['tuple'] onto the stack.
                       # TOS = values['tuple'][0]
                       # TOS1 = values['tuple'][1]
                       # TOS2 = values['dict']
STORE_FAST x
STORE_FAST y
LOAD_CONST 'subkey'    # TOS = 'subkey'
                       # TOS1 = values['dict']
UNPACK_MAP 1           # TOS = values['dict']['subkey']
STORE_FAST subvalue
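To sanity-check the stack discipline in the listing above, here is a toy simulation using a Python list as the value stack. Everything in it is illustrative: unpack_map models the proposed (nonexistent) UNPACK_MAP instruction, unpack_sequence models the existing UNPACK_SEQUENCE, the frame dict stands in for fast locals, and the sample values are made up.

```python
def unpack_map(stack, count):
    """Model of the proposed UNPACK_MAP: pop `count` keys and the map
    beneath them, then push map[key] so that the value for the top-most
    key ends up on top of the stack."""
    keys = [stack.pop() for _ in range(count)]  # top-most key first
    mapping = stack.pop()
    for key in reversed(keys):
        stack.append(mapping[key])


def unpack_sequence(stack, count):
    """Model of UNPACK_SEQUENCE: pop a sequence and push its items
    right-to-left, so that the first item ends up on top."""
    items = list(stack.pop())
    if len(items) != count:
        raise ValueError("expected %d values, got %d" % (count, len(items)))
    stack.extend(reversed(items))


# Made-up input, for illustration only.
values = {'name': 'n', 'tuple': (1, 2), 'dict': {'subkey': 'sv'}}
frame = {}   # stands in for the function's fast locals
stack = []

stack.append(values)               # LOAD_NAME 'values'
stack.append('dict')               # LOAD_CONST 'dict'
stack.append('tuple')              # LOAD_CONST 'tuple'
stack.append('name')               # LOAD_CONST 'name'
unpack_map(stack, 3)               # UNPACK_MAP 3
frame['name'] = stack.pop()        # STORE_FAST name
unpack_sequence(stack, 2)          # UNPACK_SEQUENCE 2
frame['x'] = stack.pop()           # STORE_FAST x
frame['y'] = stack.pop()           # STORE_FAST y
stack.append('subkey')             # LOAD_CONST 'subkey'
unpack_map(stack, 1)               # UNPACK_MAP 1
frame['subvalue'] = stack.pop()    # STORE_FAST subvalue
```

After the last store the stack is empty and frame holds name, x, y and subvalue, which is the result I'd expect from the assignment statement itself.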
I'd be curious to hear others' thoughts on whether this seems like a reasonable idea. One open question is whether non-literals should be allowed as keys in the left-hand dictionary (the above still works as expected if the keys are allowed to be names or expressions; the LOAD_CONSTs would turn into whatever expression or LOAD_* is needed to put the key on the stack). Another question is whether and how we should handle extra keys on the right-hand side of the assignment (my guess is that we shouldn't do anything special in that case).
-Scott