[Python-ideas] Re: Proposal: Keyword Unpacking Shortcut [was Re: Keyword arguments self-assignment]

April 18, 2020

On Sat, Apr 18, 2020 at 12:42:10AM -0700, Andrew Barnert wrote:

> > Inside function calls, the syntax
> >
> >     **{identifier [, ...]}
> >
> > expands to a set of `identifier=identifier` argument bindings.
> > This will be legal anywhere inside a function call that keyword
> > unpacking would be legal.
>
> Which means that you can’t just learn ** unpacking as a single
> consistent thing that’s usable in multiple contexts with (almost)
> identical syntax and identical meaning, you have to learn that it has
> an additional syntax with a different meaning in just one specific
> context, calls, that’s not legal in the others.

Um, yes? I think.

I'm afraid your objection is unclear to me. Obviously this would be one 
more thing to learn, but if the benefit is large enough, it would be 
worthwhile.

It would also be true whether we spell it using the initial suggestion, 
or using mode-shift, or by adding a new way to create dicts:

    f(meta, dunder=, reverse=)
    f(meta, *, dunder, reverse)
    f(meta, **{:dunder, :reverse})
    f(meta, **{dunder, reverse})

I'm not really sure I understand your comment about dict unpacking 
being "usable in multiple contexts with (almost) identical syntax and 
identical meaning". Can you give some examples?

I know that dict unpacking works in function calls:

    f(**d)

and I know it doesn't work in assignments:

    a, b = **d  # What would this even mean?

or in list-displays, etc. It *does* work in dict-displays:

    d = {'a': None, **mapping}

but I either didn't know it, or had forgotten it, until I tested it just 
now. (It quite surprised me too.) Are there any other contexts where 
this would work?
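
For reference, here is a small check of where `**` unpacking is accepted in current Python (3.5 and later). It is only a sketch: the names `f`, `mapping`, `in_call` and `merged` are placeholders invented for illustration.

```python
def f(**kwargs):
    # Return the keyword arguments the call actually received.
    return kwargs

mapping = {"b": 1, "c": 2}

# Context 1: function calls.
in_call = f(a=None, **mapping)

# Context 2: dict displays.
merged = {"a": None, **mapping}

# Not accepted in a list display: the source does not even compile.
try:
    compile("[**mapping]", "<example>", "eval")
    list_display_ok = True
except SyntaxError:
    list_display_ok = False

print(in_call == merged)
print(list_display_ok)
```

As far as I can tell, calls and dict displays are the only two places where `**` unpacking is accepted today.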

There's probably no reason why this keyword shortcut couldn't be allowed 
in dict-displays too:

   d = {'a': None, **{b, c, d}}

or even as a new dict "literal":

   d = **{meta, dunder, reverse}

if there is call for it.

Personally, I would be conservative about allowing it in other contexts, 
as we can always add it later, but it's much harder to remove it if it 
were a mistake. This proposal is only about allowing it in a single 
context, function calls.

[...]

> Worse, this exact same syntax is a set display anywhere except in a **
> in a call.

You say "worse", I say "Better!"

It's a feature that this looks something like a set: you can read it 
as "unpack this set of identifiers as parameter:value arguments".

It's a feature that it uses the same `**` double star as dict unpacking: 
you can read it as unpacking a dict where the values are implied.

It is hardly unprecedented that things which look similar are not always 
identical, especially when dealing with something as basic as a 
sequence of terms in a comma-separated list:

    math, sys, functools, itertools, os

It's a tuple! Except when inside parentheses directly following an 
expression, or an import statement:

    import math, sys, functools, itertools, os
    obj.attribute.method(math, sys, functools, itertools, os)

> Not only is that another special case to learn about the
> differences between set and dict displays, it also means that if you
> naively copy and paste a subexpression from a call into somewhere else
> (say, to print the value of that dict), you don’t get what you wanted,
> or a syntax error, or even a runtime error, you get a perfectly valid
> but very different value.

If you naively copy and paste the curly bracket part:

    f(meta, **{dunder, reverse})
    print({dunder, reverse})

you get to see the values in a set. Is that such a problem that it 
should kill the syntax?
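
To make the copy-and-paste case concrete in today's Python: the pasted braces build a set of the values, while the call would receive name/value pairs. In this sketch the `dict(...)` line only simulates what the proposed `**{dunder, reverse}` is meant to pass. It is not the proposed syntax, and the values are made up.

```python
dunder = True
reverse = False

# What the naive paste produces today: a set holding the two values.
pasted = {dunder, reverse}

# What the proposed **{dunder, reverse} is meant to expand to, spelled
# out with syntax that already exists.
intended = dict(dunder=dunder, reverse=reverse)

print(type(pasted).__name__)    # set
print(type(intended).__name__)  # dict
```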

There's a limit to how naive a user we need to care about in the 
language. We don't have to care about preventing every possible user 
error.

> > On the other hand, plain keyword unpacking:
> >
> >     **textinfo
> >
> > is terse, but perhaps too terse. Neither the keys nor the values are
> > immediately visible. Instead, one must search the rest of the function
> > or module for the definition of `textinfo` to learn which parameters are
> > being filled in.
>
> You can easily put the dict right before the call, and when you don’t,
> it’s usually because there was a good reason.

Right. I'm not saying that dict unpacking is a usability disaster. I'm
just pointing out that it separates the parameters from where they are 
being used. Yes, it could be one line away, or it could be buried deeply 
a thousand lines away, imported from another module, which you don't 
have the source code to.

I intentionally gave a real (or at least, real-ish) example using a real 
function from the standard library, and real parameter names. Without 
looking in the docs, can you tell what parameters are supplied by the 
`**textinfo` unpacking? I know I can't, and I wrote the damn thing! (I 
had to check the function signature to remind me what they were.)

Given:

    Popen( ..., **textinfo, ...)
    Popen( ..., **{encoding, errors, text}, ...)

I think that the second one is clearly superior with respect to showing
the parameter names directly in the place where they are used, while the
first is clearly superior for brevity and terseness.

> And there are good reasons. Ideally you shouldn’t have any function
> calls that are so hairy that you want to refactor them, but the
> existence of libraries you can’t control that are too huge and
> unwieldy is the entire rationale here.

I wouldn't say the entire rationale.

For example, I have come across this a lot:

    def public_function(meta, reverse, private, dunder):
        do some pre-processing
        result = _private_function(
                     meta=meta, reverse=reverse, 
                     private=private, dunder=dunder
                     )
        do some post-processing
        return result

Changing the signature of `public_function` to take only a kwargs is not 
an option (for reasons I hope are obvious, but if not I'm happy to 
explain). Writing it like this instead just moves the pain without 
eliminating it:

        d = dict(meta=meta, reverse=reverse, 
                 private=private, dunder=dunder
                 )
        result = _private_function(**d)

So there's a genuine pain point here that regular keyword unpacking 
doesn't solve.
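
One workaround available today is a small helper that looks the names up in a namespace, sketched below. The helper name `pick` is hypothetical, not an existing API, and `_private_function` is a stand-in that just echoes its arguments. The approach swaps the repeated `name=name` pairs for string names that linters and refactoring tools cannot follow, so it is shown only to illustrate the gap.

```python
def pick(namespace, *names):
    """Build {name: value} for the given names from a namespace mapping."""
    return {name: namespace[name] for name in names}

def _private_function(meta, reverse, private, dunder):
    # Stand-in for the real work: echo the arguments back.
    return (meta, reverse, private, dunder)

def public_function(meta, reverse, private, dunder):
    # Same effect as meta=meta, reverse=reverse, private=private, dunder=dunder
    kwargs = pick(locals(), "meta", "reverse", "private", "dunder")
    return _private_function(**kwargs)

print(public_function(1, 2, 3, 4))
```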

[...]

> > Backwards compatibility
> > -----------------------
> >
> > The syntax is not currently legal so there are no backwards
> > compatibility concerns.
>
> The syntax is perfectly legal today.

Okay, I misspoke. Miswrote. It's not currently legal to use a set in
dict unpacking.

I will reiterate that this proposal does not construct a set. It just
looks a bit like a set, in the same way that all of these have things
which look a bit like tuples but aren't:

    [a, b, c]
    func(a, b, c)
    import sys, os, collections
    except ValueError, TypeError, ImportError

and the same way that subscripting looks a bit like a list:

    mydict['key']  # Not actually a list ['key']

[...]

> Running Python 3.9 code in 3.8 would do the wrong thing, but maybe not
> wrong enough to break your program visibly, which could lead to some
> fun debugging sessions. That’s not a dealbreaker, but it’s definitely
> better for new syntax to raise a syntax error in old versions, if
> possible.

I don't think so. You would get a TypeError.

    py> func(**{a, b, c})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: func() argument after ** must be a mapping, not set
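
The same behaviour can be checked outside the interactive prompt. This sketch assumes a placeholder function `func` that accepts arbitrary keyword arguments. The exact wording of the message may vary between Python versions, so only the exception type is recorded.

```python
def func(**kwargs):
    return kwargs

a, b, c = 1, 2, 3

try:
    # Today {a, b, c} is an ordinary set display, and ** cannot unpack a set.
    func(**{a, b, c})
except TypeError:
    outcome = "TypeError"
else:
    outcome = "no error"

print(outcome)
```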

> And of course existing linters, IDEs, etc. will misunderstand the new
> syntax (which is worse than failing to parse it) until they’re taught
> the new special case.

Do you have any examples?

> This also raises an implementation issue. The grammar rule to
> disambiguate this will probably either be pretty hairy, or require
> building a parallel fork of half the expression tree so you can have
> an “expression except for set displays” node. Or there won’t be one,
> and it’ll be done as a special case post-parse hack, which Python uses
> sparingly.

Obviously if the implementation is hairy enough, that counts against the
proposal. But given that there is no realistic chance of this going into 
Python 3.9 (feature freeze is not far away), and Python 3.10 will be 
using the new PEG parser, let's not rule it out *just* yet.

-- 
Steven

Steven D'Aprano