[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2 Dec 2021

On Fri, Dec 3, 2021 at 9:26 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
...
On 2021-12-02 00:31, Chris Angelico wrote:
...
Here's how a ternary if looks:
...
...
...
...
>>> def f(n):
...     return 0 if n == 0 else 42/n
...
>>> dis.dis(f)
   2           0 LOAD_FAST                0 (n)
               2 LOAD_CONST               1 (0)
               4 COMPARE_OP               2 (==)
               6 POP_JUMP_IF_FALSE        6 (to 12)
               8 LOAD_CONST               1 (0)
              10 RETURN_VALUE
         >>   12 LOAD_CONST               2 (42)
              14 LOAD_FAST                0 (n)
              16 BINARY_TRUE_DIVIDE
              18 RETURN_VALUE
The "42/n" part is stored in f.__code__.co_code as the part that says
"LOAD_CONST 42, LOAD_FAST n, BINARY_TRUE_DIVIDE". It's not an object.
It's just code - three instructions.
Here's how (in the reference implementation - everything is subject to
change) a late-bound default looks:
...
...
...
...
>>> def f(x=>[]): print(x)
...
>>> dis.dis(f)
   1           0 QUERY_FAST               0 (x)
               2 POP_JUMP_IF_TRUE         4 (to 8)
               4 BUILD_LIST               0
               6 STORE_FAST               0 (x)
         >>    8 LOAD_GLOBAL              0 (print)
              10 LOAD_FAST                0 (x)
              12 CALL_FUNCTION            1
              14 POP_TOP
              16 LOAD_CONST               0 (None)
              18 RETURN_VALUE
The "=>[]" part is stored in f.__code__.co_code as the part that says
"QUERY_FAST x, and if false, BUILD_LIST, STORE_FAST x". It's not an
object. It's four instructions in the bytecode.
In both cases, no part of the expression is ever re-executed. I'm not
understanding the distinction here. Can you explain further please?
> Your explanation exactly shows how it IS re-executed.  I'm not totally
> clear on this disassembly since this is new behavior, but if I
> understand right, BUILD_LIST is re-executing the expression `[]` and
> STORE_FAST is re-assigning it to x.  The expression `[]` is
> syntactically present in the function definition but its execution has
> been shoved into the function body where it may be re-executed many
> times (any time the function is called without passing a value).

Ah, I think I get you. The problem is that code is in the def line but
is only executed when the function is called, is that correct? Because
the code would be "re-executed" just as much if it were written in the
function body. It's executed (at most) once for each call to the
function, just like the "42/n" branch of the ternary is.
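
To make "at most once per call" concrete, here is a rough sketch in
today's syntax. It emulates the late-bound default with a sentinel; the
counter and the make_list() helper are made up for illustration and are
not part of the proposal or the reference implementation.

```python
_SENTINEL = object()
calls = {"default_built": 0}

def make_list():
    # Stand-in for the default expression; counts how often it runs.
    calls["default_built"] += 1
    return []

def g(x=_SENTINEL):
    # Emulates a late-bound default using a sentinel check in the body.
    if x is _SENTINEL:
        x = make_list()
    return x

g(); g(); g()        # no argument: the default expression runs each time
g([1]); g([2])       # argument supplied: the default expression is skipped
print(calls["default_built"])
```

Three calls without an argument give three evaluations, and the two
calls that pass a value give none.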

I suppose that's a consideration, but it's not nearly as strong in
practice as you might think. A lot of people aren't even aware of the
difference between compilation time and definition time (even people
on this list have made that mistake). Function default args are
evaluated when the function is defined, not when it's called, and
that's what changes for late-bound defaults under this proposal; but
there are many other subtleties to execution order and timing that
don't really matter in practice.
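
For anyone unsure about compilation time versus definition time, this
small sketch separates the two for an ordinary (early-bound) default.
The note() helper and the events list are invented for the example.

```python
events = []

def note(label):
    # Records when the default expression is actually evaluated.
    events.append(label)
    return label

source = "def f(x=note('default evaluated')): return x\n"

code = compile(source, "<example>", "exec")  # compilation: nothing runs yet
events.append("compiled")

namespace = {"note": note}
exec(code, namespace)                        # definition: the def statement executes
events.append("defined")

namespace["f"]()                             # call: the stored default is reused
events.append("called")

print(events)
```

The default is evaluated between "compiled" and "defined", and not
again when the function is called.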

Perhaps the key point here is to consider function decorators. We
could avoid them altogether:

def f():
    @deco
    def g(x): ...

    def h(x): ...
    h = deco(h)

But as well as having the name replication problem, this buries
important information down in the body of the surrounding code, rather
than putting it at the signature of g/h where it belongs. Even though,
semantically, this is actually part of the body of f, we want to be
able to read it as part of the signature of g. Logically and
conceptually, it is part of the signature. Now compare these two:

def f2():
    _SENTINEL = object()
    def g(x=_SENTINEL):
        if x is _SENTINEL: x = []
        ...

    def h(x=>[]):
        ...

Which one has its signature where its signature belongs? Yes,
semantically, the construction of the empty list happens at function
call time, not at definition time. But what you're saying is: if there
are no args passed, behave as if a new empty list was passed. That's
part of the signature.

In neither case will you find an object representing the expression []
in the function's signature, because that's not an object, it's an
instruction to build an empty list. In the case of g, you can find a
meaningless and useless object stashed away in __defaults__, but that
doesn't tell you anything about the true behaviour of the function. At
least in the case of h, you can find the descriptive string "[]"
stashed there, which can tell a human what's happening.

ChrisA