[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

4 Dec 2021

On Sun, Dec 5, 2021 at 11:34 AM Steven D'Aprano <steve@pearwood.info> wrote:
...
On Sat, Dec 04, 2021 at 10:50:14PM +1100, Chris Angelico wrote:
...
syntactic sugar for this:
def f(b, x=lambda b: a+b): ...
except that the lambda has the LB flag set.
Okay. So the references to 'a' and 'b' here are one more level of
function inside the actual function we're defining, which means you're
paying the price of nonlocals just to be able to late-evaluate
defaults. Not a deal-breaker, but that is a notable cost (every
reference to them inside the function will be slower).
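As a rough illustration of the desugaring quoted above, here is a sketch that emulates it in current Python. `LateBound`, `late_defaults` and `make_f` are hypothetical names invented for this example; they stand in for the proposed LB flag and the interpreter's call-time check, and are not part of PEP 671 or of either design's real implementation. Note that `a` has to be a closure variable of the default function, which is the nonlocal cost being discussed.

```python
import functools
import inspect


class LateBound:
    """Hypothetical marker: wraps a function that computes a default late."""

    def __init__(self, func):
        self.func = func


def late_defaults(func):
    """Hypothetical decorator: call LateBound defaults at call time."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            if name in bound.arguments:
                continue  # caller supplied a value; nothing to compute
            if isinstance(param.default, LateBound):
                # Pass the default function only the arguments it asks for.
                wanted = inspect.signature(param.default.func).parameters
                values = {k: bound.arguments[k] for k in wanted}
                bound.arguments[name] = param.default.func(**values)
        return func(*bound.args, **bound.kwargs)

    return wrapper


def make_f(a):
    # 'a' is captured by the lambda as a closure (nonlocal) variable.
    @late_defaults
    def f(b, x=LateBound(lambda b: a + b)):
        return b, x

    return f


if __name__ == "__main__":
    f = make_f(10)
    print(f(1))     # default computed from a and b at call time
    print(f(1, 5))  # explicit argument wins
```

This only checks the type of the default, so it corresponds to the simpler trigger condition, not the "flagged parameter AND LB function" variant.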
How much slower? By my tests:
- access to globals is 25% more expensive than access to locals;
- access to globals is 19% more expensive than nonlocals;
- and nonlocals are 6% more expensive than locals.
Or if you do the calculation the other way (the percentages don't match
because the denominators are different):
- locals are 20% faster than globals;
- and 5% faster than nonlocals;
- nonlocals are 16% faster than globals.
Premature optimization is the root of all evil.
We would be better off spending effort making nonlocals faster for
everyone than throwing out desirable features and a cleaner design just
to save 5% on a microbenchmark.
Fair, but the desirable feature can be achieved without this cost, and
IMO your design isn't cleaner than the one I'm already using, and 5%
is a lot for no benefit.
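For anyone wanting to check figures like the ones quoted above on their own machine, a minimal timeit sketch follows. The function names and loop sizes are made up for illustration, and the ratios depend on the Python version and hardware, so no expected output is shown.

```python
import timeit

G = 1  # module-level global read by use_global()


def use_local():
    x = 1  # plain local variable
    total = 0
    for _ in range(1000):
        total += x
    return total


def make_use_nonlocal():
    x = 1  # becomes a cell variable because the inner function reads it

    def use_nonlocal():
        total = 0
        for _ in range(1000):
            total += x  # closure (nonlocal) read
        return total

    return use_nonlocal


use_nonlocal = make_use_nonlocal()


def use_global():
    total = 0
    for _ in range(1000):
        total += G  # global read
    return total


def measure(number=2000, repeat=5):
    """Return the best-of-`repeat` time in seconds for each access style."""
    funcs = (use_local, use_nonlocal, use_global)
    return {
        f.__name__: min(timeit.repeat(f, number=number, repeat=repeat))
        for f in funcs
    }


if __name__ == "__main__":
    times = measure()
    base = times["use_local"]
    for name, t in times.items():
        print(f"{name:13s} {t:.4f}s  ({t / base:.2f}x locals)")
```

The loop bookkeeping is included in every measurement, so the differences this reports understate the per-access gap.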
[...]
What this does mean, though, is that there are "magic objects" that
cannot be used like other objects.
NotImplemented says hello :-)
Good point. Still, I don't think we want more magic like that.
...
And if you still think that we should care, we can come up with a more
complex trigger condition:
- the parameter was flagged as using a late-default;
- AND the default is a LB function.
Problem solved. Now you can use LB functions as early-bound defaults,
and all it costs is to record and check a flag for each parameter. Is it
worth it? Dunno.
Uhh.... so..... the parameter has to be flagged AND the value has to
be flagged? My current proposal just flags the parameter. So I ask
again: what are you gaining by this change? You've come right back to
where you started, and added extra costs and requirements, all for....
what?
[...]
The default expression is just a function (with the new LB flag set). So
we can inspect its name, its arguments, its cell variables, etc:
>>> default_expression.__closure__
    (<cell at 0x7fc945de74f0: int object at 0x7fc94614c0f0>,)
We can do anything that we could do with any other function object.
Yup. As long as it doesn't include any assignment expressions, or
anything else that would behave differently.
I don't get what you mean here. Functions with the walrus operator are
still just functions that we can introspect:
...
...
...
>>> f = lambda a, b: (len(w:=str(a))+b)*w
>>> f('spam', 2)
'spamspamspamspamspamspam'
>>> f.__code__
<code object <lambda> at 0x7fc945e07c00, file "<stdin>", line 1>
What sort of "behave differently" do you think would prevent us from
introspecting the function object? "Differently" from what?
Wrapping it in a function means the walrus would assign in that
function's context, not the outer function. I think it'd be surprising
if this works:

def f(x=>(a:=1)+a): # default is 2

but this doesn't:

def g(x=>(a:=1), y=>a): # default is UnboundLocalError

It's not a showstopper, but it is most definitely surprising.
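The scoping rule behind that surprise can be seen in current Python, using plain lambdas as stand-ins for the separate default functions. The names below are illustrative only. In this emulation the second case fails with a NameError, since the lambdas are ordinary functions and `a` is never bound anywhere the second one can see.

```python
def single_default():
    a = "outer"
    # The walrus binds 'a' inside the lambda's own scope, not in single_default.
    x_default = lambda: (a := 1) + a
    return x_default(), a


def two_defaults():
    # Each default is its own function, so each has its own scope.
    x_default = lambda: (a := 1)
    y_default = lambda: a  # this 'a' is NOT the one assigned in x_default
    x = x_default()
    try:
        y = y_default()
    except NameError as exc:
        y = type(exc).__name__
    return x, y


if __name__ == "__main__":
    print(single_default())  # (2, 'outer')
    print(two_defaults())    # (1, 'NameError')
```

The outer `a` in `single_default` is untouched, which is the "assigns in that function's context" behaviour described above.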

The obvious solution is to say that, in this context, a is a nonlocal.
But this raises a new problem: The function object, when created, MUST
know its context. A code object says "this is a nonlocal", and a
function object says "when I'm called, this is my context". Which
means you can't have a function object that gets called externally,
because it's the code, not the function, that is what you need here.
And that means it's not directly executable, but it needs a context.
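The code-versus-function distinction can be shown with ordinary closures in current Python. This is only a sketch of the general mechanism, not of how either proposal would be implemented; `make_default`, `make_cell` and `rebind` are names invented for the example.

```python
import types


def make_default():
    a = 1

    def default():
        return a + 1  # 'a' is a free variable of this code object

    return default


def make_cell(value):
    """Build a closure cell holding `value` (illustrative helper)."""
    return (lambda: value).__closure__[0]


def rebind(code, value):
    """Pair the same code object with a different context for 'a'."""
    return types.FunctionType(
        code, globals(), code.co_name, None, (make_cell(value),)
    )


def runs_without_context(code):
    """A code object with free variables cannot simply be exec()'d."""
    try:
        exec(code)
    except TypeError:
        return False
    return True


default = make_default()
code = default.__code__

if __name__ == "__main__":
    print(code.co_freevars)            # ('a',)
    print(default())                   # 2
    print(rebind(code, 41)())          # 42
    print(runs_without_context(code))  # False
```

The code object only records that `a` is free; it takes a function object, with its closure cells, to say which `a` is meant.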

So, once again, we come right back around to what I have already: code
that you can't lift out and call externally. The difference is that,
under your proposal, there's a lot more overhead, for the benefit of
possibly being able, under some very specific circumstances, to
synthesize the result.
...
We also need to
have these special functions that get stored as separate code objects.
That's not a cost, that's a feature. Seriously. We're doing that so that
we can introspect them individually, not just as the source string, but
as actual callable objects that can be:
- introspected;
- tested;
- monkey-patched and modified in place (to the degree that any function
  can be modified, which is not a lot);
- copied or replaced with a new function.
Testing is probably the big one. Test frameworks will soon develop a way
to let you write tests to confirm that your late bound defaults do what
you expect them to do.
That's trivial for `arg=[]` expressions, but for complex expressions in
complex functions, being able to isolate them for testing is a big plus.
I'm still not convinced that it's as useful as you say. Compare these
append-and-return functions:

def build1(value, lst=None):
    if lst is None: lst = []
    lst.append(value)
    return lst

_SENTINEL = object()
def build2(value, lst=_SENTINEL):
    if lst is _SENTINEL: lst = []
    lst.append(value)
    return lst

def hidden_sentinel():
    _SENTINEL = object()
    def build3(value, lst=_SENTINEL):
        if lst is _SENTINEL: lst = []
        lst.append(value)
        return lst
    return build3
build3 = hidden_sentinel()

def build4(value, *optional_lst):
    if len(optional_lst) > 1: raise TypeError("too many args")
    if not optional_lst: optional_lst = [[]]
    optional_lst[0].append(value)
    return optional_lst[0]

def build5(value, lst=>[]):
    lst.append(value)
    return lst

(Have I missed any other ways to do this?)

In which of them can you introspect the []? Three of them have an
externally-visible sentinel, but you can't usefully change it in any
way. You can look at it, and you'll see None or "<object object at
0xYourCapitalCity>", but that doesn't tell you that you'll get a new
empty list. In which of them do you have a thing you can call that
will generate an equivalent empty list?

In which can you monkey-patch or modify-in-place the []? Not a big
deal though, you admit that there wouldn't be much monkey-patching in
any form. But you get none whatsoever with other idioms.

In which of these can you separately test the []? Do any current test
frameworks have a way to let you write tests to make sure that these
do what you expect?

In which of these can you copy the [] into another context, or replace
the [] with something else?

Why are these such major problems for build5 if they're not for the other four?

ChrisA