On Sat, Dec 4, 2021 at 8:48 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Dec 04, 2021 at 03:14:46PM +1100, Chris Angelico wrote:
Lots and lots and lots of potential problems. Consider:
    def f():
        a = 1
        def f(b, x=>a+b):
            def g(): return x, a, b
Both a and b are closure variables - one because it comes from an outer scope, one because it's used in an inner scope. So to evaluate a+b, you have to look up an existing closure cell, AND construct a new closure cell.
The only way to do that is for the compiled code of a+b to exist entirely within the context of f's code object.
I dispute that is the only way. Let's do a thought experiment.
First, we add a new flag to the co_flags field on code objects. Call it the "LB" flag, for late-binding.
Second, we make this:
def f(b, x=>a+b): ...
syntactic sugar for this:
def f(b, x=lambda b: a+b): ...
except that the lambda has the LB flag set.
Okay. So the references to 'a' and 'b' here are one more level of function inside the actual function we're defining, which means you're paying the price of nonlocals just to be able to late-evaluate defaults. Not a deal-breaker, but that is a notable cost (every reference to them inside the function will be slower).
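To make the "price of nonlocals" concrete, here is a small sketch (not from the original thread) showing how CPython records a parameter as a cell variable as soon as a nested function refers to it. The function names `plain` and `with_inner` are made up for illustration.

```python
def plain(b):
    # b is an ordinary fast local here.
    return b + 1


def with_inner(b):
    # The nested lambda closes over b, so b is stored in a cell
    # and every access to it inside with_inner goes through that cell.
    inner = lambda: b
    return b + 1


print(plain.__code__.co_cellvars)       # no cell variables
print(with_inner.__code__.co_cellvars)  # b is listed as a cell variable
```

Both functions return the same value; only the way `b` is stored differs.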
And third, when the interpreter fetches a default from func.__defaults__, if it is a LB function, it automatically calls that function with the parameters to the left of x (which in this case would be just b).
Plausible. Okay. What this does mean, though, is that there are "magic objects" that cannot be used like other objects. Consider:

    def make_printer(dflt):
        def func(x=dflt):
            print("x is", x)
        return func

Will make_printer behave the same way for all objects? Clearly the expectation is that it will display the repr of whichever object is passed to func, or if none is, whichever object is passed to make_printer. But if you pass it a function with the magic LB flag set, it will *execute* that function. I don't like the idea that some objects will be invisibly different like that.
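A rough sketch of that concern, under loudly stated assumptions: the `late_bound` attribute and the `fetch_default` helper below are hypothetical stand-ins for the proposed LB flag and for the interpreter's default lookup; neither exists in Python. `func` also returns `x` here (the original only prints it) so the result can be checked.

```python
def make_printer(dflt):
    def func(x=dflt):
        print("x is", x)
        return x  # added for this sketch so the value can be inspected
    return func


def callback():
    return "called!"


def fetch_default(value):
    # Hypothetical emulation of the proposed lookup: a flagged object
    # is called instead of being used as the default itself.
    if getattr(value, "late_bound", False):
        return value()
    return value


# Today every object is treated alike, functions included.
printer = make_printer(callback)
result_today = printer()

# Under the emulated rule, the same stored object behaves differently
# once it carries the (hypothetical) flag.
callback.late_bound = True
stored = printer.__defaults__[0]
result_emulated = fetch_default(stored)
print(result_emulated)
```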
Here's your function, with a couple of returns to make it actually do something:
    def f():
        a = 1
        def f(b, x=>a+b):
            def g(): return x, a, b
            return g
        return f
We can test that right now (well, almost all of it) with this:
    def func():  # change of name to distinguish inner and outer f
        a = 1
        def f(b, x=lambda b: a+b):
            def g(): return x, a, b
            return g
        return f
and just pretend that x is automatically evaluated by the interpreter. But as a proof of concept, it's enough that we can demonstrate that *we* can manually evaluate it, by calling the lambda.
Okay, sure. It's a bit hard to demo it (since it has to ONLY do that magic if the arg was omitted), but sure, we can pretend.
We can call func() to get the inner function f, and call f to get g:
    >>> f = func()
    >>> print(f)
    <function func.<locals>.f at 0x7fc945c41f30>

    >>> g = f(100)
    >>> print(g)
    <function func.<locals>.f.<locals>.g at 0x7fc945e1f520>
Calling g works:
    >>> print(g())
    (<function func.<locals>.<lambda> at 0x7fc945c40f70>, 1, 100)
with the understanding that the real implementation will have automatically called that lambda, so we would have got 101 instead of the lambda. That step requires interpreter support, so for now we just have to pretend that we get
    (101, 1, 100)
instead of the lambda. But we can demonstrate that calling the lambda works, by manually calling it:
    >>> x = g()[0]
    >>> print(x)
    <function func.<locals>.<lambda> at 0x7fc945c40f70>
    >>> print(x(100))  # the interpreter knows that b=100
    101
Now let's see if we can extract the default and play around with it:
    >>> default_expression = f.__defaults__[0]
    >>> print(default_expression)
    <function func.<locals>.<lambda> at 0x7fc945c40f70>
The default expression is just a function (with the new LB flag set). So we can inspect its name, its arguments, its cell variables, etc:
    >>> default_expression.__closure__
    (<cell at 0x7fc945de74f0: int object at 0x7fc94614c0f0>,)

We can do anything that we could do with any other function object.
Yup. As long as it doesn't include any assignment expressions, or anything else that would behave differently.
Can we evaluate it? Of course we can. And we can test it with any value we like, we're not limited to the value of b that we originally passed to func().
    >>> default_expression(3000)
    3001
Of course, if we are in a state of *maximal ignorance* we might have no clue what information is needed to evaluate that default expression:
    >>> default_expression()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: func.<locals>.<lambda>() missing 1 required positional argument: 'b'
Oh look, we get a useful diagnostic message for free!
What are we missing? The source code of the original expression, as text. That's pretty easy too: the compiler knows the source, it can cram it into the default expression object:
    >>> default_expression.__expression__ = 'a+b'
Introspection tools like help() can learn to look for that.
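As a quick check that this part needs no new machinery, the sketch below attaches the source text to an ordinary lambda by hand. Function objects already accept arbitrary attributes; the only assumption is the attribute name `__expression__`, which is the proposal's and carries no special meaning in Python today.

```python
def func():
    a = 1

    def f(b, x=lambda b: a + b):
        return x
    return f


default_expression = func().__defaults__[0]

# Plain attribute assignment on a function object; in the proposal the
# compiler would do this, here it is done manually.
default_expression.__expression__ = "a+b"

print(default_expression.__expression__)
print(default_expression(3000))
```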
What else are we missing? A cool repr.
    >>> print(default_expression)  # Simulated.
    <late bound default expression a+b>
We can probably come up with a better repr, and a better name than "late bound default expression". We already have other co_flags that change the repr:
    32   GENERATOR
    128  COROUTINE
    256  ITERABLE_COROUTINE
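For reference, these existing flags can be read back through the `inspect` module's `CO_*` constants. The sketch below only inspects flags that exist today; it does not invent a value for the proposed LB flag, and the helper names `gen` and `coro` are made up.

```python
import inspect


def gen():
    yield 1


async def coro():
    return 1


# The constants match the numeric values quoted above.
print(inspect.CO_GENERATOR, inspect.CO_COROUTINE, inspect.CO_ITERABLE_COROUTINE)

# co_flags is a bit field, so each flag is tested with a bitwise AND.
is_generator = bool(gen.__code__.co_flags & inspect.CO_GENERATOR)
is_coroutine = bool(coro.__code__.co_flags & inspect.CO_COROUTINE)
print(is_generator, is_coroutine)
```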
so we need a name that is at least as cool as "generator" or "coroutine".
Those parts are trivial, no problem.
Summary of changes:
* add a new co_flag with a cool name better than "LB";
* add an `__expression__` dunder to hold the default expression; (possibly missing for regular functions -- we don't necessarily need *every* function to have this dunder)
* change the repr of LB functions to display the expression;
* teach the interpreter to compile late-bound defaults into one of these LB functions, including the source expression;
* teach the interpreter that when retrieving default values from the function's `__defaults__`, if they are a LB function, it must call the function and use its return result as the actual default value;
* update help() and other introspection tools to handle these LB functions; but if any tools don't get updated, you still get a useful result with an informative repr.
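The changes listed above can be roughly approximated in pure Python today, which may help when weighing the objections that follow. This is only a sketch under stated assumptions: the `LateBound` wrapper class stands in for a function with the LB flag, and the `late_defaults` decorator stands in for the interpreter support; both names are hypothetical. The decorator calls a late-bound default only when the argument was actually omitted, passing it the parameters to its left.

```python
import functools
import inspect


class LateBound:
    """Hypothetical stand-in for a function compiled with the LB flag."""

    def __init__(self, func, expression):
        self.func = func
        self.__expression__ = expression  # source text of the default

    def __repr__(self):
        return f"<late bound default expression {self.__expression__}>"


def late_defaults(func):
    """Emulate the interpreter calling LB defaults for omitted arguments."""
    sig = inspect.signature(func)
    names = list(sig.parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        # Record what the caller left out before filling in defaults.
        omitted = [n for n in names if n not in bound.arguments]
        bound.apply_defaults()
        for name in omitted:
            default = sig.parameters[name].default
            if isinstance(default, LateBound):
                # Call with the parameters to the left of this one.
                left = names[:names.index(name)]
                bound.arguments[name] = default.func(
                    *(bound.arguments[n] for n in left))
        return func(*bound.args, **bound.kwargs)

    return wrapper


def func():
    a = 1

    @late_defaults
    def f(b, x=LateBound(lambda b: a + b, "a+b")):
        def g():
            return x, a, b
        return g
    return f


f = func()
print(f(100)())     # x omitted: computed from b
print(f(100, 5)())  # x supplied: used as-is
print(inspect.signature(f).parameters["x"].default)
```

Because the emulation checks which arguments were omitted rather than what type the value has, a `LateBound` object passed explicitly is left alone. It says nothing about the performance cost or the assignment-expression cases raised in the thread.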
Great. So now we have some magnificently magical behaviour in the language, which will have some nice sharp edge cases, but which nobody will ever notice. Totally. I'm sure. Plus, we pay a performance price in any function that makes use of argument references, not just for the late-bound default, but in the rest of the code. We also need to have these special functions that get stored as separate code objects.

All to buy what, exactly? The ability to manually synthesize an equivalent parameter value, as long as there's no assignment expressions, no mutation, no other interactions, etc, etc, etc? That's an awful lot of magic for not a lot of benefit.

I *really* don't like the idea that some types of object will be executed instead of being used, just because they have a flag set. That strikes me as the sort of thing that should be incredibly scary, but since I can't think of any specific reasons, I just have to call it "extremely off-putting".

But hey. Go ahead and build a reference implementation. I'll compile it and give it a whirl.

ChrisA