On Sat, Dec 04, 2021 at 03:14:46PM +1100, Chris Angelico wrote:
> Lots and lots and lots of potential problems. Consider:
>
>     def f():
>         a = 1
>         def f(b, x=>a+b):
>             def g(): return x, a, b
>
> Both a and b are closure variables - one because it comes from an outer scope, one because it's used in an inner scope. So to evaluate a+b, you have to look up an existing closure cell, AND construct a new closure cell.
>
> The only way to do that is for the compiled code of a+b to exist entirely within the context of f's code object.
I dispute that is the only way. Let's do a thought experiment.

First, we add a new flag to the co_flags field on code objects. Call it the "LB" flag, for late-binding.

Second, we make this:

    def f(b, x=>a+b): ...

syntactic sugar for this:

    def f(b, x=lambda b: a+b): ...

except that the lambda has the LB flag set.

And third, when the interpreter fetches a default from func.__defaults__, if it is a LB function, it automatically calls that function with the parameters to the left of x (which in this case would be just b).

Here's your function, with a couple of returns to make it actually do something:

    def f():
        a = 1
        def f(b, x=>a+b):
            def g(): return x, a, b
            return g
        return f

We can test that right now (well, almost all of it) with this:

    def func():  # change of name to distinguish inner and outer f
        a = 1
        def f(b, x=lambda b: a+b):
            def g(): return x, a, b
            return g
        return f

and just pretend that x is automatically evaluated by the interpreter. But as a proof of concept, it's enough that we can demonstrate that *we* can manually evaluate it, by calling the lambda.

We can call func() to get the inner function f, and call f to get g:

    >>> f = func()
    >>> print(f)
    <function func.<locals>.f at 0x7fc945c41f30>
    >>> g = f(100)
    >>> print(g)
    <function func.<locals>.f.<locals>.g at 0x7fc945e1f520>

Calling g works:

    >>> print(g())
    (<function func.<locals>.<lambda> at 0x7fc945c40f70>, 1, 100)

with the understanding that the real implementation will have automatically called that lambda, so we would have got 101 instead of the lambda. That step requires interpreter support, so for now we just have to pretend that we get (101, 1, 100) instead of the lambda.
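
The interpreter step can be roughly emulated in pure Python today. The following is only a sketch under stated assumptions: the marker class `LateBound` and the decorator `late_defaults` are hypothetical names invented for illustration (they stand in for the proposed LB flag and the interpreter support, and are not part of Python or of the proposal), and the sketch only handles ordinary positional-or-keyword parameters.

```python
import functools
import inspect


class LateBound:
    """Hypothetical stand-in for a function that has the "LB" flag set."""

    def __init__(self, func):
        self.func = func


def late_defaults(func):
    """Hypothetical decorator playing the role of the interpreter."""
    sig = inspect.signature(func)
    names = list(sig.parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for i, name in enumerate(names):
            if name in bound.arguments:
                continue  # the caller supplied this argument
            default = sig.parameters[name].default
            if isinstance(default, LateBound):
                # Call the default with the parameters to its left.
                left = [bound.arguments[n] for n in names[:i]]
                bound.arguments[name] = default.func(*left)
            elif default is not inspect.Parameter.empty:
                # Ordinary early-bound default: use it unchanged.
                bound.arguments[name] = default
        return func(*bound.args, **bound.kwargs)

    return wrapper


def func():
    a = 1

    @late_defaults
    def f(b, x=LateBound(lambda b: a + b)):
        def g():
            return x, a, b
        return g

    return f


g = func()(100)
print(g())  # (101, 1, 100)
```

With the decorator doing the call, g() yields 101 for x rather than the lambda, which is the result the real implementation would be expected to give.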

But we can demonstrate that calling the lambda works, by manually calling it:

    >>> x = g()[0]
    >>> print(x)
    <function func.<locals>.<lambda> at 0x7fc945c40f70>
    >>> print(x(100))  # the interpreter knows that b=100
    101

Now let's see if we can extract the default and play around with it:

    >>> default_expression = f.__defaults__[0]
    >>> print(default_expression)
    <function func.<locals>.<lambda> at 0x7fc945c40f70>

The default expression is just a function (with the new LB flag set). So we can inspect its name, its arguments, its cell variables, etc:

    >>> default_expression.__closure__
    (<cell at 0x7fc945de74f0: int object at 0x7fc94614c0f0>,)

We can do anything that we could do with any other function object.

Can we evaluate it? Of course we can. And we can test it with any value we like, we're not limited to the value of b that we originally passed to f.

    >>> default_expression(3000)
    3001

Of course, if we are in a state of *maximal ignorance* we might have no clue what information is needed to evaluate that default expression:

    >>> default_expression()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: func.<locals>.<lambda>() missing 1 required positional argument: 'b'

Oh look, we get a useful diagnostic message for free!

What are we missing? The source code of the original expression, as text. That's pretty easy too: the compiler knows the source, it can cram it into the default expression object:

    >>> default_expression.__expression__ = 'a+b'

Introspection tools like help() can learn to look for that.

What else are we missing? A cool repr.

    >>> print(default_expression)  # Simulated.
    <late bound default expression a+b>

We can probably come up with a better repr, and a better name than "late bound default expression". We already have other co_flags that change the repr:

    32    GENERATOR
    128   COROUTINE
    256   ITERABLE_COROUTINE

so we need a name that is at least as cool as "generator" or "coroutine".
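
The simulated repr and the `__expression__` attribute can be mocked up without interpreter support. This is only a sketch: a plain function's repr cannot be changed from Python code, so it assumes a hypothetical wrapper class, here called `LateBoundDefault`, whereas the proposal would put both on the function object itself.

```python
class LateBoundDefault:
    """Hypothetical wrapper standing in for an LB-flagged function."""

    def __init__(self, func, expression):
        self.func = func
        # Source text of the default, as the compiler would record it.
        self.__expression__ = expression

    def __call__(self, *args):
        # Evaluating the default is just calling the wrapped function.
        return self.func(*args)

    def __repr__(self):
        return f"<late bound default expression {self.__expression__}>"


a = 1
default_expression = LateBoundDefault(lambda b: a + b, "a+b")

print(default_expression)        # <late bound default expression a+b>
print(default_expression(3000))  # 3001
```

A tool that knows nothing about `__expression__` still gets something readable from the repr alone.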

Summary of changes:

* add a new co_flag with a cool name better than "LB";

* add an `__expression__` dunder to hold the default expression; (possibly missing for regular functions -- we don't necessarily need *every* function to have this dunder)

* change the repr of LB functions to display the expression;

* teach the interpreter to compile late-bound defaults into one of these LB functions, including the source expression;

* teach the interpreter that when retrieving default values from the function's `__defaults__`, if they are a LB function, it must call the function and use its return result as the actual default value;

* update help() and other introspection tools to handle these LB functions; but if any tools don't get updated, you still get a useful result with an informative repr.

--
Steve