[Python-ideas] Keyword only argument on function call

Steven D'Aprano steve at pearwood.info
Wed Sep 12 08:17:49 EDT 2018


On Tue, Sep 11, 2018 at 04:57:16PM +0100, Jonathan Fine wrote:

> Summary: locals() and suggestion __params__ are similar, and roughly
> speaking each can be implemented from the other.

You cannot get a snapshot of the current locals just from the function 
parameters, since the current locals will include variables which aren't 
parameters.

Likewise you cannot get references to the original function parameters 
from the current local variables, since the params may have been 
re-bound since the call was made.

(Unless you can guarantee that locals() is called immediately on entry 
to the function, before any new local variables are created and before 
any other code can run. As you point out further below.)
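To make the difference concrete, here is a small sketch. The function 
and the names at_entry, extra and later are made up for illustration. 
It copies locals() once on entry and once after a parameter has been 
re-bound and a new local created:

```python
def func(a, b):
    at_entry = dict(locals())   # copy taken before any other code runs
    a = a * 10                  # re-bind a parameter
    extra = "not a parameter"   # create a new local variable
    later = dict(locals())      # copy taken after the changes
    return at_entry, later

at_entry, later = func(1, 2)
print(at_entry)        # only the parameters, with their original values
print(sorted(later))   # parameters plus the other locals
print(later["a"])      # the re-bound value, not the argument passed in
```

Only the copy taken on entry matches the arguments that were passed in; 
the later one has lost the original value of "a" and gained names that 
were never parameters.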

There's a similarity only in the sense that parameters of a function are 
included as local variables, but the semantics of __params__ as proposed 
and locals() are quite different. They might even share some parts of 
implementation, but I don't think that really matters one way or 
another. Whether they do or don't is a mere implementation detail.


> Experts / pedants 
> would prefer not to use the name __params__ for this purpose.

I consider myself a pedant (and on a good day I might pass as something 
close to an expert on some limited parts of Python) and I don't have any 
objection to the *name* __params__. From the perspective of *inside* a 
function, it is a matter of personal taste whether you refer to 
parameter or argument:

    def func(a):  # in the declaration, "a" is a parameter
        # inside the running function, once "a" has a value set,
        # it's a matter of taste whether you call it a parameter
        # or an argument or both; I suppose it depends on whether
        # you are referring to the *variable* or its *value*

    # but here 1 is the argument bound to the parameter "a"
    result = func(1)  

It is the semantics that I think are problematic, not the choice of 
name.


> Steve D'Aprano wrote:
> > Its also going to suffer from race conditions, unless someone much
> > cleverer than me can think of a way to avoid them which doesn't slow
> > down function calls even more.
> 
> As far as I know, locals() does not suffer from a race condition. But
> it's not a local variable. Rather, it's a function that returns a
> dict. Hence avoiding the race condition.

Indeed. Each time you call locals(), it returns a new dict with a 
snapshot of the current local namespace. Because it all happens inside 
the same function call, no external thread can poke inside your current 
call to mess with your local variables.
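As a rough illustration (the barrier, the results dict and the thread 
names are my own scaffolding, not part of any proposal), here are three 
threads held inside the same function at the same moment, each taking a 
copy of its own locals:

```python
import threading

results = {}
barrier = threading.Barrier(3)

def function(a, b):
    snapshot = dict(locals())   # this call's own view of its locals
    barrier.wait()              # keep all three calls inside the function at once
    results[threading.current_thread().name] = snapshot

calls = {"A": (1, 2), "B": (5, 8), "C": (3, 4)}
threads = [
    threading.Thread(target=function, args=args, name=name)
    for name, args in calls.items()
]
for t in threads:
    t.start()
for t in threads:
    t.join()

for name in sorted(results):
    print(name, results[name])
```

Each thread sees only the arguments it passed itself, even though all 
three calls overlap, because the local namespace belongs to the 
individual call rather than to the function object.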

But that's different from setting function.__params__ to the passed-in 
arguments. By definition, each external caller passes in its own set of 
arguments. Suppose you have three calls to the function:

    function(a=1, b=2)  # called by A
    function(a=5, b=8)  # called by B
    function(a=3, b=4)  # called by C

In single-threaded code, there's no problem here: 

    A makes the first call;
    the interpreter sets function.__params__ to A's arguments;
    the function runs with A's arguments and returns;

    only then can B make its call;
    the interpreter sets function.__params__ to B's arguments;
    the function runs with B's arguments and returns;

    only then can C make its call;
    the interpreter sets function.__params__ to C's arguments;
    the function runs with C's arguments and returns


but in multi-threaded code, unless there's some form of locking, the 
three sets can interleave in any unpredictable order, e.g.:

    A makes its call;
    B makes its call;
    the interpreter sets function.__params__ to B's arguments;
    the interpreter sets function.__params__ to A's arguments;
    the function runs with B's arguments and returns;
    C makes its call;
    the interpreter sets function.__params__ to C's arguments;
    the function runs with A's arguments and returns;
    the function runs with C's arguments and returns.
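The interleaving above can be forced deliberately. In this sketch the 
helper call() is a hypothetical stand-in for whatever the interpreter 
would do on each call, and the barrier is only there to make the bad 
ordering happen every time instead of occasionally:

```python
import threading

barrier = threading.Barrier(2)
seen = {}

def function(a, b):
    barrier.wait()   # by now both callers have stored their arguments
    seen[threading.current_thread().name] = function.__params__

def call(**kwargs):
    # Hypothetical stand-in for the interpreter's per-call bookkeeping.
    function.__params__ = kwargs
    function(**kwargs)

threads = [
    threading.Thread(target=call, kwargs={"a": 1, "b": 2}, name="A"),
    threading.Thread(target=call, kwargs={"a": 5, "b": 8}, name="B"),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen["A"])
print(seen["B"])
```

Both threads report the same dict, namely whichever one was stored 
last, so one of the two calls is looking at the other caller's 
arguments.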


We could solve this race condition with locking, or by making the pair 
of steps:

    the interpreter sets function.__params__
    the function runs and returns

a single atomic step. But that forces the calls to run one at a time: 
once A calls function(), threads B and C will pause (potentially for a 
very long time) waiting for A's call to complete, before they can call 
the same function.
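Here is a sketch of the locking approach, again with a hypothetical 
call() helper standing in for the interpreter. The counters active and 
max_active are my own additions, there only to show how many calls are 
ever inside the function at once:

```python
import threading
import time

lock = threading.Lock()
active = 0
max_active = 0
seen = {}

def function(a, b):
    global active, max_active
    active += 1
    max_active = max(max_active, active)
    time.sleep(0.05)   # stand-in for real work
    seen[threading.current_thread().name] = function.__params__
    active -= 1

def call(**kwargs):
    # Hypothetical: "set __params__" and "run the function" as one step.
    with lock:
        function.__params__ = kwargs
        function(**kwargs)

calls = {"A": {"a": 1, "b": 2}, "B": {"a": 5, "b": 8}, "C": {"a": 3, "b": 4}}
threads = [
    threading.Thread(target=call, kwargs=kw, name=name)
    for name, kw in calls.items()
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)
print("calls running at once:", max_active)
```

Every call now sees its own arguments, but only one call is ever 
running at a time, so the other threads simply queue up behind it.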

I'm not an expert on threaded code, so it is possible I've missed some 
non-obvious fix for this, but I expect not. In general, solving race 
conditions without deadlocks is a hard problem.


> Python has some keyword identifiers. Here's one
> 
>     >>> __debug__ = 1
>     SyntaxError: assignment to keyword
> 
> 
> Notice that this is a SYNTAX error.  If __params__ were similarly a
> keyword identifier, then it would avoid the race condition.

The problem isn't that the caller assigns to __params__ manually. At 
no stage does Python code need to try setting "__params__ = x"; in fact 
that ought to be quite safe, because it would only be a local variable.

The race condition problem comes from trying to set function.__params__ 
on each call, even if it's the interpreter doing the setting.
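A quick check of both halves of that, as things stand today. The flag 
name "raised" is mine, and the exact error message varies between 
Python versions, so I only print it:

```python
raised = False
try:
    # Assigning to __debug__ is rejected when the code is compiled.
    compile("__debug__ = 1", "<example>", "exec")
except SyntaxError as exc:
    raised = True
    print("SyntaxError:", exc.msg)

def func(a):
    __params__ = "just a local"   # today this is an ordinary local variable
    return __params__

value = func(1)
print(value)
print(hasattr(func, "__params__"))   # the function object itself is untouched
```

Assigning to the name inside the function body touches nothing outside 
that one call, which is why it is harmless; the shared state only 
appears once something stores the arguments on the function object.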


> It would
> simply be a handle that allows, for example, key-value access to the
> state of the frame on the execution stack. In other words, a
> lower-level object from which locals() could be built.

That wouldn't have the proposed semantics. __params__ is supposed to be 
a dict showing the initial values of the arguments passed in to the 
function, not merely a reference to the current frame.


[...]
> In my opinion, the technically well-informed would prefer something
> like __args__ or __locals__ instead of __params__, for the current
> purpose.

Oh well, that puts me in my place :-)

I have no objection to __args__, but __locals__ would be very 
inappropriate, as locals refers to *all* the local variables, not just 
those which are declared as parameters.

(Parameters are a *subset* of locals.)
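This is easy to confirm from the outside, using the signature for the 
parameters and the code object for the full set of local names (the 
function here is just an example of mine):

```python
import inspect

def func(a, b=2):
    total = a + b   # a local that is not a parameter
    return total

params = set(inspect.signature(func).parameters)
local_names = set(func.__code__.co_varnames)

print(sorted(params))
print(sorted(local_names))
print(params < local_names)   # strict subset
```

The name "total" is a local but not a parameter, so anything called 
__locals__ would promise more than the arguments alone.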


> Finally, __params__ would simply be the value of __locals__ before any
> assignment has been done.

Indeed.

As Chris (I think it was) pointed out, we could reduce the cost of this 
with a bit of compiler magic. A function that never refers to __params__ 
would run just as it does today. For example, this function:

    def func(a):
        print(a)


would compile to something like this:

  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_FAST                0 (a)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE


just as it does now. But if the compiler sees a reference to __params__ 
in the body, it could compile in special code like this:

    def func(a):
        print(a, __params__)


  2           0 LOAD_GLOBAL              0 (locals)
              2 CALL_FUNCTION            0
              4 STORE_FAST               1 (__params__)

  3           6 LOAD_GLOBAL              1 (print)
              8 LOAD_FAST                0 (a)
             10 LOAD_FAST                1 (__params__)
             12 CALL_FUNCTION            2
             14 POP_TOP
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE


Although more likely we'd want a special op-code to populate 
__params__, rather than calling the built-in locals() function.
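In the meantime the effect can be approximated by hand. This is only a 
sketch of what the compiler-inserted prologue would amount to, written 
as ordinary Python, with the explicit first line being my stand-in for 
the magic:

```python
def func(a):
    # Hand-written stand-in for the proposed compiler-inserted prologue.
    __params__ = dict(locals())
    a = a + 100   # later re-binding does not affect the snapshot
    return a, __params__

result = func(1)
print(result)
```

Because the snapshot is taken before anything else in the body runs, it 
contains just the parameters with the values originally passed in, and 
it lives in the call's own local namespace rather than on the function 
object.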

I don't think that's a bad idea, but it does add more compiler magic, 
and I'm not sure that there is sufficient justification for it.



-- 
Steve

