[Python-ideas] Keyword only argument on function call
Steven D'Aprano
steve at pearwood.info
Wed Sep 12 08:17:49 EDT 2018
On Tue, Sep 11, 2018 at 04:57:16PM +0100, Jonathan Fine wrote:
> Summary: locals() and suggestion __params__ are similar, and roughly
> speaking each can be implemented from the other.
You cannot get a snapshot of the current locals just from the function
parameters, since the current locals will include variables which aren't
parameters.
Likewise you cannot get references to the original function parameters
from the current local variables, since the params may have been
re-bound since the call was made.
(Unless you can guarantee that locals() is immediately called before any
new local variables were created, i.e. on entry to the function, before
any other code can run. As you point out further below.)
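
As a minimal sketch of the two points above (the names func, at_entry, total and later are only illustrative), copying locals() on entry captures just the parameters, while a later call also picks up other locals and any re-bound values:

```python
def func(a, b=2):
    # Copy on entry: at this point only the parameters are bound.
    at_entry = dict(locals())
    a = a + 100          # re-bind a parameter
    total = a + b        # create a local that is not a parameter
    # A later snapshot sees the re-bound value and the extra locals.
    later = dict(locals())
    return at_entry, later

at_entry, later = func(1)
print(at_entry)          # the original arguments
print(sorted(later))     # parameters plus the other locals
```

The dict() copy is deliberate: whether locals() itself hands back a fresh dict on every call has differed between interpreter versions, so the sketch does not rely on it.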
There's a similarity only in the sense that parameters of a function are
included as local variables, but the semantics of __params__ as proposed
and locals() are quite different. They might even share some parts of
implementation, but I don't think that really matters one way or
another. Whether they do or don't is a mere implementation detail.
> Experts / pedants
> would prefer not to use the name __params__ for this purpose.
I consider myself a pedant (and on a good day I might pass as something
close to an expert on some limited parts of Python) and I don't have any
objection to the *name* __params__. From the perspective of *inside* a
function, it is a matter of personal taste whether you refer to
parameter or argument:

    def func(a):  # in the declaration, "a" is a parameter
        # inside the running function, once "a" has a value set,
        # it's a matter of taste whether you call it a parameter
        # or an argument or both; I suppose it depends on whether
        # you are referring to the *variable* or its *value*
        pass

    # but here 1 is the argument bound to the parameter "a"
    result = func(1)

It is the semantics that I think are problematic, not the choice of
name.
> Steve D'Aprano wrote:
> > Its also going to suffer from race conditions, unless someone much
> > cleverer than me can think of a way to avoid them which doesn't slow
> > down function calls even more.
>
> As far as I know, locals() does not suffer from a race condition. But
> it's not a local variable. Rather, it's a function that returns a
> dict. Hence avoiding the race condition.
Indeed. Each time you call locals(), it returns a new dict with a
snapshot of the current local namespace. Because it all happens inside
the same function call, no external thread can poke inside your current
call to mess with your local variables.
But that's different from setting function.__params__ to the passed-in
arguments. By definition, each external caller is passing in its own set
of arguments. Suppose you have three calls to the function:

    function(a=1, b=2)  # called by A
    function(a=5, b=8)  # called by B
    function(a=3, b=4)  # called by C

In single-threaded code, there's no problem here:

    A makes the first call;
    the interpreter sets function.__params__ to A's arguments;
    the function runs with A's arguments and returns;
    only then can B make its call;
    the interpreter sets function.__params__ to B's arguments;
    the function runs with B's arguments and returns;
    only then can C make its call;
    the interpreter sets function.__params__ to C's arguments;
    the function runs with C's arguments and returns

but in multi-threaded code, unless there's some form of locking, the
three sets can interleave in any unpredictable order, e.g.:

    A makes its call;
    B makes its call;
    the interpreter sets function.__params__ to B's arguments;
    the interpreter sets function.__params__ to A's arguments;
    the function runs with B's arguments and returns;
    C makes its call;
    the interpreter sets function.__params__ to C's arguments;
    the function runs with A's arguments and returns;
    the function runs with C's arguments and returns.

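
To make that kind of interleaving concrete, here is a rough sketch, not a model of how the interpreter would really do it. The helper call() stands in for "the interpreter sets the attribute, then runs the function", the attribute is named params rather than __params__, and two Event objects force one particular unlucky ordering so the outcome is repeatable. All of these names are assumptions for illustration only.

```python
import threading

def run_race_demo():
    a_entered = threading.Event()
    b_entered = threading.Event()
    observed = {}

    def function(caller, a, b):
        if caller == "A":
            # A is now inside the function; let B make its call,
            # then wait until B is inside too.
            a_entered.set()
            b_entered.wait(timeout=5)
        else:
            b_entered.set()
        # Read the attribute shared by every caller.
        observed[caller] = dict(function.params)

    def call(caller, a, b):
        # Stand-in for the interpreter: set the attribute, then run.
        function.params = {"a": a, "b": b}
        function(caller, a, b)

    def call_b_after_a():
        a_entered.wait(timeout=5)
        call("B", a=5, b=8)

    thread_a = threading.Thread(target=call, args=("A", 1, 2))
    thread_b = threading.Thread(target=call_b_after_a)
    thread_a.start()
    thread_b.start()
    thread_a.join()
    thread_b.join()
    return observed

print(run_race_demo())
```

With this forced ordering, A was called with a=1, b=2 but reads B's arguments from the shared attribute, because B overwrote it while A was still running.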
We could solve this race condition with locking, or by making the pair
of steps:

    the interpreter sets function.__params__
    the function runs and returns

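a single atomic step. But that serializes every call: once A calls
function(), threads B and C will pause (potentially for a very long
time) waiting for A's call to complete, before they can call the same
function. Worse, if the function calls itself while a simple lock is
held, the thread ends up waiting on itself, which is a deadlock.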
I'm not an expert on threaded code, so it is possible I've missed some
non-obvious fix for this, but I expect not. In general, solving race
conditions without deadlocks is a hard problem.
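
For comparison, a minimal sketch of the locking approach described above, under the same assumptions as before (an illustrative params attribute and a hypothetical locked_call() helper playing the interpreter's role). Every caller now sees its own arguments, but only because the three calls run strictly one at a time:

```python
import threading

call_lock = threading.Lock()

def function(a, b):
    # Reads the attribute shared by every caller.
    return dict(function.params)

def locked_call(results, caller, a, b):
    # "Set the attribute" and "run the function" become one atomic
    # step; other threads block here until the lock is released.
    with call_lock:
        function.params = {"a": a, "b": b}
        results[caller] = function(a, b)

def run_locked_demo():
    results = {}
    calls = [("A", 1, 2), ("B", 5, 8), ("C", 3, 4)]
    threads = [
        threading.Thread(target=locked_call, args=(results, name, a, b))
        for name, a, b in calls
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(run_locked_demo())
```

Note that threading.Lock is not re-entrant, so if function() were to go through locked_call() again from inside its own body, that thread would block forever.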
> Python has some keyword identifiers. Here's one
>
> >>> __debug__ = 1
> SyntaxError: assignment to keyword
>
>
> Notice that this is a SYNTAX error. If __params__ were similarly a
> keyword identifier, then it would avoid the race condition.
The problem isn't that the caller assigns to __params__ manually. At
no stage does Python code need to try setting "__params__ = x"; in fact
that ought to be quite safe because it would only be a local variable.
The race condition problem comes from trying to set function.__params__
on each call, even if it's the interpreter doing the setting.
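
A small sketch of the difference, using only today's Python (the helper assigning_debug_is_rejected() and the function func are illustrative names): assigning to __debug__ is rejected at compile time, whereas assigning to __params__ inside a function just creates an ordinary local and leaves the function object untouched.

```python
def assigning_debug_is_rejected():
    # __debug__ is special-cased by the compiler, so this fails
    # before any code runs.
    try:
        compile("__debug__ = 1", "<example>", "exec")
    except SyntaxError:
        return True
    return False

def func(a):
    # An ordinary assignment: __params__ is just a local name here.
    __params__ = {"a": a}
    return __params__

print(assigning_debug_is_rejected())
print(func(1))
print("__params__" in func.__code__.co_varnames)  # it is a local variable
print(hasattr(func, "__params__"))                # no attribute was set
```

Since each call has its own locals, two threads running func() at once cannot see each other's __params__.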
> It would
> simply be a handle that allows, for example, key-value access to the
> state of the frame on the execution stack. In other words, a
> lower-level object from which locals() could be built.
That wouldn't have the proposed semantics. __params__ is supposed to be
a dict showing the initial values of the arguments passed in to the
function, not merely a reference to the current frame.
[...]
> In my opinion, the technically well-informed would prefer something
> like __args__ or __locals__ instead of __params__, for the current
> purpose.
Oh well, that puts me in my place :-)
I have no objection to __args__, but __locals__ would be very
inappropriate, as locals refers to *all* the local variables, not just
those which are declared as parameters.
(Parameters are a *subset* of locals.)
> Finally, __params__ would simply be the value of __locals__ before any
> assignment has been done.
Indeed.
As Chris (I think it was) pointed out, we could reduce the cost of this
with a bit of compiler magic. A function that never refers to __params__
would run just as it does today:

    def func(a):
        print(a)

might look something like this:

      2           0 LOAD_GLOBAL              0 (print)
                  2 LOAD_FAST                0 (a)
                  4 CALL_FUNCTION            1
                  6 POP_TOP
                  8 LOAD_CONST               0 (None)
                 10 RETURN_VALUE

just as it does now. But if the compiler sees a reference to __params__
in the body, it could compile in special code like this:

    def func(a):
        print(a, __params__)

      2           0 LOAD_GLOBAL              0 (locals)
                  2 CALL_FUNCTION            0
                  4 STORE_FAST               1 (__params__)

      3           6 LOAD_GLOBAL              1 (print)
                  8 LOAD_FAST                0 (a)
                 10 LOAD_FAST                1 (__params__)
                 12 CALL_FUNCTION            2
                 14 POP_TOP
                 16 LOAD_CONST               0 (None)
                 18 RETURN_VALUE

Although more likely we'd want a special op-code to populate
__params__, rather than calling the built-in locals() function.
I don't think that's a bad idea, but it does add more compiler magic,
and I'm not sure that there is sufficient justification for it.
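
The effect of that compiler magic can be approximated by hand today, which may help in judging whether it is worth it. In this sketch the first line of with_params() is written out manually where the compiler would insert it; plain and with_params are illustrative names, and the exact bytecode printed by dis varies between Python versions.

```python
import dis

def plain(a):
    # Never mentions __params__, so it pays no extra cost.
    return a

def with_params(a):
    # Hand-written stand-in for what the compiler might insert.
    __params__ = dict(locals())
    a = a + 1
    return a, __params__

# Only the second function refers to locals at all.
print("locals" in plain.__code__.co_names)
print("locals" in with_params.__code__.co_names)

# __params__ keeps the original argument even after "a" is re-bound.
print(with_params(1))

dis.dis(with_params)
```

Because __params__ lives in the frame's own locals rather than on the function object, each call gets its own copy and the threading problem discussed earlier does not arise.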
--
Steve