[Python-ideas] proto-PEP: Fixing Non-constant Default Arguments
Jan Kanis
jan.kanis at phil.uu.nl
Mon Jan 29 17:03:52 CET 2007
Well, I (obviously) like the idea, but your pep misses some important
points (and some not so important).
The important one is about the scoping of variables in default
expressions. The pep says nothing about them.
If I read the pep correctly, any variables in default expressions are
handled the same as variables in the body of a function. This means the
compiler decides if they should be local, lexical or global. If there is
an assignment to a variable, the compiler makes it a local, else it finds
the right enclosing scope (lexical or global). In the current python, this
works fine:
>>> a = 123
>>> def foo(b=a):
        a = 2
        print a, b
>>> foo()
2 123
>>> a = 42
>>> foo()
2 123
In the pep, the a in the default expression would be handled just like any
other a in the function body, which means it will become a _local_
variable. Calling the function would then result in an UnboundLocalError.
Just like this in current python:
>>> a = 123
>>> def foo(b=None):
        b = a if b==None else b
        a = 2
        print a
>>> foo()
Traceback (most recent call last):
  File "<pyshell#22>", line 1, in <module>
    foo()
  File "<pyshell#21>", line 2, in foo
    b = a if b==None else b
UnboundLocalError: local variable 'a' referenced before assignment
The solution, I think, as I wrote in my previous messages, is to have the
compiler explicitly make variables in default expressions lexical or
global variables.
This would still break the foo() in my example above, because you can't
assign to a lexical variable, or to a global that isn't declared global.
Therefore I think the compiler should distinguish between the a in the
default expression and the a in the function body, and treat them as two
different variables. You can think about it as if the compiler silently
renames one of them, without this rename being visible to python code.
(AFAIK the bytecode is stack based and closure vars are put in some kind of
anonymous cell, which means neither of them actually has a name anyway.)
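
To make the renaming idea concrete, here is a minimal sketch of how it can be
emulated by hand in today's python. It uses Python 3 syntax (print as a
function), unlike the sessions above. The names _MISSING and _default_b are
made up for this illustration and are not part of the pep; the helper only
stands in for what the compiler would do silently.

```python
a = 123

# Hypothetical sentinel meaning "the caller did not pass b".
_MISSING = object()

def _default_b():
    # This 'a' is looked up in the enclosing (here: global) scope at call time.
    return a

def foo(b=_MISSING):
    if b is _MISSING:
        b = _default_b()   # the 'a' of the default expression
    a = 2                  # the body's own local 'a', a separate variable
    return a, b

print(foo())   # (2, 123)
a = 42
print(foo())   # (2, 42)
```

Because the default expression lives in its own helper, the assignment to a
in the body no longer turns the a of the default expression into a local, so
no UnboundLocalError occurs.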
See below for more comments regarding this and other things.
On Sun, 28 Jan 2007 20:22:44 +0100, Chris Rebert <cvrebert at gmail.com>
wrote:
> The following is a proto-PEP based on the discussion in the thread
> "fixing mutable default argument values". Comments would be greatly
> appreciated.
> - Chris Rebert
>
> Title: Fixing Non-constant Default Arguments
>
> Abstract
>
> This PEP proposes new semantics for default arguments to remove
> boilerplate code associated with non-constant default argument values,
> allowing them to be expressed more clearly and succinctly.
>
>
> Motivation
>
> Currently, to write functions using non-constant default arguments,
> one must use the idiom:
>
> def foo(non_const=None):
>     if non_const is None:
>         non_const = some_expr
>     #rest of function
>
> or equivalent code. Naive programmers desiring mutable default arguments
> often make the mistake of writing the following:
>
> def foo(mutable=some_expr_producing_mutable):
>     #rest of function
>
> However, this does not work as intended, as
> 'some_expr_producing_mutable' is evaluated only *once* at
> definition-time, rather than once per call at call-time. This results
> in all calls to 'foo' using the same default value, which can result in
> unintended consequences. This necessitates the previously mentioned
> idiom. This unintuitive behavior is such a frequent stumbling block for
> newbies that it is present in at least 3 lists of Python's problems [0]
> [1] [2].
Also, I just found out that python's own documentation flags this with
an "Important warning: The default value is evaluated only once. This
makes a difference when the default is a mutable object such as a list,
dictionary, or instances of most classes. ..."
(http://docs.python.org/tut/node6.html#SECTION006710000000000000000)
This indicates, imo, that the python doc writers don't consider the
current situation optimal either.
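
A short illustration of what that warning is about, under current semantics
(Python 3 syntax; the function names append_shared and append_fresh are
invented for this sketch):

```python
def append_shared(item, bucket=[]):
    # The list is created once, when 'def' runs, and reused on every call.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # The usual workaround: build the default inside the body.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_shared(1)
second = append_shared(2)
print(second)                             # [1, 2]
print(first is second)                    # True
print(append_fresh(1), append_fresh(2))   # [1] [2]
```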
> There are currently few, if any, known good uses of the current
> behavior of mutable default arguments. The most common one is to
> preserve function state between calls. However, as one of the lists [2]
> comments, this purpose is much better served by decorators, classes, or
> (though less preferred) global variables.
> Therefore, since the current semantics aren't useful for
> non-constant default values and an idiom is necessary to work around
> this deficiency, why not change the semantics so that people can write
> what they mean more directly, without the annoying boilerplate?
>
>
> Rationale
>
> Originally, it was proposed that all default argument values be
> deep-copied from the original (evaluated at definition-time) at each
> invocation of the function where the default value was required.
> However, this doesn't take into account default values that are not
> literals, e.g. function calls, subscripts, attribute accesses. Thus,
> the new idea was to re-evaluate the default arguments at each call where
> they were needed. There was some concern over the possible performance
> hit this could cause, and whether there should be new syntax so that
> code could use the existing semantics for performance reasons. Some of
> the proposed syntaxes were:
>
> def foo(bar=<baz>):
>     #code
>
> def foo(bar=new baz):
>     #code
>
> def foo(bar=fresh baz):
>     #code
>
> def foo(bar=separate baz):
>     #code
>
> def foo(bar=another baz):
>     #code
>
> def foo(bar=unique baz):
>     #code
>
> where the new keyword (or angle brackets) would indicate that the
> parameter's default argument should use the new semantics. Other
> parameters would continue to use the old semantics. It was generally
> agreed that the angle-bracket syntax was particularly ugly, leading to
> the proposal of the other syntaxes. However, having 2 different sets of
> semantics could be confusing and leaving in the old semantics just for
> performance might be premature optimization. Refactorings to deal with
> the possible performance hit are discussed below.
>
>
> Specification
>
> The current semantics for default arguments are replaced by the
> following semantics:
> - Whenever a function is called, and the caller does not provide a
> value for a parameter with a default expression, the parameter's
> default expression shall be evaluated in the function's scope. The
> resulting value shall be assigned to a local variable in the
> function's scope with the same name as the parameter.
Include something saying that any variables in a default expression shall
be lexical variables, with their scope being the first outer scope that
defines a variable with the same name (they should just use the same rules
as other lexical/closure variables), and that if the function body defines
a local variable with the same name as a variable in a default expression,
those variables shall be handled as two separate variables.
> - The default argument expressions shall be evaluated before the
> body of the function.
> - The evaluation of default argument expressions shall proceed in
> the same order as that of the parameter list in the function's
> definition.
> Given these semantics, it makes more sense to refer to default argument
> expressions rather than default argument values, as the expression is
> re-evaluated at each call, rather than just once at definition-time.
> Therefore, we shall do so hereafter.
>
> Demonstrative examples of new semantics:
> #default argument expressions can refer to
> #variables in the enclosing scope...
> CONST = "hi"
> def foo(a=CONST):
>     print a
>
> >>> foo()
> hi
> >>> CONST="bye"
> >>> foo()
> bye
>
> #...or even other arguments
> def ncopies(container, n=len(container)):
>     return [container for i in range(n)]
>
> >>> ncopies([1, 2], 5)
> [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
> >>> ncopies([1, 2, 3])
> [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
> >>> #ncopies grabbed n from [1, 2, 3]'s length (3)
I'm not sure if this can be combined elegantly with what I said about
variables being lexical variables. The first argument to ncopies,
'container', is clearly a local variable to ncopies. The 'container' in
the second arg default expr should, if my comments above are accepted, be
a lexical variable referring to the 'container' in the global scope.
The best way to combine the two features seems to be to let 'container' be
a local var if any of the preceding args is named 'container', and let it
be a lexically scoped variable otherwise. However, I'm not convinced this
complexity is worth it; maybe the vars in default expressions should just
always be lexical vars.
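
For comparison, the behaviour the ncopies example asks for can already be
written with the usual sentinel idiom, where 'container' in the default
computation is unambiguously the local parameter. A small sketch in Python 3
syntax; using None as the sentinel is an assumption of this example and means
None cannot be passed as a real value for n:

```python
def ncopies(container, n=None):
    # 'container' here is the parameter, so there is no scoping question.
    if n is None:
        n = len(container)
    return [container for _ in range(n)]

print(ncopies([1, 2], 5))    # [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
print(ncopies([1, 2, 3]))    # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
```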
>
> #default argument expressions are arbitrary expressions
> def my_sum(lst):
>     cur_sum = lst[0]
>     for i in lst[1:]: cur_sum += i
>     return cur_sum
>
> def bar(b=my_sum((["b"] * (2 * 3))[:4])):
>     print b
>
> >>> bar()
> bbbb
>
> #default argument expressions are re-evaluated at every call...
> from random import randint
> def baz(c=randint(1,3)):
>     print c
>
> >>> baz()
> 2
> >>> baz()
> 3
>
> #...but only when they're required
> def silly():
>     print "spam"
>     return 42
>
> def qux(d=silly()):
>     pass
>
> >>> qux()
> spam
> >>> qux(17)
> >>> qux(d=17)
> >>> qux(*[17])
> >>> qux(**{'d':17})
> >>> #no output because silly() never called because d's value was
> >>> #specified in the calls
>
> #Rule 3
> count = 0
> def next():
>     global count
>     count += 1
>     return count - 1
>
> def frobnicate(g=next(), h=next(), i=next()):
>     print g, h, i
>
> >>> frobnicate()
> 0 1 2
> >>> #g, h, and i's default argument expressions are evaluated in
> >>> #the same order as the parameter definition
>
>
> Backwards Compatibility
>
> This change in semantics breaks all code which uses mutable default
> argument values. Such code can be refactored from:
Wow, let's not scare everyone away just yet. This should read:
"This change in semantics breaks code which uses mutable default argument
expressions and depends on those expressions being evaluated only once, or
code that assigns new incompatible values in a parent scope to variables
used in default expressions"
>
> def foo(bar=mutable):
>     #code
If 'mutable' is just a single variable, this isn't gonna break, unless the
global scope decides to do something like this:
def foo(bar=mutable):
    #code
mutable = incompatible_mutable
# ...
foo()
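
A runnable sketch of that breakage, in Python 3 syntax. The function names
foo_once and foo_each_call and the sentinel _MISSING are hypothetical;
foo_each_call only emulates the proposed call-time lookup by hand:

```python
mutable = [1, 2, 3]

def foo_once(bar=mutable):
    # Current semantics: 'mutable' was looked up when 'def' ran.
    return len(bar)

_MISSING = object()   # hypothetical sentinel, for illustration only

def foo_each_call(bar=_MISSING):
    # Emulates the pep: 'mutable' is looked up again at every call.
    if bar is _MISSING:
        bar = mutable
    return len(bar)

mutable = 5           # an "incompatible" rebinding in the global scope

print(foo_once())     # 3, still uses the original list
try:
    foo_each_call()
except TypeError as exc:
    print("broken:", exc)
```

Code that never rebinds the name in the parent scope behaves the same under
both variants.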
>
> to
>
> def stateify(state):
>     def _wrap(func):
>         def _wrapper(*args, **kwds):
>             kwds['bar'] = state
>             return func(*args, **kwds)
>         return _wrapper
>     return _wrap
>
> @stateify(mutable)
> def foo(bar):
>     #code
>
> or
>
> state = mutable
> def foo(bar=state):
>     #code
>
> or
>
> class Baz(object):
>     def __init__(self):
>         self.state = mutable
>
>     def foo(self, bar=self.state):
>         #code
Minor point: the stateify decorator looks a bit scary to me as it uses
three levels of nested functions (that's inherent to decorators, but
still). I suggest you list the class and global var solutions first and the
decorator last, just to prevent people from stopping reading the pep
and voting '-1' right when they hit the decorator solution.
>
> The changes in this PEP are backwards-compatible with all code whose
> default argument values are immutable
...or don't depend on being evaluated only once, and don't modify in a
parent scope the variables in default expressions in an incompatible way,
(hmm, the 'or' and 'and' may need some disambiguation parentheses...)
> including code using the idiom
> mentioned in the 'Motivation' section. However, such values will now be
> recomputed for each call for which they are required. This may cause
> performance degradation. If such recomputation is significantly
> expensive, the same refactorings mentioned above can be used.
>
> In relation to Python 3.0, this PEP's proposal is compatible with
> those of PEP 3102 [3] and PEP 3107 [4]. Also, this PEP does not depend
> on the acceptance of either of those PEPs.
>
>
> Reference Implementation
>
> All code of the form:
>
> def foo(bar=some_expr, baz=other_expr):
>     #body
>
> Should act as if it had read (in pseudo-Python):
>
> def foo(bar=_undefined, baz=_undefined):
>     if bar is _undefined:
>         bar = some_expr
>     if baz is _undefined:
>         baz = other_expr
>     #body
and, if there are any variables occurring both in the function body and in
some_expr or other_expr, rename those in the function body to something
that doesn't name-clash.
>
> where _undefined is the value given to a parameter when the caller
> didn't specify a value for it. This is not intended to be a literal
> translation, but rather a demonstration as to how Python's internal
> argument-handling machinery should be changed.
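
The translation above can be approximated in today's python with a decorator.
This is only a sketch in Python 3 syntax, not the pep's implementation: the
name call_time_defaults is hypothetical, and each default expression is passed
as a zero-argument callable that is evaluated in the scope where it was
defined (so it cannot refer to other arguments).

```python
import functools
import inspect

def call_time_defaults(**thunks):
    """Hypothetical helper: call a thunk whenever its argument is omitted."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind_partial(*args, **kwargs)
            # Walk the parameters in definition order, so defaults are
            # evaluated in the same order as the parameter list.
            for name in sig.parameters:
                if name in thunks and name not in bound.arguments:
                    bound.arguments[name] = thunks[name]()
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorate

count = 0

def next_count():
    global count
    count += 1
    return count - 1

# The None defaults are placeholders; the decorator supplies the real values.
@call_time_defaults(g=next_count, h=next_count, i=next_count)
def frobnicate(g=None, h=None, i=None):
    return g, h, i

print(frobnicate())     # (0, 1, 2)
print(frobnicate(10))   # (10, 3, 4)
```

The thunks are only called for arguments the caller left out, and in
parameter order, which matches the three rules of the Specification.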
>
>
> References
>
> [0] 10 Python pitfalls
> http://zephyrfalcon.org/labs/python_pitfalls.html
>
> [1] Python Gotchas
> http://www.ferg.org/projects/python_gotchas.html#contents_item_6
>
> [2] When Pythons Attack
> http://www.onlamp.com/pub/a/python/2004/02/05/learn_python.html?page=2
>
> [3] Keyword-Only Arguments
> http://www.python.org/dev/peps/pep-3102/
>
> [4] Function Annotations
> http://www.python.org/dev/peps/pep-3107/
I'm not quite sure if your decision not to include the
default-expr-vars-become-lexical part was intentional or not. If it is,
can you tell me why you'd want that? Else you can incorporate my comments
in the pep.
Another minor point: I'm personally not too fond of too many 'shall's
close together. It makes me think of lots of bureaucracy and
design-by-committee. I think several peps don't have a shall in them, but
maybe it is the right language for this one. What's the right language to
use in peps?
Well, that's about it I think.
- Jan