Well, I (obviously) like the idea, but your pep misses some important points (and some not so important). The important one is about the scoping of variables in default expressions. The pep says nothing about them. If I read the pep correctly, any variables in default expressions are handled the same as variables in the body of a function. This means the compiler decides if they should be local, lexical or global. If there is an assignment to a variable, the compiler makes it a local, else it finds the right enclosing scope (lexical or global). In the current python, this works fine:
    >>> a = 123
    >>> def foo(b=a):
    ...     a = 2
    ...     print a, b
    ...
    >>> foo()
    2 123
    >>> a = 42
    >>> foo()
    2 123
In the pep, the 'a' in the default expression would be handled just like any other 'a' in the function body, which means it will become a _local_ variable. Calling the function would then result in an UnboundLocalError. Just like this in current python:
    >>> a = 123
    >>> def foo(b=None):
    ...     b = a if b == None else b
    ...     a = 2
    ...     print a
    ...
    >>> foo()
    Traceback (most recent call last):
      File "<stdin>", line 2, in foo
    UnboundLocalError: local variable 'a' referenced before assignment
The following is a proto-PEP based on the discussion in the thread "fixing mutable default argument values". Comments would be greatly appreciated. - Chris Rebert
Title: Fixing Non-constant Default Arguments
Abstract
This PEP proposes new semantics for default arguments to remove boilerplate code associated with non-constant default argument values, allowing them to be expressed more clearly and succinctly.
Motivation
Currently, to write functions using non-constant default arguments, one must use the idiom:
    def foo(non_const=None):
        if non_const is None:
            non_const = some_expr
        #rest of function
or equivalent code. Naive programmers desiring mutable default arguments often make the mistake of writing the following:
    def foo(mutable=some_expr_producing_mutable):
        #rest of function
However, this does not work as intended, as 'some_expr_producing_mutable' is evaluated only *once* at definition-time, rather than once per call at call-time. This results in all calls to 'foo' using the same default value, which can result in unintended consequences. This necessitates the previously mentioned idiom. This unintuitive behavior is such a frequent stumbling block for newbies that it is present in at least 3 lists of Python's problems [0] [1] [2].
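The pitfall can be demonstrated in a few lines of current Python (the `append_to` function below is a made-up illustration, not from the discussion):

```python
# The default list is created once, at definition time, so every
# call that omits 'seq' shares the same list object.
def append_to(item, seq=[]):
    seq.append(item)
    return seq

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2] -- the same list again, not a fresh one
```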
Also, I just found out that Python's own documentation refers to this with an "Important warning: The default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes. ..." (http://docs.python.org/tut/node6.html#SECTION006710000000000000000). This indicates, IMO, that the Python doc writers also don't consider the current situation optimal.
There are currently few, if any, known good uses of the current behavior of mutable default arguments. The most common one is to preserve function state between calls. However, as one of the lists [2] comments, this purpose is much better served by decorators, classes, or (though less preferred) global variables. Therefore, since the current semantics aren't useful for non-constant default values and an idiom is necessary to work around this deficiency, why not change the semantics so that people can write what they mean more directly, without the annoying boilerplate?
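As a sketch of that point: state that is sometimes smuggled through a mutable default can instead be kept explicitly on a class instance (the `CallCounter` class below is a hypothetical example, not from the lists cited):

```python
# Preserving state between calls with a class instead of a
# mutable default argument: the state lives on the instance.
class CallCounter(object):
    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count

counter = CallCounter()
print(counter())  # 1
print(counter())  # 2
```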
Rationale
Originally, it was proposed that all default argument values be deep-copied from the original (evaluated at definition-time) at each invocation of the function where the default value was required. However, this doesn't take into account default values that are not literals, e.g. function calls, subscripts, attribute accesses. Thus, the new idea was to re-evaluate the default arguments at each call where they were needed. There was some concern over the possible performance hit this could cause, and whether there should be new syntax so that code could use the existing semantics for performance reasons. Some of the proposed syntaxes were:
def foo(bar=<baz>): #code
def foo(bar=new baz): #code
def foo(bar=fresh baz): #code
def foo(bar=separate baz): #code
def foo(bar=another baz): #code
def foo(bar=unique baz): #code
where the new keyword (or angle brackets) would indicate that the parameter's default argument should use the new semantics. Other parameters would continue to use the old semantics. It was generally agreed that the angle-bracket syntax was particularly ugly, leading to the proposal of the other syntaxes. However, having 2 different sets of semantics could be confusing and leaving in the old semantics just for performance might be premature optimization. Refactorings to deal with the possible performance hit are discussed below.
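For comparison, the proposed re-evaluation semantics can already be approximated in current Python by passing the default as a zero-argument callable and invoking it per call (a sketch only; the `fresh` decorator and `_missing` sentinel are made-up names, and real code would need to handle multiple parameters):

```python
# Approximate call-time default evaluation in today's Python:
# pass a thunk (zero-argument callable) and call it when needed.
_missing = object()  # unique sentinel standing in for "no argument given"

def fresh(default_thunk):
    def decorator(func):
        def wrapper(arg=_missing):
            if arg is _missing:
                arg = default_thunk()  # re-evaluated on every call
            return func(arg)
        return wrapper
    return decorator

@fresh(list)
def append_42(seq):
    seq.append(42)
    return seq

print(append_42())  # [42]
print(append_42())  # [42] -- a fresh list each call
```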
Specification
The current semantics for default arguments are replaced by the following semantics:

- Whenever a function is called, and the caller does not provide a value for a parameter with a default expression, the parameter's default expression shall be evaluated in the function's scope. The resulting value shall be assigned to a local variable in the function's scope with the same name as the parameter.
Include something saying that any variables in a default expression shall be lexical variables, with their scope being the first outer scope that defines a variable with the same name (they should just use the same rules as other lexical/closure variables), and that if the function body defines a local variable with the same name as a variable in a default expression, those variables shall be handled as two separate variables.
- The default argument expressions shall be evaluated before the body of the function.
- The evaluation of default argument expressions shall proceed in the same order as that of the parameter list in the function's definition.

Given these semantics, it makes more sense to refer to default argument expressions rather than default argument values, as the expression is re-evaluated at each call, rather than just once at definition-time. Therefore, we shall do so hereafter.
Demonstrative examples of new semantics:

    #default argument expressions can refer to
    #variables in the enclosing scope...
    CONST = "hi"
    def foo(a=CONST):
        print a

    >>> foo()
    hi
    >>> CONST = "bye"
    >>> foo()
    bye
    #...or even other arguments
    def ncopies(container, n=len(container)):
        return [container for i in range(n)]

    >>> ncopies([1, 2], 5)
    [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
    >>> ncopies([1, 2, 3])
    [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
    >>> #ncopies grabbed n from [1, 2, 3]'s length (3)
I'm not sure if this can be combined elegantly with what I said about variables being lexical variables. The first argument to ncopies, 'container', is clearly a local variable to ncopies. The 'container' in the second arg's default expression should, if my comments above are accepted, be a lexical variable referring to the 'container' in the global scope. The best way to combine the two features seems to be to let 'container' be a local var if any of the preceding args is named 'container', and let it be a lexically scoped variable otherwise. However, I'm not convinced this complexity is worth it; perhaps the vars in default expressions should just always be lexical vars.
    #default argument expressions are arbitrary expressions
    def my_sum(lst):
        cur_sum = lst[0]
        for i in lst[1:]:
            cur_sum += i
        return cur_sum

    def bar(b=my_sum((["b"] * (2 * 3))[:4])):
        print b

    >>> bar()
    bbbb
    #default argument expressions are re-evaluated at every call...
    from random import randint
    def baz(c=randint(1, 3)):
        print c

    >>> baz()
    2
    >>> baz()
    3
    #...but only when they're required
    def silly():
        print "spam"
        return 42

    def qux(d=silly()):
        pass

    >>> qux()
    spam
    >>> qux(17)
    >>> qux(d=17)
    >>> qux(*[17])
    >>> qux(**{'d': 17})
    >>> #no output: silly() is never called when d's value is specified in the call
    #Rule 3
    count = 0
    def next():
        global count
        count += 1
        return count - 1

    def frobnicate(g=next(), h=next(), i=next()):
        print g, h, i

    >>> frobnicate()
    0 1 2
    >>> #g, h, and i's default argument expressions are evaluated in the same order as the parameter definition
Backwards Compatibility
This change in semantics breaks all code which uses mutable default argument values. Such code can be refactored from:
Wow, let's not scare everyone away just yet. This should read: "This change in semantics breaks code which uses mutable default argument expressions and depends on those expressions being evaluated only once, or code that assigns new incompatible values in a parent scope to variables used in default expressions"
    def foo(bar=mutable):
        #code
If 'mutable' is just a single variable, this isn't gonna break, unless the global scope decides to do something like this:

    def foo(bar=mutable):
        #code
    mutable = incompatible_mutable
    # ...
    foo()
to
    def stateify(state):
        def _wrap(func):
            def _wrapper(*args, **kwds):
                kwds['bar'] = state
                return func(*args, **kwds)
            return _wrapper
        return _wrap

    @stateify(mutable)
    def foo(bar):
        #code
or
    state = mutable
    def foo(bar=state):
        #code
or
    class Baz(object):
        def __init__(self):
            self.state = mutable

        def foo(self, bar=self.state):
            #code
Minor point: the stateify decorator looks a bit scary to me as it uses three levels of nested functions. (That's inherent to decorators, but still.) Suggest you present the class and global var solutions first, and the decorator last, just to prevent people from stopping reading the pep and voting '-1' right when they hit the decorator solution.
The changes in this PEP are backwards-compatible with all code whose default argument values are immutable
...or don't depend on being evaluated only once, and don't modify in a parent scope the variables in default expressions in an incompatible way, (hmm, the 'or' and 'and' may need some disambiguation parentheses...)
including code using the idiom mentioned in the 'Motivation' section. However, such values will now be recomputed for each call for which they are required. This may cause performance degradation. If such recomputation is significantly expensive, the same refactorings mentioned above can be used.
In relation to Python 3.0, this PEP's proposal is compatible with those of PEP 3102 [3] and PEP 3107 [4]. Also, this PEP does not depend on the acceptance of either of those PEPs.
Reference Implementation
All code of the form:
    def foo(bar=some_expr, baz=other_expr):
        #body
Should act as if it had read (in pseudo-Python):
    def foo(bar=_undefined, baz=_undefined):
        if bar is _undefined:
            bar = some_expr
        if baz is _undefined:
            baz = other_expr
        #body
and, if there are any variables occurring both in the function body and in some_expr or other_expr, rename those in the function body to something that doesn't name-clash.
where _undefined is the value given to a parameter when the caller didn't specify a value for it. This is not intended to be a literal translation, but rather a demonstration as to how Python's internal argument-handling machinery should be changed.
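The translation can be exercised by hand in today's Python; a runnable sketch, where `_undefined` plays the sentinel role from the pseudo-code above and the concrete expressions merely stand in for some_expr and other_expr:

```python
_undefined = object()  # unique sentinel; never equal to any real value

def foo(bar=_undefined, baz=_undefined):
    if bar is _undefined:
        bar = []        # stands in for some_expr, evaluated per call
    if baz is _undefined:
        baz = len(bar)  # stands in for other_expr; may refer to bar
    return bar, baz

print(foo())        # ([], 0)
print(foo([1, 2]))  # ([1, 2], 2)
print(foo(baz=9))   # ([], 9)
```

Note that `is` comparison against a unique sentinel object is what makes this safe even when the caller legitimately passes None as a value.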
References
[0] 10 Python pitfalls http://zephyrfalcon.org/labs/python_pitfalls.html
[1] Python Gotchas http://www.ferg.org/projects/python_gotchas.html#contents_item_6
[2] When Pythons Attack http://www.onlamp.com/pub/a/python/2004/02/05/learn_python.html?page=2
[3] Keyword-Only Arguments http://www.python.org/dev/peps/pep-3102/
[4] Function Annotations http://www.python.org/dev/peps/pep-3107/
I'm not quite sure if your decision not to include the default-expr-vars-become-lexical part was intentional or not. If it was intentional, can you tell me why you'd want that? Else you can incorporate my comments in the pep.

Another minor point: I'm personally not too fond of too many 'shall's close together. It makes me think of lots of bureaucracy and design-by-committee. I think several peps don't have a shall in them, but maybe it is the right language for this one. What's the right language to use in peps?

Well, that's about it I think.

- Jan