[Python-ideas] proto-PEP: Fixing Non-constant Default Arguments
Chris Rebert
cvrebert at gmail.com
Tue Jan 30 08:09:37 CET 2007
Wow, that's a lot to think about.
Yes, the exact nature of the variables in default arguments does need
clarification. I'll probably go with something like your proposal, but I
need to consider a few things, particularly the 'ncopies' situation.
I added a reference to the Python documentation you mentioned. Thanks!
>> def foo(bar=mutable):
>>     #code
>
> if 'mutable' is just a single variable, this isn't gonna break, unless
> the global scope decides to do something like this:
>
> def foo(bar=mutable):
>     #code
>
> mutable = incompatible_mutable
> # ...
> foo()
Actually, I was considering the case where 'mutable' is a literal (e.g. a
list), in which case it *will* break, since the expression 'mutable' will
be re-evaluated at every call, so modifying the resulting object won't
have the same effect on future calls that it used to.
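To make the breakage concrete, here's the classic accumulator pattern that relies on today's evaluate-once behavior (names are mine, purely for illustration); under the proposed re-evaluation semantics the second call would return [2], not [1, 2]:

```python
# Current semantics: the [] default is evaluated once, at definition
# time, so every call that omits 'acc' shares the same list object.
def accumulate(x, acc=[]):
    acc.append(x)
    return acc

assert accumulate(1) == [1]
assert accumulate(2) == [1, 2]   # state persists across calls today
# Under the proposal, [] would be re-evaluated per call, so the
# accumulated state would silently vanish -- that's the breakage.
```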
I included your rephrasing under "Backwards Compatibility".
I shuffled around the refactorings as you suggested.
My decision not to include the default-expr-vars-become-lexical part was
completely unintentional, as you can guess from above.
I also reworded the sentences using 'shall'. I don't really get why that
matters, but what the heck.
Thanks for your useful comments and suggestions.
- Chris Rebert
Jan Kanis wrote:
> Well, I (obviously) like the idea, but your pep misses some important
> points (and some not so important).
>
> The important one is about the scoping of variables in default
> expressions. The pep says nothing about them.
> If I read the pep correctly, any variables in default expressions are
> handled the same as variables in the body of a function. This means the
> compiler decides if they should be local, lexical or global. If there is
> an assignment to a variable, the compiler makes it a local, else it
> finds the right enclosing scope (lexical or global). In the current
> python, this works fine:
>
> >>> a = 123
> >>> def foo(b=a):
>         a = 2
>         print a, b
>
> >>> foo()
> 2 123
> >>> a = 42
> >>> foo()
> 2 123
>
> In the pep, the a in the default expression would be handled just like
> any other a in the function body, which means it will become a _local_
> variable. Calling the function would then result in an
> UnboundLocalError. Just like this in current Python:
>
> >>> a = 123
> >>> def foo(b=None):
>         b = a if b==None else b
>         a = 2
>         print a
>
> >>> foo()
>
> Traceback (most recent call last):
> File "<pyshell#22>", line 1, in <module>
> foo()
> File "<pyshell#21>", line 2, in foo
> b = a if b==None else b
> UnboundLocalError: local variable 'a' referenced before assignment
>
> The solution, I think, as I wrote in my previous messages, is to have
> the compiler explicitly make variables in default expressions lexical or
> global variables.
> This would still break the foo() in my example above, because you can't
> assign to a lexical variable, or to a global that isn't declared global.
> Therefore I think the compiler should distinguish between the a in the
> default expression and the a in the function body, and treat them as two
> different variables. You can think of it as if the compiler silently
> renames one of them (without this rename being visible to Python code;
> AFAIK the bytecode is stack-based and closure vars are put in some kind
> of anonymous cell, which means neither of them actually has a name
> anyway.)
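Jan's two-variables rule can be emulated in today's Python with the usual sentinel idiom; the `_missing` sentinel and the `a_body` rename below are hypothetical, standing in for the compiler's invisible work:

```python
# Emulation (in current Python) of treating the default expression's
# 'a' and the body's 'a' as two separate variables.
a = 123
_missing = object()   # hypothetical sentinel

def foo(b=_missing):
    if b is _missing:
        b = a        # the default expression's 'a': resolved in the outer scope
    a_body = 2       # the body's 'a', silently renamed to avoid the clash
    return a_body, b

assert foo() == (2, 123)
a = 42
assert foo() == (2, 42)  # default re-evaluated per call, unlike today
```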
>
> see below for more comments regarding this and other things
>
>
> On Sun, 28 Jan 2007 20:22:44 +0100, Chris Rebert <cvrebert at gmail.com>
> wrote:
>
>> The following is a proto-PEP based on the discussion in the thread
>> "fixing mutable default argument values". Comments would be greatly
>> appreciated.
>> - Chris Rebert
>>
>> Title: Fixing Non-constant Default Arguments
>>
>> Abstract
>>
>> This PEP proposes new semantics for default arguments to remove
>> boilerplate code associated with non-constant default argument values,
>> allowing them to be expressed more clearly and succinctly.
>>
>>
>> Motivation
>>
>> Currently, to write functions using non-constant default arguments,
>> one must use the idiom:
>>
>> def foo(non_const=None):
>>     if non_const is None:
>>         non_const = some_expr
>>     #rest of function
>>
>> or equivalent code. Naive programmers desiring mutable default arguments
>> often make the mistake of writing the following:
>>
>> def foo(mutable=some_expr_producing_mutable):
>>     #rest of function
>>
>> However, this does not work as intended, as
>> 'some_expr_producing_mutable' is evaluated only *once* at
>> definition-time, rather than once per call at call-time. This results
>> in all calls to 'foo' using the same default value, which can result in
>> unintended consequences. This necessitates the previously mentioned
>> idiom. This unintuitive behavior is such a frequent stumbling block for
>> newbies that it is present in at least 3 lists of Python's problems [0]
>> [1] [2].
>
> Also, I just found out that python's own documentation refers to this
> with an "Important warning: The default value is evaluated only once.
> This makes a difference when the default is a mutable object such as a
> list, dictionary, or instances of most classes. ..."
> (http://docs.python.org/tut/node6.html#SECTION006710000000000000000)
> This indicates, IMO, that the Python doc writers don't consider the
> current situation optimal either.
>
>> There are currently few, if any, known good uses of the current
>> behavior of mutable default arguments. The most common one is to
>> preserve function state between calls. However, as one of the lists [2]
>> comments, this purpose is much better served by decorators, classes, or
>> (though less preferred) global variables.
>> Therefore, since the current semantics aren't useful for
>> non-constant default values and an idiom is necessary to work around
>> this deficiency, why not change the semantics so that people can write
>> what they mean more directly, without the annoying boilerplate?
>>
>>
>> Rationale
>>
>> Originally, it was proposed that all default argument values be
>> deep-copied from the original (evaluated at definition-time) at each
>> invocation of the function where the default value was required.
>> However, this doesn't take into account default values that are not
>> literals, e.g. function calls, subscripts, attribute accesses. Thus,
>> the new idea was to re-evaluate the default arguments at each call where
>> they were needed. There was some concern over the possible performance
>> hit this could cause, and whether there should be new syntax so that
>> code could use the existing semantics for performance reasons. Some of
>> the proposed syntaxes were:
>>
>> def foo(bar=<baz>):
>>     #code
>>
>> def foo(bar=new baz):
>>     #code
>>
>> def foo(bar=fresh baz):
>>     #code
>>
>> def foo(bar=separate baz):
>>     #code
>>
>> def foo(bar=another baz):
>>     #code
>>
>> def foo(bar=unique baz):
>>     #code
>>
>> where the new keyword (or angle brackets) would indicate that the
>> parameter's default argument should use the new semantics. Other
>> parameters would continue to use the old semantics. It was generally
>> agreed that the angle-bracket syntax was particularly ugly, leading to
>> the proposal of the other syntaxes. However, having 2 different sets of
>> semantics could be confusing and leaving in the old semantics just for
>> performance might be premature optimization. Refactorings to deal with
>> the possible performance hit are discussed below.
>>
>>
>> Specification
>>
>> The current semantics for default arguments are replaced by the
>> following semantics:
>> - Whenever a function is called, and the caller does not provide a
>> value for a parameter with a default expression, the parameter's
>> default expression shall be evaluated in the function's scope. The
>> resulting value shall be assigned to a local variable in the
>> function's scope with the same name as the parameter.
>
> Include something saying that any variables in a default expression
> shall be lexical variables, with their scope being the first outer scope
> that defines a variable with the same name (they should just use the
> same rules as other lexical/closure variables), and that if the function
> body defines a local variable with the same name as a variable in a
> default expression, those variables shall be handled as two separate
> variables.
>
>> - The default argument expressions shall be evaluated before the
>> body of the function.
>> - The evaluation of default argument expressions shall proceed in
>> the same order as that of the parameter list in the function's
>> definition.
>> Given these semantics, it makes more sense to refer to default argument
>> expressions rather than default argument values, as the expression is
>> re-evaluated at each call, rather than just once at definition-time.
>> Therefore, we shall do so hereafter.
>>
>> Demonstrative examples of new semantics:
>> #default argument expressions can refer to
>> #variables in the enclosing scope...
>> CONST = "hi"
>> def foo(a=CONST):
>>     print a
>>
>> >>> foo()
>> hi
>> >>> CONST="bye"
>> >>> foo()
>> bye
>>
>> #...or even other arguments
>> def ncopies(container, n=len(container)):
>>     return [container for i in range(n)]
>>
>> >>> ncopies([1, 2], 5)
>> [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
>> >>> ncopies([1, 2, 3])
>> [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
>> >>> #ncopies grabbed n from [1, 2, 3]'s length (3)
>
> I'm not sure if this can be combined elegantly with what I said about
> variables being lexical variables. The first argument to ncopies,
> 'container', is clearly a local variable to ncopies. The 'container' in
> the second arg default expr should, if my comments above are accepted,
> be a lexical variable referring to the 'container' in the global scope.
> The best way to combine the two features seems to be to let 'container'
> be a local var if any of the preceding args is named 'container', and
> let it be a lexically scoped variable otherwise. However, I'm not
> convinced this complexity is worth it; perhaps the vars in default
> expressions should just always be lexical vars.
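Under the "preceding parameter wins" rule Jan sketches, ncopies would behave like this sentinel-based expansion (a rough emulation in current Python; `_missing` is a made-up name):

```python
# Rough emulation of "a preceding arg named 'container' shadows the
# lexical one": the default expression's 'container' resolves to the
# first parameter, so n defaults to that argument's length.
_missing = object()   # hypothetical sentinel

def ncopies(container, n=_missing):
    if n is _missing:
        n = len(container)   # 'container' here is the first parameter
    return [container for _ in range(n)]

assert ncopies([1, 2], 5) == [[1, 2]] * 5
assert ncopies([1, 2, 3]) == [[1, 2, 3]] * 3   # n grabbed from len(container)
```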
>
>>
>> #default argument expressions are arbitrary expressions
>> def my_sum(lst):
>>     cur_sum = lst[0]
>>     for i in lst[1:]: cur_sum += i
>>     return cur_sum
>>
>> def bar(b=my_sum((["b"] * (2 * 3))[:4])):
>>     print b
>>
>> >>> bar()
>> bbbb
>>
>> #default argument expressions are re-evaluated at every call...
>> from random import randint
>> def baz(c=randint(1,3)):
>>     print c
>>
>> >>> baz()
>> 2
>> >>> baz()
>> 3
>>
>> #...but only when they're required
>> def silly():
>>     print "spam"
>>     return 42
>>
>> def qux(d=silly()):
>>     pass
>>
>> >>> qux()
>> spam
>> >>> qux(17)
>> >>> qux(d=17)
>> >>> qux(*[17])
>> >>> qux(**{'d':17})
>> >>> #no output because silly() is never called; d's value was
>> >>> #specified in each of the calls
>>
>> #Rule 3
>> count = 0
>> def next():
>>     global count
>>     count += 1
>>     return count - 1
>>
>> def frobnicate(g=next(), h=next(), i=next()):
>>     print g, h, i
>>
>> >>> frobnicate()
>> 0 1 2
>> >>> #g, h, and i's default argument expressions are evaluated in
>> the same order as the parameter definition
>>
>>
>> Backwards Compatibility
>>
>> This change in semantics breaks all code which uses mutable default
>> argument values. Such code can be refactored from:
>
> Wow, let's not scare everyone away just yet. This should read:
> "This change in semantics breaks code which uses mutable default
> argument expressions and depends on those expressions being evaluated
> only once, or code that assigns new incompatible values in a parent
> scope to variables used in default expressions"
>
>>
>> def foo(bar=mutable):
>>     #code
>
> if 'mutable' is just a single variable, this isn't gonna break, unless
> the global scope decides to do something like this:
>
> def foo(bar=mutable):
>     #code
>
> mutable = incompatible_mutable
> # ...
> foo()
>
>>
>> to
>>
>> def stateify(state):
>>     def _wrap(func):
>>         def _wrapper(*args, **kwds):
>>             kwds['bar'] = state
>>             return func(*args, **kwds)
>>         return _wrapper
>>     return _wrap
>>
>> @stateify(mutable)
>> def foo(bar):
>>     #code
>>
>> or
>>
>> state = mutable
>> def foo(bar=state):
>>     #code
>>
>> or
>>
>> class Baz(object):
>>     def __init__(self):
>>         self.state = mutable
>>
>>     def foo(self, bar=self.state):
>>         #code
>
> Minor point: the stateify decorator looks a bit scary to me as it uses
> three levels of nested functions. (that's inherent to decorators, but
> still.) Suggest you name the class and global var solutions first, and
> the decorator as last, just to prevent people from stopping reading the
> pep and voting '-1' right when they hit the decorator solution.
>
>>
>> The changes in this PEP are backwards-compatible with all code whose
>> default argument values are immutable
>
> ...or don't depend on being evaluated only once, and don't modify in a
> parent scope the variables in default expressions in an incompatible way,
>
> (hmm, the 'or' and 'and' may need some disambiguation parentheses...)
>
>> including code using the idiom
>> mentioned in the 'Motivation' section. However, such values will now be
>> recomputed for each call for which they are required. This may cause
>> performance degradation. If such recomputation is significantly
>> expensive, the same refactorings mentioned above can be used.
>>
>> In relation to Python 3.0, this PEP's proposal is compatible with
>> those of PEP 3102 [3] and PEP 3107 [4]. Also, this PEP does not depend
>> on the acceptance of either of those PEPs.
>>
>>
>> Reference Implementation
>>
>> All code of the form:
>>
>> def foo(bar=some_expr, baz=other_expr):
>>     #body
>>
>> Should act as if it had read (in pseudo-Python):
>>
>> def foo(bar=_undefined, baz=_undefined):
>>     if bar is _undefined:
>>         bar = some_expr
>>     if baz is _undefined:
>>         baz = other_expr
>>     #body
>
> and, if there are any variables occurring both in the function body and
> in some_expr or other_expr, rename those in the function body to
> something that doesn't name-clash.
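Applied to the PEP's own bar() example from earlier, the expansion would look roughly like this in current Python (a sketch, not the actual implementation; `_undefined` is the PEP's sentinel name):

```python
# Sketch of the proposed expansion for:  def bar(b=my_sum(...)): ...
# '_undefined' is a unique sentinel so that any caller-supplied value,
# including None, suppresses re-evaluation of the default expression.
_undefined = object()

def my_sum(lst):
    total = lst[0]
    for i in lst[1:]:
        total += i
    return total

def bar(b=_undefined):
    if b is _undefined:
        b = my_sum((["b"] * (2 * 3))[:4])   # re-evaluated on each default use
    return b

assert bar() == "bbbb"
assert bar("x") == "x"   # default expression skipped when b is supplied
```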
>
>>
>> where _undefined is the value given to a parameter when the caller
>> didn't specify a value for it. This is not intended to be a literal
>> translation, but rather a demonstration as to how Python's internal
>> argument-handling machinery should be changed.
>>
>>
>> References
>>
>> [0] 10 Python pitfalls
>> http://zephyrfalcon.org/labs/python_pitfalls.html
>>
>> [1] Python Gotchas
>> http://www.ferg.org/projects/python_gotchas.html#contents_item_6
>>
>> [2] When Pythons Attack
>>
>> http://www.onlamp.com/pub/a/python/2004/02/05/learn_python.html?page=2
>>
>> [3] Keyword-Only Arguments
>> http://www.python.org/dev/peps/pep-3102/
>>
>> [4] Function Annotations
>> http://www.python.org/dev/peps/pep-3107/
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
> I'm not quite sure if your decision not to include the
> default-expr-vars-become-lexical part was intentional or not. If it is,
> can you tell me why you'd want that? Else you can incorporate my
> comments in the pep.
> Another minor point: I'm personally not too fond of too many 'shall's
> close together. It makes me think of lots of bureaucracy and
> design-by-committee. I think several peps don't have a shall in them,
> but maybe it is the right language for this one. What's the right
> language to use in peps?
>
> Well, that's about it I think.
>
> - Jan
>