[Python-ideas] proto-PEP: Fixing Non-constant Default Arguments

Chris Rebert cvrebert at gmail.com
Tue Jan 30 08:09:37 CET 2007


Wow, that's a lot to think about.
Yes, the exact nature of the variables in default arguments does need 
clarification. I'll probably go with something like your proposal, but I 
need to consider a few things, particularly the 'ncopies' situation.
I added a reference to the Python documentation you mentioned. Thanks!

 >>      def foo(bar=mutable):
 >>          #code
 >
 > if 'mutable' is just a single variable, this isn't gonna break, unless
 > the global scope decides to do something like this:
 >
 > def foo(bar=mutable):
 >     #code
 >
 > mutable = incompatible_mutable
 > # ...
 > foo()

Actually, I was considering the case where 'mutable' is a literal (e.g. a 
list literal), in which case it *will* break, since the expression will be 
re-evaluated at every call, so modifying the resulting object won't have 
the same effect on future calls that it used to.
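Concretely, here's the kind of code that would break (runnable today; the 
re-evaluation behavior described in the comments is the proposal, not 
current Python):

```python
# Works under *current* semantics: the default list is created once at
# definition time, so every call that omits 'seen' mutates that one list.
def remember(x, seen=[]):
    seen.append(x)
    return seen

first = remember(1)    # [1]
second = remember(2)   # [1, 2] -- the same list, carried across calls
# Under the proposed re-evaluation semantics, each defaulted call would
# get a fresh [], so remember(2) would return just [2] instead.
```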

I included your rephrasing under "Backwards Compatibility".
I shuffled around the refactorings as you suggested.
The omission of the default-expr-vars-become-lexical part was completely 
unintentional, as you can guess from the above.
I also reworded the sentences using 'shall'. I don't really get why that 
matters, but what the heck.

Thanks for your useful comments and suggestions.

- Chris Rebert

Jan Kanis wrote:
> Well, I (obviously) like the idea, but your pep misses some important 
> points (and some not so important).
> 
> The important one is about the scoping of variables in default 
> expressions. The pep says nothing about them.
> If I read the pep correctly, any variables in default expressions are 
> handled the same as variables in the body of a function. This means the 
> compiler decides if they should be local, lexical or global. If there is 
> an assignment to a variable, the compiler makes it a local, else it 
> finds the right enclosing scope (lexical or global). In the current 
> python, this works fine:
> 
>  >>> a = 123
>  >>> def foo(b=a):
>           a = 2
>           print a, b
> 
>  >>> foo()
>  2 123
>  >>> a = 42
>  >>> foo()
>  2 123
> 
> In the pep, the a in the default expression would be handled just like 
> any other a in the function body, which means it will become a _local_ 
> variable. Calling the function would then result in an 
> UnboundLocalError. Just like this in current python:
> 
>  >>> a = 123
>  >>> def foo(b=None):
>           b = a if b==None else b
>           a = 2
>           print a
> 
>  >>> foo()
> 
>  Traceback (most recent call last):
>    File "<pyshell#22>", line 1, in <module>
>      foo()
>    File "<pyshell#21>", line 2, in foo
>      b = a if b==None else b
>  UnboundLocalError: local variable 'a' referenced before assignment
> 
> The solution, I think, as I wrote in my previous messages, is to have 
> the compiler explicitly make variables in default expressions lexical or 
> global variables.
> This would still break the foo() in my example above, because you can't 
> assign to a lexical variable, or to a global that isn't declared global. 
> Therefore I think the compiler should distinguish between the a in the 
> default expression and the a in the function body, and treat them as two 
> different variables. You can think about it as if the compiler silently 
> renames one of them (without this rename being visible to Python code; 
> AFAIK the bytecode is stack-based and closure vars are put in some kind 
> of anonymous cell, so neither actually has a name anyway).
> 
> see below for more comments regarding this and other things
> 
> 
> On Sun, 28 Jan 2007 20:22:44 +0100, Chris Rebert <cvrebert at gmail.com> 
> wrote:
> 
>> The following is a proto-PEP based on the discussion in the thread
>> "fixing mutable default argument values". Comments would be greatly
>> appreciated.
>> - Chris Rebert
>>
>> Title: Fixing Non-constant Default Arguments
>>
>> Abstract
>>
>>      This PEP proposes new semantics for default arguments to remove
>> boilerplate code associated with non-constant default argument values,
>> allowing them to be expressed more clearly and succinctly.
>>
>>
>> Motivation
>>
>>      Currently, to write functions using non-constant default arguments,
>> one must use the idiom:
>>
>>      def foo(non_const=None):
>>          if non_const is None:
>>              non_const = some_expr
>>          #rest of function
>>
>> or equivalent code. Naive programmers desiring mutable default arguments
>> often make the mistake of writing the following:
>>
>>      def foo(mutable=some_expr_producing_mutable):
>>          #rest of function
>>
>> However, this does not work as intended, as
>> 'some_expr_producing_mutable' is evaluated only *once* at
>> definition-time, rather than once per call at call-time.  This results
>> in all calls to 'foo' using the same default value, which can result in
>> unintended consequences.  This necessitates the previously mentioned
>> idiom. This unintuitive behavior is such a frequent stumbling block for
>> newbies that it is present in at least 3 lists of Python's problems [0]
>> [1] [2].
> 
> Also, I just found out that python's own documentation refers to this 
> with an "Important warning: The default value is evaluated only once. 
> This makes a difference when the default is a mutable object such as a 
> list, dictionary, or instances of most classes. ..."
> (http://docs.python.org/tut/node6.html#SECTION006710000000000000000)
> this suggests, imo, that the Python doc writers themselves don't 
> consider the current situation optimal.
> 
>>      There are currently few, if any, known good uses of the current
>> behavior of mutable default arguments.  The most common one is to
>> preserve function state between calls.  However, as one of the lists [2]
>> comments, this purpose is much better served by decorators, classes, or
>> (though less preferred) global variables.
>>      Therefore, since the current semantics aren't useful for
>> non-constant default values and an idiom is necessary to work around
>> this deficiency, why not change the semantics so that people can write
>> what they mean more directly, without the annoying boilerplate?
>>
>>
>> Rationale
>>
>>      Originally, it was proposed that all default argument values be
>> deep-copied from the original (evaluated at definition-time) at each
>> invocation of the function where the default value was required.
>> However, this doesn't take into account default values that are not
>> literals, e.g. function calls, subscripts, attribute accesses.  Thus,
>> the new idea was to re-evaluate the default arguments at each call where
>> they were needed. There was some concern over the possible performance
>> hit this could cause, and whether there should be new syntax so that
>> code could use the existing semantics for performance reasons.  Some of
>> the proposed syntaxes were:
>>
>>      def foo(bar=<baz>):
>>          #code
>>
>>      def foo(bar=new baz):
>>          #code
>>
>>      def foo(bar=fresh baz):
>>          #code
>>
>>      def foo(bar=separate baz):
>>          #code
>>
>>      def foo(bar=another baz):
>>          #code
>>
>>      def foo(bar=unique baz):
>>          #code
>>
>> where the new keyword (or angle brackets) would indicate that the
>> parameter's default argument should use the new semantics.  Other
>> parameters would continue to use the old semantics. It was generally
>> agreed that the angle-bracket syntax was particularly ugly, leading to
>> the proposal of the other syntaxes. However, having 2 different sets of
>> semantics could be confusing and leaving in the old semantics just for
>> performance might be premature optimization. Refactorings to deal with
>> the possible performance hit are discussed below.
>>
>>
>> Specification
>>
>>      The current semantics for default arguments are replaced by the
>> following semantics:
>>      - Whenever a function is called, and the caller does not provide a
>>      value for a parameter with a default expression, the parameter's
>>      default expression shall be evaluated in the function's scope.  The
>>      resulting value shall be assigned to a local variable in the
>>      function's scope with the same name as the parameter.
> 
> Include something saying that any variables in a default expression 
> shall be lexical variables, with their scope being the first outer scope 
> that defines a variable with the same name (they should just use the 
> same rules as other lexical/closure variables), and that if the function 
> body defines a local variable with the same name as a variable in a 
> default expression, those variables shall be handled as two separate 
> variables.
> 
>>      - The default argument expressions shall be evaluated before the
>>      body of the function.
>>      - The evaluation of default argument expressions shall proceed in
>>      the same order as that of the parameter list in the function's
>>      definition.
>> Given these semantics, it makes more sense to refer to default argument
>> expressions rather than default argument values, as the expression is
>> re-evaluated at each call, rather than just once at definition-time.
>> Therefore, we shall do so hereafter.
>>
>> Demonstrative examples of new semantics:
>>      #default argument expressions can refer to
>>      #variables in the enclosing scope...
>>      CONST = "hi"
>>      def foo(a=CONST):
>>          print a
>>
>>      >>> foo()
>>      hi
>>      >>> CONST="bye"
>>      >>> foo()
>>      bye
>>
>>      #...or even other arguments
>>      def ncopies(container, n=len(container)):
>>          return [container for i in range(n)]
>>
>>      >>> ncopies([1, 2], 5)
>>      [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
>>      >>> ncopies([1, 2, 3])
>>      [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
>>      >>> #ncopies grabbed n from [1, 2, 3]'s length (3)
> 
> I'm not sure if this can be combined elegantly with what I said about 
> variables being lexical variables. The first argument to ncopies, 
> 'container', is clearly a local variable to ncopies. The 'container' in 
> the second arg default expr should, if my comments above are accepted, 
> be a lexical variable referring to the 'container' in the global scope.
> The best way to combine the two features seems to be to let 'container' 
> be a local var if any of the preceding args is named 'container', and 
> let it be a lexically scoped variable otherwise. However, I'm not 
> convinced this complexity is worth it; maybe the vars in default 
> expressions should just always be lexical vars.
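The two readings can be emulated in current Python like so (a hypothetical 
sketch; globals() stands in for "the first enclosing scope defining the 
name"):

```python
container = "ab"    # enclosing-scope variable sharing the parameter's name

# Reading 1 -- preceding parameter wins: the 'container' in the default
# expression refers to the first argument.
def ncopies_param(container, n=None):
    if n is None:
        n = len(container)                  # the parameter
    return [container for _ in range(n)]

# Reading 2 -- always lexical: the 'container' in the default expression
# refers to the enclosing-scope variable, even though a parameter
# shadows it inside the body.
def ncopies_lexical(container, n=None):
    if n is None:
        n = len(globals()['container'])     # the outer 'container'
    return [container for _ in range(n)]
```

With the argument [1, 2, 3], the first reading yields three copies, the 
second only len("ab") = 2 — which shows how different the two behaviors are.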
> 
>>
>>      #default argument expressions are arbitrary expressions
>>      def my_sum(lst):
>>          cur_sum = lst[0]
>>          for i in lst[1:]: cur_sum += i
>>          return cur_sum
>>
>>      def bar(b=my_sum((["b"] * (2 * 3))[:4])):
>>          print b
>>
>>      >>> bar()
>>      bbbb
>>
>>      #default argument expressions are re-evaluated at every call...
>>      from random import randint
>>      def baz(c=randint(1,3)):
>>          print c
>>
>>      >>> baz()
>>      2
>>      >>> baz()
>>      3
>>
>>      #...but only when they're required
>>      def silly():
>>          print "spam"
>>          return 42
>>
>>      def qux(d=silly()):
>>          pass
>>
>>      >>> qux()
>>      spam
>>      >>> qux(17)
>>      >>> qux(d=17)
>>      >>> qux(*[17])
>>      >>> qux(**{'d':17})
>>      >>> #no output because silly() never called because d's value was
>> specified in the calls
>>
>>      #Rule 3
>>      count = 0
>>      def next():
>>          global count
>>          count += 1
>>          return count - 1
>>
>>      def frobnicate(g=next(), h=next(), i=next()):
>>          print g, h, i
>>
>>      >>> frobnicate()
>>      0 1 2
>>      >>> #g, h, and i's default argument expressions are evaluated in
>> the same order as the parameter definition
>>
>>
>> Backwards Compatibility
>>
>>      This change in semantics breaks all code which uses mutable default
>> argument values. Such code can be refactored from:
> 
> Wow, let's not scare everyone away just yet. This should read:
> "This change in semantics breaks code which uses mutable default 
> argument expressions and depends on those expressions being evaluated 
> only once, or code that assigns new incompatible values in a parent 
> scope to variables used in default expressions"
> 
>>
>>      def foo(bar=mutable):
>>          #code
> 
> if 'mutable' is just a single variable, this isn't gonna break, unless 
> the global scope decides to do something like this:
> 
> def foo(bar=mutable):
>     #code
> 
> mutable = incompatible_mutable
> # ...
> foo()
> 
>>
>> to
>>
>>      def stateify(state):
>>          def _wrap(func):
>>              def _wrapper(*args, **kwds):
>>                  kwds['bar'] = state
>>                  return func(*args, **kwds)
>>              return _wrapper
>>          return _wrap
>>
>>      @stateify(mutable)
>>      def foo(bar):
>>          #code
>>
>> or
>>
>>      state = mutable
>>      def foo(bar=state):
>>          #code
>>
>> or
>>
>>      class Baz(object):
>>          def __init__(self):
>>              self.state = mutable
>>
>>          def foo(self, bar=self.state):
>>              #code
> 
> Minor point: the stateify decorator looks a bit scary to me as it uses 
> three levels of nested functions. (that's inherent to decorators, but 
> still.) Suggest you name the class and global var solutions first, and 
> the decorator as last, just to prevent people from stopping reading the 
> pep and voting '-1' right when they hit the decorator solution.
> 
>>
>> The changes in this PEP are backwards-compatible with all code whose
>> default argument values are immutable
> 
> ...or don't depend on being evaluated only once, and don't rebind, in a 
> parent scope, the variables used in default expressions to incompatible 
> values
> 
> (hmm, the 'or' and 'and' may need some disambiguating parentheses...)
> 
>> including code using the idiom
>> mentioned in the 'Motivation' section. However, such values will now be
>> recomputed for each call for which they are required. This may cause
>> performance degradation. If such recomputation is significantly
>> expensive, the same refactorings mentioned above can be used.
>>
>>      In relation to Python 3.0, this PEP's proposal is compatible with
>> those of PEP 3102 [3] and PEP 3107 [4]. Also, this PEP does not depend
>> on the acceptance of either of those PEPs.
>>
>>
>> Reference Implementation
>>
>>      All code of the form:
>>
>>          def foo(bar=some_expr, baz=other_expr):
>>              #body
>>
>>      Should act as if it had read (in pseudo-Python):
>>
>>          def foo(bar=_undefined, baz=_undefined):
>>              if bar is _undefined:
>>                  bar = some_expr
>>              if baz is _undefined:
>>                  baz = other_expr
>>              #body
> 
> and, if any variables occur both in the function body and in some_expr 
> or other_expr, rename those in the function body to something that 
> doesn't name-clash.
> 
>>
>> where _undefined is the value given to a parameter when the caller
>> didn't specify a value for it. This is not intended to be a literal
>> translation, but rather a demonstration as to how Python's internal
>> argument-handling machinery should be changed.
>>
>>
>> References
>>
>>      [0] 10 Python pitfalls
>>          http://zephyrfalcon.org/labs/python_pitfalls.html
>>
>>      [1] Python Gotchas
>>          http://www.ferg.org/projects/python_gotchas.html#contents_item_6
>>
>>      [2] When Pythons Attack
>>      
>> http://www.onlamp.com/pub/a/python/2004/02/05/learn_python.html?page=2
>>
>>      [3] Keyword-Only Arguments
>>          http://www.python.org/dev/peps/pep-3102/
>>
>>      [4] Function Annotations
>>          http://www.python.org/dev/peps/pep-3107/
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
> 
> I'm not quite sure if your decision not to include the 
> default-expr-vars-become-lexical part was intentional or not. If it is, 
> can you tell me why you'd want that? Else you can incorporate my 
> comments in the pep.
> Another minor point: I'm personally not too fond of too many 'shall's 
> close together. It makes me think of lots of bureaucracy and 
> design-by-committee. I think several peps don't have a 'shall' in them, 
> but maybe it is the right language for this one. What's the right 
> language to use in peps?
> 
> Well, that's about it I think.
> 
> - Jan
> 


