[Python-3000] pre-PEP: Default Argument Expressions

Chris Rebert cvrebert at gmail.com
Wed Feb 14 04:25:55 CET 2007


Requesting comments on the following pre-PEP. pybench runs both with and 
without the patch applied would also be appreciated.
- Chris R


Title: Default Argument Expressions
Author: Christopher Rebert <cvrebertatgmaildotcom>
Status: Draft
Type: Standards Track
Requires: 3000
Python-Version: 3.0

Abstract

     This PEP proposes new semantics for default arguments to remove
     boilerplate code associated with non-constant default argument values,
     allowing them to be expressed more clearly and succinctly.
     Specifically, all default argument expressions are re-evaluated at each
     call, as opposed to just once at definition-time as they are now.


Motivation

     Currently, to write functions using non-constant default arguments, one
     must use the idiom:

         def foo(non_const=None):
             if non_const is None:
                 non_const = some_expr
             #rest of function

     or equivalent code.  Naive programmers desiring mutable default
     arguments often make the mistake of writing the following:

         def foo(mutable=some_expr_producing_mutable):
             #rest of function

     However, this does not work as intended, as
     'some_expr_producing_mutable' is evaluated only *once* at
     definition-time, rather than once per call at call-time.  This results
     in all calls to 'foo' using the same default value, which can result in
     unintended consequences.  This necessitates the previously mentioned
     idiom.  This unintuitive behavior is such a frequent stumbling block
     for newbies that it is present in at least 3 lists of Python's
     deficiencies [0] [1] [2].  Python's tutorial even mentions the issue
     explicitly [3].
     There are currently few, if any, known good uses of the current
     behavior of mutable default arguments.  The most common one is to
     preserve function state between calls.  However, as one of the
     lists [2] comments, this purpose is much better served by decorators,
     classes, or (though less preferred) global variables.
     Therefore, since the current semantics aren't useful for non-constant
     default values and an idiom is necessary to work around this
     deficiency, why not change the semantics so that people can write what
     they mean more directly, without the tedious boilerplate?  Removing
     this idiom would help make code more readable and self-documenting.
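
     As a concrete illustration of the stumbling block described above,
     the following minimal sketch contrasts a shared mutable default with
     the None idiom under the current (definition-time) semantics.  The
     function names 'append_shared' and 'append_fresh' are made up for
     this example, and the code is written in Python 3 syntax so that it
     runs on a current interpreter.

```python
def append_shared(item, bucket=[]):
    # The list is created once, when the def statement is executed,
    # so every call that omits 'bucket' mutates the same object.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # The idiom from the Motivation section: build a new list on each
    # call that does not supply one.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second)                   # True
print(second)                            # [1, 2]
print(append_fresh(1), append_fresh(2))  # [1] [2]
```

     Under the semantics proposed here, 'append_shared' would be expected
     to behave like 'append_fresh' without the extra three lines.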


Rationale

     The discussion referenced herein is based on two threads [4] [5] on the
     python-ideas mailing list.
     Originally, it was proposed that all default argument values be
     deep-copied from the original (evaluated at definition-time) at each
     invocation of the function where the default value was required.
     However, this doesn't take into account default values that are not
     literals, e.g. function calls, subscripts, attribute accesses.  Thus,
     the new idea was to re-evaluate the default arguments at each call
     where they were needed.
     There was some concern over the possible performance hit this could
     cause, and whether there should be new syntax so that code could use
     the existing semantics for performance reasons.  Some of the proposed
     syntaxes were:

         def foo(bar=<baz>):
             #code

         def foo(bar=new baz):
             #code

         def foo(bar=fresh baz):
             #code

         def foo(bar=separate baz):
             #code

         def foo(bar=another baz):
             #code

         def foo(bar=unique baz):
             #code

         def foo(bar or baz):
             #code

     where the keyword (or angle brackets) would indicate that the
     default value 'baz' of parameter 'bar' should use the new semantics.
     Other parameters would continue to use the old semantics.

     Alternately, the new semantics could be the default, with the old
     semantics accessible using:

         def foo(bar=once baz):
             #code

     where 'once' indicates the old default argument semantics.  A similar
     idea is mentioned in PEP 3103 [6] under "Option 4".  However, having
     two sets of semantics could be confusing, and leaving in the old
     semantics might be considered premature optimization.  So this PEP
     proposes having just one set of semantics.  Refactorings to deal with
     the possible performance hit from the new semantics are discussed
     later.

     A more radical proposed solution was to restrict default arguments to
     being hash()-able values, thus theoretically restricting default
     arguments to immutable values only.  While this would solve the
     newbie-confusion issue, it does not suggest a better way to specify
     that a default value should be recomputed at every function call.
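
     The word "theoretically" matters here, because hashability and
     immutability are not the same property.  The short sketch below
     (Python 3 syntax; 'is_hashable' and 'Box' are names invented for
     this illustration) shows that the built-in mutable containers are
     rejected by hash(), while an ordinary user-defined class is hashable
     by default even though its instances can be mutated.

```python
def is_hashable(value):
    # hash() raises TypeError for unhashable objects such as lists.
    try:
        hash(value)
    except TypeError:
        return False
    return True


class Box:
    """A mutable user-defined class; instances are hashable by default."""

    def __init__(self):
        self.items = []


print(is_hashable((1, "two")))  # True
print(is_hashable([]))          # False
print(is_hashable({}))          # False
print(is_hashable(Box()))       # True, even though a Box can be mutated
```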

     Throughout the discussion, several decorators were shown as
     alternatives to the aforementioned idiom.  These do allow the
     programmer to express their intent more clearly, at the cost of some
     extra complexity.  Also, no one decorator could be applied to all
     situations.  The programmer would have to figure out which one to use
     each time.  This PEP's proposed solution would make these decorators
     unnecessary and allow a more general solution to the issue than these
     decorators.  The question was also raised as to whether the problem
     this PEP seeks to solve is significant enough to warrant a language
     change.  The statistics in the Compatibility Issues section should
     help demonstrate the necessity of the changes that this PEP proposes.
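
     To give a flavor of the decorator approach, here is one possible
     sketch.  It is not one of the decorators posted in the threads;
     'fresh_defaults' and 'record' are hypothetical names, and the code
     assumes a Python 3 interpreter with 'inspect.signature'.  Each keyword
     given to the decorator is a zero-argument factory that is called
     whenever the caller omits that parameter.

```python
import functools
import inspect


def fresh_defaults(**factories):
    """Hypothetical decorator: call a factory for each omitted parameter."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Work out which parameters the caller actually supplied.
            bound = sig.bind_partial(*args, **kwargs)
            for name, factory in factories.items():
                if name not in bound.arguments:
                    bound.arguments[name] = factory()
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorate


@fresh_defaults(seen=list)
def record(item, seen=None):
    seen.append(item)
    return seen


print(record(1))             # [1]
print(record(2))             # [2]
print(record(3, seen=[0]))   # [0, 3]
```

     This particular sketch also shows the limitation noted above: the
     factories take no arguments, so a default that depends on another
     parameter (such as a length computed from an earlier argument) would
     need a differently shaped decorator.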

     The next question was exactly how default argument expressions should
     be scoped.  By way of demonstration:

         a = 42
         def foo(b=a):
             a = 3.14

     Now, does the variable 'a' in the default expression for 'b' refer to
     the lexical variable 'a', or the local variable 'a'?  If it refers to
     a local variable, then this code is basically equivalent to:

         a = 42
         def foo(b=None):
             if b is None:
                 b = a
             a = 3.14

     in which case, 'a' is being referenced before it's been assigned to in
     the function, causing an UnboundLocalError.  The alternative is to
     have Python treat 'a' within the function's body differently from the
     'a' in the default expression.  In this case, the code would behave as
     if it were:

         a = 42
         def foo(b=None):
             if b is None:
                 b = __a
             a = 3.14

     where __a indicates Python 'magically' treating it as a lexical
     variable that is distinct from the local variable 'a'.  This would
     increase backward-compatibility, allowing you to use a lexical
     variable with the same name as a local variable as a default
     expression, which is more similar to Python's current behavior.
     However, this would complicate the semantics of default expressions.
     For simplicity's sake, this PEP endorses treating variables in default
     expressions as normal function variables.  Suggestions for dealing
     with the incompatibilities this would introduce are discussed later.
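
     The UnboundLocalError case can be reproduced today by writing the
     "basically equivalent" form by hand.  The sketch below (Python 3
     syntax; the 'outcome' variable is only there to make the result
     visible) uses the None idiom as a stand-in for the proposed
     semantics, so it approximates rather than implements them.

```python
a = 42


def foo(b=None):
    # Stand-in for the proposed semantics: the default expression is
    # evaluated inside the function body.
    if b is None:
        b = a      # 'a' is local here because of the assignment below
    a = 3.14
    return b


try:
    foo()
except UnboundLocalError:
    outcome = "UnboundLocalError"
else:
    outcome = "no error"

print(outcome)   # UnboundLocalError
print(foo(1))    # 1, since the default expression is never evaluated
```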


Specification

     The current semantics for default arguments are replaced by the
     following semantics:
         - Whenever a function is called, and the caller does not provide a
         value for a parameter with a default expression, the parameter's
         default expression is evaluated in the function's scope.  The
         resulting value is then assigned to a local variable in the
         function's scope with the same name as the parameter.
         - The default argument expressions are evaluated before the body
         of the function.
         - The evaluation of default argument expressions proceeds in the
         same order as that of the parameter list in the function's
         definition.
         - Variables in a default expression are treated like normal
         function variables (i.e. global/lexical variables unless assigned
         to in the function).
     Given these semantics, it makes more sense to refer to default argument
     expressions rather than default argument values, as the expression is
     re-evaluated at each call, rather than just once at definition-time.
     Therefore, we shall do so hereafter.

     Demonstrative examples:
         #default argument expressions can refer to
         #variables in the enclosing scope...
         CONST = "hi"
         def foo(a=CONST):
             print a

         >>> foo()
         hi
         >>> CONST="bye"
         >>> foo()
         bye

         #...or even other arguments
         def ncopies(container, n=len(container)):
             return [container for i in range(n)]

         >>> ncopies([1, 2], 5)
         [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
         >>> ncopies([1, 2, 3])
         [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
         >>> #ncopies grabbed n from [1, 2, 3]'s length (3)

         #default argument expressions are arbitrary expressions
         def my_sum(lst):
             cur_sum = lst[0]
             for i in lst[1:]: cur_sum += i
             return cur_sum

         def bar(b=my_sum((["b"] * (2 * 3))[:4])):
             print b

         >>> bar()
         bbbb

         #default argument expressions are re-evaluated at every call...
         from random import randint
         def baz(c=randint(1,3)):
             print c

         >>> baz()
         2
         >>> baz()
         3

         #...but only when they're required
         def silly():
             print "spam"
             return 42

         def qux(d=silly()):
             pass

         >>> qux()
         spam
         >>> qux(17)
         >>> qux(d=17)
         >>> qux(*[17])
         >>> qux(**{'d':17})
         >>> #no output since silly() never called
         >>> #because d's value was specified in the calls

         #default argument expressions are evaluated in calling sequence order
         count = 0
         def next():
             global count
             count += 1
             return count - 1

         def frobnicate(g=next(), h=next(), i=next()):
             print g, h, i

         >>> frobnicate()
         0 1 2
         >>> #g, h, and i's default argument expressions are evaluated
         >>> #in the same order as in the parameter definition

         #variables in default expressions refer to lexical/global variables...
         j = "holy grail"
         def frenchy(k=j):
             print k
         #...unless assigned to in the function (or its parameters)
         def arthur(j="swallow", m=j):
             print m

         >>> frenchy()
         holy grail
         >>> arthur()
         swallow


Compatibility Issues

     This change in semantics breaks code which uses mutable default
     argument expressions and depends on those expressions being evaluated
     only once.  It also will break code that assigns new incompatible
     values in a parent scope to variables used in default expressions.
     Code relying on such behavior can be refactored from:

         def foo(bar=mutable):
             #code

     to

         state = mutable
         def foo(bar=state):
             #code

     or

         class Baz(object):
             state = mutable

             @classmethod
             def foo(cls, bar=cls.state):
                 #code

     or

         from functools import wraps

         def stateify(states):
             def _wrap(func):
                 @wraps(func)
                 def _wrapper(*args, **kwds):
                     new_kwargs = states.copy()
                     new_kwargs.update(kwds)
                     return func(*args, **new_kwargs)
                 return _wrapper
             return _wrap

         @stateify({'bar' : mutable})
         def foo(bar):
             #code
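
     As a usage sketch for the 'stateify' refactoring, the block below
     repeats the decorator and applies it to a made-up function,
     'remember', that numbers keys in order of first appearance.  It is
     written in Python 3 syntax and runs under the current semantics; the
     point is only that the decorated function keeps its state between
     calls without relying on a mutable default argument.

```python
from functools import wraps


def stateify(states):
    # Same decorator as in the refactoring above.
    def _wrap(func):
        @wraps(func)
        def _wrapper(*args, **kwds):
            new_kwargs = states.copy()   # shallow copy: values stay shared
            new_kwargs.update(kwds)
            return func(*args, **new_kwargs)
        return _wrapper
    return _wrap


@stateify({'seen': {}})
def remember(key, seen):
    # Hypothetical example: number each new key in order of arrival.
    return seen.setdefault(key, len(seen))


print(remember('spam'), remember('eggs'), remember('spam'))  # 0 1 0
```

     Because the state is injected as a keyword argument, a caller who
     wants to override it has to pass it by keyword as well, e.g.
     remember('spam', seen={}).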

     Code such as the following (which was also mentioned in the Rationale):

         b = 42 #outer b
         def foo(a=b): #ERROR: refers to local b, not outer b!
             b = 7 #local b

     which has default values that refer to variables in enclosing scopes
     and contains assignments to local variables of the same names will
     also be incompatible, as the 'b' in the default argument refers to the
     local 'b' rather than the outer 'b', resulting in an UnboundLocalError
     because the local variable 'b' has not been assigned to at the time
     "a"'s default expression is evaluated.  Such code will need to rename
     the affected variables.

     The changes in this PEP are backwards-compatible with all code whose
     default argument values are immutable, including code using the idiom
     mentioned in the 'Motivation' section.  However, such values will now
     be recomputed for each call for which they are required.  This may
     cause performance degradation.  If such recomputation is significantly
     expensive, the same refactoring mentioned above can be used.
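
     A rough way to picture the recomputation cost, and the hoisting
     refactoring that avoids it, is sketched below.  The None idiom is
     used to imitate per-call evaluation on a current interpreter (Python
     3 syntax); 'expensive', 'per_call', 'hoisted' and 'CACHED_TOTAL' are
     all hypothetical names, and the counter only tracks how often the
     costly expression is evaluated, not how long it takes.

```python
evaluations = 0


def expensive():
    # Hypothetical stand-in for a costly default expression.
    global evaluations
    evaluations += 1
    return sum(range(1000))


def per_call(total=None):
    # Imitates the proposed semantics: the expression runs on every
    # call that omits 'total'.
    if total is None:
        total = expensive()
    return total


CACHED_TOTAL = expensive()   # hoisted: computed once, up front


def hoisted(total=None):
    if total is None:
        total = CACHED_TOTAL  # only a name lookup on each call
    return total


for _ in range(3):
    per_call()
after_per_call = evaluations   # 1 (hoisting) + 3 (calls) = 4

for _ in range(3):
    hoisted()
after_hoisted = evaluations    # still 4

print(after_per_call, after_hoisted)   # 4 4
```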

     A survey of the standard library for Python v2.5, produced via a
     script [7], gave the following statistics (608 files, test suites
     were excluded):

         total number of non-None immutable default arguments: 1585 (41.5%)
         total number of mutable default arguments: 186 (4.9%)
         total number of default arguments with a value of None: 1813 (47.4%)
         total number of default arguments with unknown mutability: 238 (6.2%)
         total number of comparisons to None: 940
         total number of comparisons to None: 940

     Note: The number of comparisons to None refers to *all* such
     comparisons, not necessarily just those used in the idiom mentioned in
     the Motivation section.
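
     The script [7] that produced these numbers is an attachment and is
     not reproduced here.  Purely as an illustration of how such a survey
     could be approached, the sketch below classifies default arguments
     with the 'ast' module of a current Python 3 interpreter.  The
     function names and the (deliberately crude) classification rules are
     assumptions of this example, so its buckets need not match those used
     for the figures above.

```python
import ast
from collections import Counter

MUTABLE_NODES = (ast.List, ast.Dict, ast.Set,
                 ast.ListComp, ast.DictComp, ast.SetComp)


def classify_default(node):
    """Roughly bucket one default-value expression by mutability."""
    if isinstance(node, ast.Constant):
        return "None" if node.value is None else "immutable"
    if isinstance(node, MUTABLE_NODES):
        return "mutable"
    if isinstance(node, ast.Tuple):
        # A tuple display counts as immutable only if every element does.
        if all(classify_default(elt) in ("None", "immutable")
               for elt in node.elts):
            return "immutable"
    # Names, calls, attribute accesses, etc. cannot be judged statically.
    return "unknown"


def survey(source):
    """Count default arguments in 'source' by mutability bucket."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.Lambda)):
            defaults = list(node.args.defaults)
            defaults += [d for d in node.args.kw_defaults if d is not None]
            for default in defaults:
                counts[classify_default(default)] += 1
    return counts


SAMPLE = '''
def f(a, b=None, c=0, d=[], e=len, g=(1, "x"), h={}):
    pass
'''

result = survey(SAMPLE)
print(sorted(result.items()))
# [('None', 1), ('immutable', 2), ('mutable', 2), ('unknown', 1)]
```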

     Looking more closely at the script's output, it appears that Tix.py and
     Tkinter.py are the primary users of mutable default arguments in the
     standard library.

     Similarly, examination of the unknown default arguments reveals that a
     significant fraction are functions, classes, or constants, which
     should, for the most part, not be functionally affected by this
     proposal.

     Assuming the standard library is indicative of Python code in general,
     the change in semantics will have comparatively little impact on the
     correct operation of Python programs.

     Running pybench with modifications to simulate the proposed
     semantics [8] shows that Python function/method calls using default
     arguments run about 4.4%-6.5% slower versus the current semantics.
     However, as the simulation of the proposed semantics is crude, this
     should be considered an upper bound for any performance decreases this
     proposal might cause.

     In relation to Python 3.0, this PEP's proposal is compatible with those
     of PEP 3102 [9] and PEP 3107 [10], though it does not depend on the
     acceptance of either of those PEPs.


Reference Implementation

     All code of the form:

         def foo(bar=some_expr, baz=other_expr):
             #body

     should be compiled as if it had read (in pseudo-Python):

         def foo(bar=_undefined, baz=_undefined):
             if bar is _undefined:
                 bar = some_expr
             if baz is _undefined:
                 baz = other_expr
             #body

     where '_undefined' is the value given to a parameter when the caller
     didn't specify a value for it.  This is not intended to be a literal
     translation, but rather a demonstration as to how Python's
     argument-handling machinery should act.  Specifically, there should be
     no Python-level value corresponding to _undefined, nor should a
     literal translation such as that shown necessarily be used.
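
     For readers who want to experiment, the pseudo-Python above can be
     approximated on a current interpreter with a private sentinel object.
     This is only a sketch under that assumption: unlike the proposal, it
     does expose a Python-level '_undefined' value, and the helper
     functions 'some_expr' and 'other_expr' and the 'log' list are made up
     so that evaluation order can be observed (Python 3 syntax).

```python
_undefined = object()   # assumption: stands in for "no value supplied"
log = []


def some_expr():
    log.append("bar")
    return []


def other_expr(bar):
    log.append("baz")
    return len(bar)


def foo(bar=_undefined, baz=_undefined):
    # Defaults are evaluated in parameter order, before the body,
    # and only for parameters the caller left out.
    if bar is _undefined:
        bar = some_expr()
    if baz is _undefined:
        baz = other_expr(bar)   # may refer to an earlier parameter
    return bar, baz


print(foo())          # ([], 0)
print(foo([1, 2]))    # ([1, 2], 2)
print(foo(baz=5))     # ([], 5)
print(log)            # ['bar', 'baz', 'baz', 'bar']
```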


References

     [0] 10 Python pitfalls
         http://zephyrfalcon.org/labs/python_pitfalls.html

     [1] Python Gotchas
         http://www.ferg.org/projects/python_gotchas.html#contents_item_6

     [2] When Pythons Attack
         http://www.onlamp.com/pub/a/python/2004/02/05/learn_python.html?page=2

     [3] 4. More Control Flow Tools
         http://docs.python.org/tut/node6.html#SECTION006710000000000000000

     [4] [Python-ideas] fixing mutable default argument values
         http://mail.python.org/pipermail/python-ideas/2007-January/000073.html

     [5] [Python-ideas] proto-PEP: Fixing Non-constant Default Arguments
         http://mail.python.org/pipermail/python-ideas/2007-January/000121.html

     [6] A Switch/Case Statement
         http://www.python.org/dev/peps/pep-3103/

     [7] Script to generate default argument statistics
         See attachment.

     [8] Patch to pybench/Calls.py
         See attachment.

     [9] Keyword-Only Arguments
         http://www.python.org/dev/peps/pep-3102/

     [10] Function Annotations
         http://www.python.org/dev/peps/pep-3107/


Copyright

     This document has been placed in the public domain.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: defargs.diff
Type: text/x-patch
Size: 794 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-3000/attachments/20070213/12125965/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: new_find.py
Type: text/x-python
Size: 4245 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-3000/attachments/20070213/12125965/attachment.py 

