[Python-3000] pre-PEP: Default Argument Expressions
Chris Rebert
cvrebert at gmail.com
Wed Feb 14 04:25:55 CET 2007
Requesting comments on the following pre-PEP. pybench runs both with and
without the patch applied would also be appreciated.
- Chris R
Title: Default Argument Expressions
Author: Christopher Rebert <cvrebertatgmaildotcom>
Status: Draft
Type: Standards Track
Requires: 3000
Python-Version: 3.0
Abstract
This PEP proposes new semantics for default arguments to remove
boilerplate code associated with non-constant default argument values,
allowing them to be expressed more clearly and succinctly.
Specifically, all default argument expressions are re-evaluated at each call,
as opposed to just once at definition-time as they are now.
Motivation
Currently, to write functions using non-constant default arguments, one
must use the idiom:

    def foo(non_const=None):
        if non_const is None:
            non_const = some_expr
        #rest of function

or equivalent code. Naive programmers desiring mutable default arguments
often make the mistake of writing the following:

    def foo(mutable=some_expr_producing_mutable):
        #rest of function

However, this does not work as intended, as 'some_expr_producing_mutable'
is evaluated only *once* at definition-time, rather than once per call at
call-time. As a result, every call to 'foo' that omits the argument shares
the same default object, so a mutation made during one call is visible in
later calls. This necessitates the previously mentioned idiom. This
unintuitive behavior is such a frequent stumbling block for newbies that it
is present in at least 3 lists of Python's deficiencies [0] [1] [2].
Python's tutorial even mentions the issue explicitly [3].
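
The pitfall and the idiom described above can be reproduced under the current semantics. The following is a minimal sketch; the function names 'append_shared' and 'append_fresh' are made up for illustration and do not come from the PEP.

```python
def append_shared(item, bucket=[]):
    # The list display is evaluated once, when the def statement runs,
    # so every call that omits `bucket` mutates the same list object.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # The None idiom: build a new list on each call that needs one.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second, second)            # True [1, 2]
print(append_fresh(1), append_fresh(2))   # [1] [2]
```

Under the semantics this PEP proposes, 'append_shared' would behave like 'append_fresh' without the extra three lines.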
There are currently few, if any, known good uses of the current behavior
of mutable default arguments. The most common one is to preserve function
state between calls. However, as one of the lists [2] comments, this
purpose is much better served by decorators, classes, or (though less
preferred) global variables.
Therefore, since the current semantics aren't useful for non-constant
default values and an idiom is necessary to work around this deficiency,
why not change the semantics so that people can write what they mean more
directly, without the tedious boilerplate? Removing this idiom would help
make code more readable and self-documenting.
Rationale
The discussion referenced herein is based on two threads [4] [5] on the
python-ideas mailing list.
Originally, it was proposed that all default argument values be
deep-copied from the original (evaluated at definition-time) at each
invocation of the function where the default value was required. However,
this doesn't take into account default values that are not literals, e.g.
function calls, subscripts, attribute accesses. Thus, the new idea was to
re-evaluate the default arguments at each call where they were needed.
There was some concern over the possible performance hit this could cause,
and whether there should be new syntax so that code could use the existing
semantics for performance reasons. Some of the proposed syntaxes were:

    def foo(bar=<baz>):
        #code

    def foo(bar=new baz):
        #code

    def foo(bar=fresh baz):
        #code

    def foo(bar=separate baz):
        #code

    def foo(bar=another baz):
        #code

    def foo(bar=unique baz):
        #code

    def foo(bar or baz):
        #code

where the keyword (or angle brackets) would indicate that the
default value 'baz' of parameter 'bar' should use the new semantics.
Other parameters would continue to use the old semantics.
Alternately, the new semantics could be the default, with the old
semantics accessible using:

    def foo(bar=once baz):
        #code

where 'once' indicates the old default argument semantics. A similar idea
is mentioned in PEP 3103 [6] under "Option 4". However, having two sets
of semantics could be confusing, and leaving in the old semantics might be
considered premature optimization. So this PEP proposes having just one
set of semantics. Refactorings to deal with the possible performance hit
from the new semantics are discussed later.
A more radical proposed solution was to restrict default arguments to
being hash()-able values, thus theoretically restricting default arguments
to immutable values only. While this would solve the newbie-confusion
issue, it does not suggest a better way to specify that a default value
should be recomputed at every function call.
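
To make the hash()-able restriction concrete, here is a rough sketch of how it could be approximated today with a decorator. The name 'require_hashable_defaults' is hypothetical; nothing like it was proposed in the threads, and the actual suggestion was a language-level rule rather than a decorator.

```python
import inspect


def require_hashable_defaults(func):
    """Hypothetical check: reject any default value that hash() refuses."""
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            continue   # parameter has no default
        try:
            hash(param.default)
        except TypeError:
            raise TypeError("default for %r is not hashable" % name)
    return func


@require_hashable_defaults
def accepts(a, b=(1, 2), c=None):
    return b


message = None
try:
    @require_hashable_defaults
    def rejects(x=[]):
        return x
except TypeError as exc:
    message = str(exc)

print(message)   # default for 'x' is not hashable
```

The restriction is only "theoretical" because hashability and immutability do not coincide: instances of ordinary user-defined classes are hashable by default yet can be mutated freely.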
Throughout the discussion, several decorators were shown as alternatives
to the aforementioned idiom. These do allow the programmer to express
their intent more clearly, at the cost of some extra complexity. Also, no
one decorator could be applied to all situations. The programmer would
have to figure out which one to use each time. This PEP's proposed
solution would make these decorators unnecessary and allow a more general
solution to the issue than these decorators. The question was also raised
as to whether the problem this PEP seeks to solve is significant enough to
warrant a language change. The statistics in the Compatibility Issues
section should help demonstrate the necessity of the changes that this PEP
proposes.
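
For a sense of what such a decorator can look like, here is one possible sketch. It is not one of the decorators posted in the threads; the name 'fresh_defaults' and its keyword-factory interface are assumptions made for illustration.

```python
import functools
import inspect


def fresh_defaults(**factories):
    """Hypothetical decorator: call a zero-argument factory for each
    listed parameter that the caller left out."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # bind_partial records only the arguments actually supplied.
            bound = sig.bind_partial(*args, **kwargs)
            for name, factory in factories.items():
                if name not in bound.arguments:
                    kwargs[name] = factory()
            return func(*args, **kwargs)
        return wrapper
    return decorate


@fresh_defaults(seen=list)
def record(item, seen):
    seen.append(item)
    return seen


print(record(1), record(2))   # [1] [2]
print(record(3, [0]))         # [0, 3]
```

This illustrates the extra complexity mentioned above: the default is declared away from the parameter list, and a factory-based decorator like this one cannot express a default that depends on another argument.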
The next question was exactly how variables in default argument
expressions should be scoped. By way of demonstration:

    a = 42
    def foo(b=a):
        a = 3.14

Now, does the variable 'a' in the default expression for 'b' refer to the
lexical variable 'a', or the local variable 'a'? If it refers to a local
variable, then this code is basically equivalent to:

    a = 42
    def foo(b=None):
        if b is None:
            b = a
        a = 3.14

in which case, 'a' is being referenced before it's been assigned to in the
function, causing an UnboundLocalError. The alternative is to have Python
treat 'a' within the function's body differently from the 'a' in the
default expression. In this case, the code would behave as if it were:

    a = 42
    def foo(b=None):
        if b is None:
            b = __a
        a = 3.14

where __a indicates Python 'magically' treating it as a lexical variable
that is distinct from the local variable 'a'. This would increase
backward-compatibility, allowing you to use a lexical variable with the
same name as a local variable as a default expression, which is more
similar to Python's current behavior. However, this would complicate the
semantics of default expressions. For simplicity's sake, this PEP
endorses treating variables in default expressions as normal function
variables. Suggestions for dealing with the incompatibilities this would
introduce are discussed later.
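
The UnboundLocalError case can be observed today by running the idiom-based equivalent of the first interpretation. This is a small sketch using the same names as the example above; the 'outcome' variable is added only to report the result.

```python
a = 42


def foo(b=None):
    if b is None:
        b = a      # 'a' is local here because of the assignment below
    a = 3.14
    return b


try:
    foo()
except UnboundLocalError:
    outcome = "UnboundLocalError"
else:
    outcome = "no error"

print(outcome)   # UnboundLocalError
print(foo(1))    # 1
```

The error appears only when the default is actually needed; a call that supplies 'b' never reads the local 'a' before it is assigned.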
Specification
The current semantics for default arguments are replaced by the following
semantics:

    - Whenever a function is called, and the caller does not provide a
      value for a parameter with a default expression, the parameter's
      default expression is evaluated in the function's scope. The
      resulting value is then assigned to a local variable in the
      function's scope with the same name as the parameter.
    - The default argument expressions are evaluated before the body
      of the function.
    - The evaluation of default argument expressions proceeds in the
      same order as that of the parameter list in the function's
      definition.
    - Variables in a default expression are treated like normal
      function variables (i.e. global/lexical variables unless
      assigned to in the function).

Given these semantics, it makes more sense to refer to default argument
expressions rather than default argument values, as the expression is
re-evaluated at each call, rather than just once at definition-time.
Therefore, we shall do so hereafter.
Demonstrative examples:

    #default argument expressions can refer to
    #variables in the enclosing scope...
    CONST = "hi"
    def foo(a=CONST):
        print a

    >>> foo()
    hi
    >>> CONST="bye"
    >>> foo()
    bye

    #...or even other arguments
    def ncopies(container, n=len(container)):
        return [container for i in range(n)]

    >>> ncopies([1, 2], 5)
    [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
    >>> ncopies([1, 2, 3])
    [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
    >>> #ncopies grabbed n from [1, 2, 3]'s length (3)

    #default argument expressions are arbitrary expressions
    def my_sum(lst):
        cur_sum = lst[0]
        for i in lst[1:]: cur_sum += i
        return cur_sum

    def bar(b=my_sum((["b"] * (2 * 3))[:4])):
        print b

    >>> bar()
    bbbb

    #default argument expressions are re-evaluated at every call...
    from random import randint
    def baz(c=randint(1,3)):
        print c

    >>> baz()
    2
    >>> baz()
    3

    #...but only when they're required
    def silly():
        print "spam"
        return 42

    def qux(d=silly()):
        pass

    >>> qux()
    spam
    >>> qux(17)
    >>> qux(d=17)
    >>> qux(*[17])
    >>> qux(**{'d':17})
    >>> #no output since silly() never called
    >>> #because d's value was specified in the calls

    #default argument expressions are evaluated in calling sequence order
    count = 0
    def next():
        global count
        count += 1
        return count - 1

    def frobnicate(g=next(), h=next(), i=next()):
        print g, h, i

    >>> frobnicate()
    0 1 2
    >>> #g, h, and i's default argument expressions are evaluated
    >>> #in the same order as in the parameter definition

    #variables in default expressions refer to lexical/global variables...
    j = "holy grail"
    def frenchy(k=j):
        print k

    #...unless assigned to in the function (or its parameters)
    def arthur(j="swallow", m=j):
        print m

    >>> frenchy()
    holy grail
    >>> arthur()
    swallow

Compatibility Issues
This change in semantics breaks code which uses mutable default argument
expressions and depends on those expressions being evaluated only once.
It also will break code that assigns new incompatible values in a parent
scope to variables used in default expressions. Code relying on such
behavior can be refactored from:

    def foo(bar=mutable):
        #code

to

    state = mutable
    def foo(bar=state):
        #code

or

    class Baz(object):
        state = mutable

        @classmethod
        def foo(cls, bar=cls.state):
            #code

or

    from functools import wraps

    def stateify(states):
        def _wrap(func):
            @wraps(func)
            def _wrapper(*args, **kwds):
                new_kwargs = states.copy()
                new_kwargs.update(kwds)
                return func(*args, **new_kwargs)
            return _wrapper
        return _wrap

    @stateify({'bar' : mutable})
    def foo(bar):
        #code

Code such as the following (which was also mentioned in the Rationale):

    b = 42 #outer b
    def foo(a=b): #ERROR: refers to local b, not outer b!
        b = 7 #local b

which has default values that refer to variables in enclosing scopes and
contains assignments to local variables of the same names will also be
incompatible, as the 'b' in the default argument refers to the local 'b'
rather than the outer 'b', resulting in an UnboundLocalError because the
local variable 'b' has not been assigned to at the time "a"'s default
expression is evaluated. Such code will need to rename the affected
variables.
The changes in this PEP are backwards-compatible with all code whose
default argument values are immutable, including code using the idiom
mentioned in the 'Motivation' section. However, such values will now be
recomputed for each call for which they are required. This may cause
performance degradation. If such recomputation is significantly
expensive, the same refactoring mentioned above can be used.
A survey of the Python v2.5 standard library, produced via a script [7],
gave the following statistics (608 files; test suites were excluded):

    total number of non-None immutable default arguments: 1585 (41.5%)
    total number of mutable default arguments: 186 (4.9%)
    total number of default arguments with a value of None: 1813 (47.4%)
    total number of default arguments with unknown mutability: 238 (6.2%)
    total number of comparisons to None: 940

Note: The number of comparisons to None refers to *all* such comparisons,
not necessarily just those used in the idiom mentioned in the Motivation
section.
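
The attached script [7] is not reproduced in this message. As a rough illustration of how default arguments can be sorted into the four categories above, here is a sketch built on the standard 'ast' module of a modern Python 3. It is not the author's script, the classification rules are assumptions, and it does not count comparisons to None.

```python
import ast
from collections import Counter

MUTABLE_NODES = (ast.List, ast.Dict, ast.Set,
                 ast.ListComp, ast.DictComp, ast.SetComp)


def classify_default(node):
    """Place one default-expression AST node into one of four buckets."""
    if isinstance(node, ast.Constant):
        return "None" if node.value is None else "immutable"
    if isinstance(node, ast.UnaryOp) and isinstance(node.operand, ast.Constant):
        return "immutable"   # e.g. -1
    if isinstance(node, MUTABLE_NODES):
        return "mutable"
    if isinstance(node, ast.Tuple):
        # Treat a tuple display as immutable only if every element is.
        kinds = {classify_default(elt) for elt in node.elts}
        return "immutable" if kinds <= {"None", "immutable"} else "unknown"
    return "unknown"   # names, calls, attribute accesses, ...


def survey(source):
    """Count default arguments by category in one module's source text."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            defaults = list(node.args.defaults)
            defaults += [d for d in node.args.kw_defaults if d is not None]
            counts.update(classify_default(d) for d in defaults)
    return counts


sample = "def f(a, b=None, c=0, d=[], e=len, *, g=(1, 2)): pass\n"
counts = survey(sample)
print(sorted(counts.items()))
# [('None', 1), ('immutable', 2), ('mutable', 1), ('unknown', 1)]
```

A default such as 'e=len' lands in the "unknown" bucket here, since a purely syntactic check cannot tell what a name will be bound to.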
Looking more closely at the script's output, it appears that Tix.py and
Tkinter.py are the primary users of mutable default arguments in the
standard library.
Similarly, examination of the unknown default arguments reveals that a
significant fraction are functions, classes, or constants, which should,
for the most part, not be functionally affected by this proposal.
Assuming the standard library is indicative of Python code in general, the
change in semantics will have comparatively little impact on the correct
operation of Python programs.
Running pybench with modifications to simulate the proposed semantics [8]
shows that Python function/method calls using default arguments run about
4.4%-6.5% slower versus the current semantics. However, as the simulation
of the proposed semantics is crude, this should be considered an upper
bound for any performance decreases this proposal might cause.
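
Readers without pybench can get a rough feel for the per-call cost with the standard 'timeit' module. The sketch below is not the pybench patch [8]; it merely times the None idiom against a definition-time default, on the assumption that the idiom is a reasonable stand-in for the proposed semantics. The function names are made up, and the numbers will vary by machine and interpreter.

```python
import timeit


def current_style(x=()):
    # Default evaluated once, at definition time.
    return x


def idiom_style(x=None):
    # Default rebuilt on every call that omits x, via the None idiom.
    if x is None:
        x = ()
    return x


timings = {}
for func in (current_style, idiom_style):
    # timeit accepts a zero-argument callable as the statement to time.
    timings[func.__name__] = timeit.timeit(func, number=200000)

for name, seconds in timings.items():
    print(name, round(seconds, 4))
```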
In relation to Python 3.0, this PEP's proposal is compatible with those of
PEP 3102 [9] and PEP 3107 [10], though it does not depend on the
acceptance of either of those PEPs.
Reference Implementation
All code of the form:

    def foo(bar=some_expr, baz=other_expr):
        #body

should be compiled as if it had read (in pseudo-Python):

    def foo(bar=_undefined, baz=_undefined):
        if bar is _undefined:
            bar = some_expr
        if baz is _undefined:
            baz = other_expr
        #body

where '_undefined' is the value given to a parameter when the caller
didn't specify a value for it. This is not intended to be a literal
translation, but rather a demonstration as to how Python's
argument-handling machinery should act. Specifically, there should be no
Python-level value corresponding to _undefined, nor should a literal
translation such as that shown necessarily be used.
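
The pseudo-Python above can be approximated in today's Python for experimentation. In this sketch '_undefined' is an ordinary module-level sentinel object, which is exactly what the PEP says a real implementation should not expose, and 'some_expr' and 'other_expr' are stand-in functions invented so that the evaluation order can be observed.

```python
_undefined = object()   # stand-in only; the PEP wants no Python-level value

calls = []              # records which default expressions were evaluated


def some_expr():
    calls.append("bar")
    return []


def other_expr():
    calls.append("baz")
    return len(calls)


def foo(bar=_undefined, baz=_undefined):
    # Defaults are evaluated in parameter order, only when needed,
    # and before the rest of the body runs.
    if bar is _undefined:
        bar = some_expr()
    if baz is _undefined:
        baz = other_expr()
    return bar, baz


print(foo())          # ([], 2)
print(foo(bar=[1]))   # ([1], 3)
print(calls)          # ['bar', 'baz', 'baz']
```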
References
[0] 10 Python pitfalls
http://zephyrfalcon.org/labs/python_pitfalls.html
[1] Python Gotchas
http://www.ferg.org/projects/python_gotchas.html#contents_item_6
[2] When Pythons Attack
http://www.onlamp.com/pub/a/python/2004/02/05/learn_python.html?page=2
[3] 4. More Control Flow Tools
http://docs.python.org/tut/node6.html#SECTION006710000000000000000
[4] [Python-ideas] fixing mutable default argument values
http://mail.python.org/pipermail/python-ideas/2007-January/000073.html
[5] [Python-ideas] proto-PEP: Fixing Non-constant Default Arguments
http://mail.python.org/pipermail/python-ideas/2007-January/000121.html
[6] A Switch/Case Statement
http://www.python.org/dev/peps/pep-3103/
[7] Script to generate default argument statistics
See attachment.
[8] Patch to pybench/Calls.py
See attachment.
[9] Keyword-Only Arguments
http://www.python.org/dev/peps/pep-3102/
[10] Function Annotations
http://www.python.org/dev/peps/pep-3107/
Copyright
This document has been placed in the public domain.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: defargs.diff
Type: text/x-patch
Size: 794 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-3000/attachments/20070213/12125965/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: new_find.py
Type: text/x-python
Size: 4245 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-3000/attachments/20070213/12125965/attachment.py