[Python-ideas] fixing mutable default argument values

Tue Jan 30 16:48:54 CET 2007

On 1/30/07, Jan Kanis <jan.kanis at phil.uu.nl> wrote:
> On the other hand, are there really any good reasons to choose the current
> semantics of evaluation at definition time?

While I sympathize with the programmer who falls for this common
Python gotcha, and would not have minded if Python's semantics were
different from the start (though the current behavior is cleaner and
more consistent), making such a radical change to such a core part of
the language semantics now is a very bad idea for many reasons.

> What I've heard basically
> boils down to two arguments:
> - "let's not change anything", i.e. resist change because it is change,
> which I don't think is a very pythonic argument.

The argument here is not "let's not change anything because it's
change," but rather "let's not break large amounts of existing code
without a very good reason."  As has been stated here by others,
making obsolete a common two-line idiom is not a compelling enough
reason to do so.

Helping out beginning Python programmers, while well-intentioned,
doesn't feel like enough of a motivation either.  Notice that the main
challenge for the novice programmer is not to learn how default
arguments work -- novices can learn to recognize and write the idiom
easily enough -- but rather to learn how variables and objects work in
general.

>>> a=b=['foo']
>>> c=d=42
>>> a+=['bar']
>>> c+=1
>>> b
['foo', 'bar']
>>> d
42

At some point in his Python career, a novice is going to have to
understand why b "changed" but d didn't.  Fixing the default argument
"wart" doesn't remove the necessity to understand the nature of
mutable objects and variable bindings in Python; it just postpones the
problem.  This is a fact worth keeping in mind when deciding whether
the sweeping change in semantics is worth the costs.
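
The same session can be restated with identity checks, as a minimal
sketch written for a current Python 3 interpreter. The names same_list
and same_int are hypothetical and exist only for illustration.

```python
a = b = ['foo']   # a and b are two names bound to one list object
c = d = 42        # c and d are two names bound to one int object

a += ['bar']      # lists implement += in place, so the shared list grows
c += 1            # ints are immutable, so c is rebound to a new object

same_list = a is b   # True: both names still refer to the mutated list
same_int = c is d    # False: c now refers to 43, d still refers to 42
```

Nothing here involves default arguments, which is the point: the
behavior comes from name binding and mutability, not from "def".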

> - Arguments based on the assumption that people actually do make lots of
> use of the fact that default arguments are shared between function
> invocations, many of which will result in (much) more code if it has to be
> transformed to using one of the alternative idioms. If this is true, it is
> a valid argument. I guess there's still some stdlib grepping to do to
> decide this.

Though it's been decried here as unPythonic, I can't be the only
person who uses the idiom
def foo(..., cache={}):
for making a cache when the function in question does not rise to the
level of deserving to be a class object instead.  I don't apologize
for finding it less ugly than using a global variable.
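
For readers who have not seen the idiom, here is a minimal sketch in
Python 3. The names expensive, cached_square and calls are hypothetical,
and the squaring is only a stand-in for real work.

```python
calls = []  # records every time the "expensive" work really runs


def expensive(n):
    calls.append(n)
    return n * n


def cached_square(n, cache={}):
    # The dict is created once, when the def statement is executed,
    # and is shared by every call that does not pass its own cache.
    if n not in cache:
        cache[n] = expensive(n)
    return cache[n]


first = cached_square(12)    # computes and stores the result
second = cached_square(12)   # served from the shared default dict
```

After both calls, calls holds a single entry, so the second call never
reached expensive(). Under call-time evaluation of defaults, each call
would get a fresh empty dict and the cache would silently stop working.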

I know I'm not the only user of the idiom because I didn't invent it
-- I learned it from the Python community.  And the fact that people
have already found usages of the current default argument behavior in
the standard library is an argument against the "unPythonic" claim.

I'm reminded of GvR's post on what happened when he made strings
non-iterable in a local build (iterable strings being another "wart"
that people thought needed fixing):
http://mail.python.org/pipermail/python-3000/2006-April/000824.html

> So, are there any _other_ arguments in favour of the current semantics??

Yes.  First, consistency.  What do the three following Python
constructs have in common?

1) lambda x=foo(): None
2) (x for x in foo())
3) def bar(x=foo()):
     pass

Answer: all three evaluate foo() immediately, choosing not to defer
the evaluation to when the resulting object is invoked, even though
they all reasonably could.
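
This is easy to check with a foo() that records its own calls. The
sketch below assumes Python 3; the calls list and the two counters are
hypothetical names added for illustration.

```python
calls = []


def foo():
    calls.append("foo")
    return [1, 2, 3]


f = lambda x=foo(): None     # 1) default evaluated when the lambda is created
g = (x for x in foo())       # 2) outermost iterable evaluated when the
                             #    generator expression is created

def bar(x=foo()):            # 3) default evaluated when the def is executed
    pass


count_after_definitions = len(calls)   # foo() has already run three times

f()
bar()
count_after_calls = len(calls)         # invoking f and bar does not call foo()
```

Note that only the outermost iterable of a generator expression is
evaluated eagerly; the rest of the expression runs as the generator is
consumed.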

It's especially notable that the recently-added feature (generator
expressions) follows existing precedent.  This was not accidental, but
rather a considered design decision.  Two paragraphs from PEP 289
could apply equally well to your proposal:

| Various use cases were proposed for binding all free variables when
| the generator is defined. And some proponents felt that the resulting
| expressions would be easier to understand and debug if bound
| immediately.

| However, Python takes a late binding approach to lambda expressions
| and has no precedent for automatic, early binding. It was felt that
| introducing a new paradigm would unnecessarily introduce complexity.

In fact, the situation here is worse.  PEP 289 is arguing against
early binding of free variables as being complex.  You're not
proposing an early binding, but rather a whole new meaning of the "="
token, "save this expression for conditional evaluation later."  It's
never meant anything like that before.

Second, a tool can't fix all usages of the old idiom.  When things
break, they can break in subtle or confusing ways.  Consider my module
"greeter":

== begin greeter.py ==
import sys
def say_hi(out = sys.stdout):
 print >> out, "Hi!"
del sys # don't want to leak greeter.sys to the outside world
== end greeter.py ==

Nothing I've done here is strange or unidiomatic, and yet your
proposed change breaks it, and it's unclear how an automated tool
should fix it.  What's worse about the breakage is that nothing fails
when greeter is imported, or even when greeter.say_hi is called with
an argument.  It might be a long while before someone calls say_hi
with no argument and gets the very surprising error "global name 'sys'
is not defined".
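
The failure mode can be approximated today. The sketch below is an
assumption-laden stand-in, not the proposal itself: it is written for
Python 3, it emulates call-time evaluation by hand with the None idiom,
and it uses exec() into a dict (the hypothetical names GREETER_SOURCE
and greeter) to play the role of the module namespace.

```python
import io

GREETER_SOURCE = '''
import sys

def say_hi(out=None):
    # Hand-written stand-in for evaluating the default at call time.
    if out is None:
        out = sys.stdout
    print("Hi!", file=out)

del sys  # don't want to leak greeter.sys to the outside world
'''

greeter = {}                     # stands in for the module's globals
exec(GREETER_SOURCE, greeter)    # the "import" succeeds without complaint

buf = io.StringIO()
greeter["say_hi"](buf)           # explicit argument: works fine

try:
    greeter["say_hi"]()          # default needed, but sys has been deleted
    error = None
except NameError as exc:
    error = exc                  # the name 'sys' is no longer defined
```

The import and the first call both succeed; only the call that relies on
the default fails, and it fails far from the line that caused it.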

Third, the old idiom is less surprising.

def foo(x=None):
 if x is None:
   x=<some_expr>

<some_expr> may take arbitrarily long to complete.  It may have side
effects.  It may throw an exception.  It is evaluated inside the
function call, but only evaluated when the default value is used (or
the function is passed None).
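
A minimal sketch of that behavior in Python 3, with a hypothetical
make_default() standing in for <some_expr> and a log list that records
its side effect:

```python
log = []


def make_default():
    log.append("evaluated")   # visible side effect of the expression
    return []


def foo(x=None):
    if x is None:
        x = make_default()    # runs only when the default is needed
    x.append(1)
    return x


with_arg = foo([0])           # argument supplied: make_default() not called
log_after_arg = list(log)

first = foo()                 # default used: evaluated inside this call
second = foo()                # evaluated again, yielding a fresh list
```

Because the evaluation is spelled out in the function body, the reader
can see exactly when it happens and that each call gets its own list.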

There is nothing surprising about any of that.  Now:

def foo(x=<some_expr>):
 pass

Everything I said before applies.  The expression can take a long
time, have side effects, throw an exception.  It is conditionally
evaluated inside the function call.

Only now, all of that is terribly confusing and surprising (IMO).

Greg F