
On 1/30/07, Jan Kanis <jan.kanis@phil.uu.nl> wrote:
> On the other hand, are there really any good reasons to choose the current semantics of evaluation at definition time?
While I sympathize with the programmer who falls for this common Python gotcha, and would not have minded if Python's semantics had been different from the start (though the current behavior is cleaner and more consistent), making such a radical change to such a core part of the language semantics now is a very bad idea for many reasons.
> What I've heard basically boils down to two arguments:
> - "let's not change anything", i.e. resist change because it is change, which I don't think is a very pythonic argument.
The argument here is not "let's not change anything because it's change," but rather "let's not break large amounts of existing code without a very good reason." As has been stated here by others, making obsolete a common two-line idiom is not a compelling enough reason to do so. Helping out beginning Python programmers, while well-intentioned, doesn't feel like enough of a motivation either. Notice that the main challenge for the novice programmer is not to learn how default arguments work -- novices can learn to recognize and write the idiom easily enough -- but rather to learn how variables and objects work in general.
>>> a=b=['foo']
>>> c=d=42
>>> a+=['bar']
>>> c+=1
>>> b
['foo', 'bar']
>>> d
42
At some point in his Python career, a novice is going to have to understand why b "changed" but d didn't. Fixing the default argument "wart" doesn't remove the necessity to understand the nature of mutable objects and variable bindings in Python; it just postpones the problem. This is a fact worth keeping in mind when deciding whether the sweeping change in semantics is worth the costs.
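The same point can be checked with identity tests rather than by eye. The following is only a minimal sketch in Python 3 syntax; the function name `demo_rebinding` is made up for illustration and is not part of the original session.

```python
def demo_rebinding():
    # Two names bound to one list object, two names bound to one int object.
    a = b = ['foo']
    c = d = 42

    a += ['bar']  # list += mutates in place, so a and b still share one object
    c += 1        # ints are immutable, so c is rebound to a new object; d is untouched

    return a is b, c is d, b, d


same_list, same_int, b, d = demo_rebinding()
print(same_list, same_int, b, d)  # True False ['foo', 'bar'] 42
```

Nothing here involves default arguments, which is the point: the shared-object behavior comes from how names and mutable objects work in general.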
> - Arguments based on the assumption that people actually do make lots of use of the fact that default arguments are shared between function invocations, many of which will result in (much) more code if it has to be transformed to using one of the alternative idioms. If this is true, it is a valid argument. I guess there's still some stdlib grepping to do to decide this.
Though it's been decried here as unPythonic, I can't be the only person who uses the idiom

    def foo(..., cache={}):

for making a cache when the function in question does not rise to the level of deserving to be a class object instead. I don't apologize for finding it less ugly than using a global variable. I know I'm not the only user of the idiom because I didn't invent it -- I learned it from the Python community. And the fact that people have already found usages of the current default argument behavior in the standard library is an argument against the "unPythonic" claim.

I'm reminded of GvR's post on what happened when he made strings non-iterable in a local build (iterable strings being another "wart" that people thought needed fixing):

http://mail.python.org/pipermail/python-3000/2006-April/000824.html
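For readers who have not met the cache idiom, here is one way it might look in practice. This is a hedged sketch in Python 3 syntax: the function `fib` and its use as a memoization example are assumptions chosen for illustration, not code from the standard library or from this thread.

```python
def fib(n, cache={}):
    # The default dict is created once, when the `def` statement runs,
    # and is shared by every call that does not pass its own `cache`.
    if n in cache:
        return cache[n]
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    cache[n] = result
    return result


print(fib(30))                    # 832040
print(len(fib.__defaults__[0]))   # 31 cached entries, for n = 0..30
```

The cache lives on the function object itself, so no module-level global is needed. Under call-time evaluation of defaults, each call without an explicit `cache` would get a fresh empty dict and the memoization would silently stop working.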
> So, are there any _other_ arguments in favour of the current semantics??
Yes. First, consistency. What do the three following Python constructs have in common?

    1) lambda x=foo(): None
    2) (x for x in foo())
    3) def bar(x=foo()): pass

Answer: all three evaluate foo() immediately, choosing not to defer the evaluation to when the resulting object is invoked, even though they all reasonably could.

It's especially notable that the recently-added feature (generator expressions) follows existing precedent. This was not accidental, but rather a considered design decision. Two paragraphs from PEP 289 could apply equally well to your proposal:

| Various use cases were proposed for binding all free variables when
| the generator is defined. And some proponents felt that the resulting
| expressions would be easier to understand and debug if bound
| immediately.
|
| However, Python takes a late binding approach to lambda expressions
| and has no precedent for automatic, early binding. It was felt that
| introducing a new paradigm would unnecessarily introduce complexity.

In fact, the situation here is worse. PEP 289 is arguing against early binding of free variables as being complex. You're not proposing an early binding, but rather a whole new meaning of the "=" token, "save this expression for conditional evaluation later." It's never meant anything like that before.

Second, a tool can't fix all usages of the old idiom. When things break, they can break in subtle or confusing ways. Consider my module "greeter":

== begin greeter.py ==
import sys

def say_hi(out = sys.stdout):
    print >> out, "Hi!"

del sys # don't want to leak greeter.sys to the outside world
== end greeter.py ==

Nothing I've done here is strange or unidiomatic, and yet your proposed change breaks it, and it's unclear how an automated tool should fix it. What's worse about the breakage is that it doesn't break when greeter is imported, or even when greeter.say_hi is called with an argument. It might take a while before getting a very surprising error "global name 'sys' is not defined".
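The greeter breakage can be reproduced today by emulating the proposed semantics by hand. The sketch below is an approximation under stated assumptions: it uses Python 3 syntax rather than the Python 2 of the module above, it simulates the module with `exec` on a source string, and the `_MISSING` sentinel and the `load` helper are hypothetical names that stand in for what call-time evaluation of the default would do.

```python
import io

# Current semantics: the default is evaluated once, while `sys` is still bound.
EAGER_SOURCE = '''
import sys

def say_hi(out=sys.stdout):
    print("Hi!", file=out)

del sys
'''

# Hand-written stand-in for the proposed semantics: the default expression
# is evaluated at call time, after `sys` has already been deleted.
DEFERRED_SOURCE = '''
import sys

_MISSING = object()

def say_hi(out=_MISSING):
    if out is _MISSING:
        out = sys.stdout
    print("Hi!", file=out)

del sys
'''


def load(source):
    """Run module-like source in a fresh namespace and return its say_hi."""
    namespace = {}
    exec(source, namespace)
    return namespace["say_hi"]


eager = load(EAGER_SOURCE)
deferred = load(DEFERRED_SOURCE)

buf = io.StringIO()
eager(buf)     # works
eager()        # works: the default was captured before `del sys`
deferred(buf)  # works: passing an argument hides the problem

try:
    deferred() # fails only when the default is actually needed
except NameError as exc:
    print("deferred default failed:", exc)
```

Loading both versions succeeds, and so does every call that passes an argument. Only the bare `deferred()` call raises, which matches the delayed failure described above.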
Third, the old idiom is less surprising.

    def foo(x=None):
        if x is None:
            x=<some_expr>

<some_expr> may take arbitrarily long to complete. It may have side effects. It may throw an exception. It is evaluated inside the function call, but only evaluated when the default value is used (or the function is passed None). There is nothing surprising about any of that.

Now:

    def foo(x=<some_expr>):
        pass

Everything I said before applies. The expression can take a long time, have side effects, throw an exception. It is conditionally evaluated inside the function call. Only now, all of that is terribly confusing and surprising (IMO).

Greg F