
(my response is a bit late, I needed some time to come up with a good answer to your objections) On Tue, 30 Jan 2007 16:48:54 +0100, Greg Falcon <veloso@verylowsodium.com> wrote:
On 1/30/07, Jan Kanis <jan.kanis@phil.uu.nl> wrote:
On the other hand, are there really any good reasons to choose the current semantics of evaluation at definition time?
While I sympathize with the programmer that falls for this common Python gotcha, and would not have minded if Python's semantics were different from the start (though the current behavior is cleaner and more consistent), making such a radical change to such a core part of the language semantics now is a very bad idea for many reasons.
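The gotcha being referred to is the familiar mutable-default trap: the default expression is evaluated once, when the def statement executes, not on each call. A minimal sketch (the function name is mine, for illustration):

```python
def append_to(item, target=[]):
    # the default list is created once, when 'def' executes,
    # so every call that omits 'target' shares the same list
    target.append(item)
    return target

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2] -- the same list object is reused
```

Under the proposed call-time semantics, each call that omits `target` would instead get a fresh empty list.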
It would be a py 3.0 change. Other important stuff is going to change as well. This part of python is IMO not so central that it can't change at all. Especially since the overwhelming majority of all uses of default args have immutable values, so their behaviour isn't going to change anyway (judging by the usage in the std lib). Things like list comprehensions and generators were a much greater change to python, drastically changing the way an idiomatic python program is written. They were added in 2.x because they could be implemented backward-compatibly. With python 3.0, backward compatibility isn't so important anymore. The whole reason for python 3.0's existence is to fix things that can't be fixed backward-compatibly.
What I've heard basically boils down to two arguments: - "let's not change anything", i.e. resist change because it is change, which I don't think is a very pythonic argument.
The argument here is not "let's not change anything because it's change," but rather "let's not break large amounts of existing code without a very good reason." As has been stated here by others, making obsolete a common two-line idiom is not a compelling enough reason to do so.
py3k is going to break large amounts of code anyway, and this pep certainly won't be responsible for most of it. And there's going to be an automatic py2 -> py3 refactoring tool, which can catch any possible breakage from this pep as well.
Helping out beginning Python programmers, while well-intentioned, doesn't feel like enough of a motivation either. Notice that the main challenge for the novice programmer is not to learn how default arguments work -- novices can learn to recognize and write the idiom easily enough -- but rather to learn how variables and objects work in general. [snip] At some point in his Python career, a novice is going to have to understand why b "changed" but d didn't. Fixing the default argument "wart" doesn't remove the necessity to understand the nature of mutable objects and variable bindings in Python; it just postpones the problem. This is a fact worth keeping in mind when deciding whether the sweeping change in semantics is worth the costs.
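A minimal sketch of the kind of example snipped above (the names b and d are reused to match the text; the actual snipped example may have differed):

```python
a = [1]
b = a            # b and a are two names for the same mutable list
a.append(2)      # mutation is visible through both names
# b "changed": it now sees [1, 2]

c = 1
d = c            # d is bound to the same immutable int object
c = c + 1        # this rebinds c to a new object; d is untouched
# d didn't change: still 1
```

Understanding this distinction between mutating an object and rebinding a name is the real hurdle Greg describes, and it exists with or without the default-argument behaviour.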
The change was never intended to spare newbies from learning about python's object model. There are other ways to teach that. But keeping a 'wart' because newbies will learn from it seems like really bad reasoning, language-design wise.
- Arguments based on the assumption that people actually do make lots of use of the fact that default arguments are shared between function invocations, many of which will result in (much) more code if it has to be transformed to using one of the alternative idioms. If this is true, it is a valid argument. I guess there's still some stdlib grepping to do to decide this.
Though it's been decried here as unPythonic, I can't be the only person who uses the idiom def foo(..., cache={}): for making a cache when the function in question does not rise to the level of deserving to be a class object instead. I don't apologize for finding it less ugly than using a global variable.
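A sketch of the `cache={}` idiom being defended here (memoized Fibonacci is my choice of illustration, not from the original mail):

```python
def fib(n, cache={}):
    # 'cache' is evaluated once at def time, so the same dict
    # persists across calls -- exactly the behaviour this idiom
    # exploits, and exactly what the proposed change would break
    if n < 2:
        return n
    if n not in cache:
        cache[n] = fib(n - 1) + fib(n - 2)
    return cache[n]
```

The cache survives between calls without a global variable or a class, which is the appeal of the idiom.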
How often do you use this compared to the x=None idiom? This caching idiom is really the only idiom that's going to break. There are many ways around it; I wouldn't mind an @cache(var={}) decorator somewhere (perhaps in the stdlib). These kinds of things seem to be exactly what decorators are good at.
I know I'm not the only user of the idiom because I didn't invent it -- I learned it from the Python community. And the fact that people have already found usages of the current default argument behavior in the standard library is an argument against the "unPythonic" claim.
I'm reminded of GvR's post on what happened when he made strings non-iterable in a local build (iterable strings being another "wart" that people thought needed fixing): http://mail.python.org/pipermail/python-3000/2006-April/000824.html
In that thread, Guido is at first in favour of making strings non-iterable, one of the arguments being that it sometimes bites people who expect e.g. a list of strings and get a string. He decides not to make the change because there appear to be a number of valid use cases that are hard to change, and the number of people actually getting bitten is quite small. (To support that last part, note for example that none of the 'python problems' pages listed in the pep talk about string iteration, while all talk about default arguments, some with dire warnings and quite a bit of text.)

In the end, the numbers are going to be important. There seems to be only a single use case in favour of definition-time semantics for default values (caching), which isn't very hard to do in a different way. Though seasoned python programmers don't get bitten by default args all the time, they have to work around them all the time using =None. If it turns out that people are actually using caching and other idioms that require definition-time semantics all the time, and the =None idiom is used only very rarely, I'd be all in favour of rejecting this pep.
So, are there any _other_ arguments in favour of the current semantics??
Yes. First, consistency.
[factoring out the first argument into another email. It's taking me some effort to get my head around the early/late binding part of the generator expressions pep, and the way you find an argument in that. As far as I understand it currently, either you or I do not understand that part of the pep correctly. I'll try to get this mail out somewhere tomorrow]
Second, a tool can't fix all usages of the old idiom. When things break, they can break in subtle or confusing ways. Consider my module "greeter":
== begin greeter.py ==
import sys

def say_hi(out = sys.stdout):
    print >> out, "Hi!"

del sys  # don't want to leak greeter.sys to the outside world
== end greeter.py ==
Nothing I've done here is strange or unidiomatic, and yet your proposed change breaks it, and it's unclear how an automated tool should fix it.
Sure this can be fixed by a tool:

    import sys

    @caching(out = sys.stdout)
    def say_hi(out):
        print >> out, "Hi!"

    del sys

where the 'caching' wrapper checks whether an argument for 'out' is provided, and provides it itself otherwise. The caching(out = sys.stdout) is actually a function _call_, so its sys.stdout gets evaluated immediately. A possible implementation of caching:

    def caching(**cachevars):
        def inner(func):
            def wrapper(**argdict):
                for var in cachevars:
                    if var not in argdict:
                        argdict[var] = cachevars[var]
                return func(**argdict)
            return wrapper
        return inner

Defining a decorator unfortunately requires three levels of nested functions, but apart from that the thing is pretty straightforward, and it only needs to be defined once to use on every occurrence of the caching idiom. It doesn't currently handle positional arguments, but that can be added.
What's worse about the breakage is that it doesn't break when greeter is imported,
That's true of any function with a bug in it. Do you want to abandon functions altogether?
or even when greeter.say_hi is called with an argument.
Currently, for people using the x=None idiom, the "if x is None: <calculate default value>" check is a branch in the code. That's why you need to test _all_ possible branches in your unit tests. Analogously, you need to test all combinations of arguments if you want to catch as many bugs as possible.
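The branch-coverage point can be sketched like this (`greet` is a made-up example):

```python
def greet(name=None):
    if name is None:           # this branch runs only when the default is used
        name = "world"
    return "Hi, %s!" % name

# a thorough test suite must exercise both calling conventions:
assert greet() == "Hi, world!"        # default branch
assert greet("Greg") == "Hi, Greg!"   # explicit-argument branch
```

With the =None idiom the default computation is ordinary code inside the function, so any testing discipline that covers all branches already covers it.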
It might take a while before getting a very surprising error "global name 'sys' is not defined".
However, your greeter module actually has a slight bug. What if I do this:

    import sys, greeter
    sys.stdout = my_output_proxy()
    greeter.say_hi()

Now say_hi() still uses the old sys.stdout, which is most likely not what you want. If greeter were implemented like this:

    import sys as _sys

    def say_hi(out = _sys.stdout):
        print >> out, "Hi!"

then under the proposed semantics it would all by itself do a late binding of _sys.stdout, so when I change sys.stdout somewhere else, say_hi uses the new stdout. Deleting sys in order not to 'leak' it to any other module is really not useful. Everybody knows that python does not actually enforce encapsulation, nor does it provide any kind of security barrier between modules. So if some other module wants to get at sys, it can get there anyway; and if you want to indicate that sys isn't exported and greeter's sys shouldn't be messed around with, the renaming import above does that just fine.
Third, the old idiom is less surprising.
    def foo(x=None):
        if x is None:
            x = <some_expr>
<some_expr> may take arbitrarily long to complete. It may have side effects. It may throw an exception. It is evaluated inside the function call, but only evaluated when the default value is used (or the function is passed None).
There is nothing surprising about any of that. Now:
def foo(x=<some_expr>): pass
Everything I said before applies. The expression can take a long time, have side effects, throw an exception. It is conditionally evaluated inside the function call.
Only now, all of that is terribly confusing and surprising (IMO).
Read the "what's new in python 3.0" (assuming the pep gets incorporated, of course). Exception tracebacks and profiler stats will point you at the right line, and you will figure it out. As you said above, all of this is already true under the current =None idiom, so there are no totally new ways in which a program can break. If you know the ways current python can break (take too long, unwanted side effects, exceptions), you will figure it out in the new version. Anyway, many python newbies consider it confusing and surprising that an empty-list default value doesn't stay empty, and all other pythoneers have to work around it a lot of the time. It will be a pretty unique python programmer whose program breaks in the ways mentioned above because the default expression is evaluated at call time, and wouldn't have broken under python's current behaviour, and who isn't able to figure out what happened in a reasonable amount of time. So even if your argument holds, it will still be a net win to accept the pep.
Greg F
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
http://mail.python.org/mailman/listinfo/python-ideas
- Jan