[Python-ideas] fixing mutable default argument values
Jan Kanis
jan.kanis at phil.uu.nl
Thu Feb 1 04:33:09 CET 2007
(my response is a bit late, I needed some time to come up with a good
answer to your objections)
On Tue, 30 Jan 2007 16:48:54 +0100, Greg Falcon <veloso at verylowsodium.com>
wrote:
> On 1/30/07, Jan Kanis <jan.kanis at phil.uu.nl> wrote:
>> On the other hand, are there really any good reasons to choose the
>> current
>> semantics of evaluation at definition time?
>
> While I sympathize with the programmer that falls for this common
> Python gotcha, and would not have minded if Python's semantics were
> different from the start (though the current behavior is cleaner and
> more consistent), making such a radical change to such a core part of
> the language semantics now is a very bad idea for many reasons.
It would be a py 3.0 change. Other important stuff is going to change as
well. This part of python is IMO not that much part of the core that it
can't change at all. Especially since the overwhelming majority of all
uses of default args have immutable values, so their behaviour isn't going
to change anyway. (judging by the usage in the std lib.)
Things like list comprehensions and generators were a much greater change
to python, drastically changing the way an idiomatic python program is
written. They were added in 2.x because they could be implemented in a
backward compatible way. With python 3.0, backward compatibility isn't so
important anymore. The whole reason for python 3.0's existence is to fix
backward incompatible stuff.
>> What I've heard basically
>> boils down to two arguments:
>> - "let's not change anything", i.e. resist change because it is change,
>> which I don't think is a very pythonic argument.
>
> The argument here is not "let's not change anything because it's
> change," but rather "let's not break large amounts of existing code
> without a very good reason." As has been stated here by others,
> making obsolete a common two-line idiom is not a compelling enough
> reason to do so.
py3k is going to break large amounts of code anyway. This pep certainly
won't cause most of that breakage. And there's going to be an automatic
py2 -> py3 refactoring tool, which can catch any possible breakage from
this pep as well.
> Helping out beginning Python programmers, while well-intentioned,
> doesn't feel like enough of a motivation either. Notice that the main
> challenge for the novice programmer is not to learn how default
> arguments work -- novices can learn to recognize and write the idiom
> easily enough -- but rather to learn how variables and objects work in
> general.
[snip]
> At some point in his Python career, a novice is going to have to
> understand why b "changed" but d didn't. Fixing the default argument
> "wart" doesn't remove the necessity to understand the nature of
> mutable objects and variable bindings in Python; it just postpones the
> problem. This is a fact worth keeping in mind when deciding whether
> the sweeping change in semantics is worth the costs.
The change was never intended to prevent newbies from learning about
python's object model. There are other ways to do that. But keeping a
'wart' because newbies will learn from it seems like really bad reasoning,
language-design wise.
>> - Arguments based on the assumption that people actually do make lots of
>> use of the fact that default arguments are shared between function
>> invocations, many of which will result in (much) more code if it has to
>> be
>> transformed to using one of the alternative idioms. If this is true, it
>> is
>> a valid argument. I guess there's still some stdlib grepping to do to
>> decide this.
>
> Though it's been decried here as unPythonic, I can't be the only
> person who uses the idiom
> def foo(..., cache={}):
> for making a cache when the function in question does not rise to the
> level of deserving to be a class object instead. I don't apologize
> for finding it less ugly than using a global variable.
How often do you use this compared to the x=None idiom?
This is really the only idiom that's going to break.
There are many ways around it; I wouldn't mind an @cache(var={}) decorator
somewhere (perhaps in the stdlib). This kind of thing seems to be exactly
what decorators are good at.
> I know I'm not the only user of the idiom because I didn't invent it
> -- I learned it from the Python community. And the fact that people
> have already found usages of the current default argument behavior in
> the standard library is an argument against the "unPythonic" claim.
>
> I'm reminded of GvR's post on what happened when he made strings
> non-iterable in a local build (iterable strings being another "wart"
> that people thought needed fixing):
> http://mail.python.org/pipermail/python-3000/2006-April/000824.html
In that thread, Guido is at first in favour of making strings
non-iterable, one of the arguments being that it sometimes bites people
who expect e.g. a list of strings and get a string. He decides not to make
the change because there appear to be a number of valid use cases that are
hard to change, and the number of people actually getting bitten by it is
actually quite small. (To support that last part, note for example that
none of the 'python problems' pages listed in the pep talk about string
iteration while all talk about default arguments, some with dire warnings
and quite a bit of text.)
In the end, the numbers are going to be important. There seems to be only
a single use case in favour of definition time semantics for default
arguments (caching), which isn't very hard to do in a different way.
Though seasoned python programmers don't get bitten by default args all
the time, they have to work around it all the time using =None.
If it turns out that people are actually using caching and other idioms
that require definition time semantics all the time, and the =None idiom
is used only very rarely, I'd be all in favour of rejecting this pep.
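
For concreteness, here is a minimal sketch of the two behaviours being
weighed, written in Python 3 syntax. The function names are made up for
illustration and do not come from the pep or the stdlib.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, at definition time,
    # so every call that omits `bucket` mutates the same object.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # The =None idiom: build a new list on each call instead.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second, second)            # True [1, 2]

print(append_fresh(1), append_fresh(2))   # [1] [2]
```

The first function is the behaviour the caching idiom relies on; the second
is the workaround that would become unnecessary under call-time evaluation.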
>
>> So, are there any _other_ arguments in favour of the current semantics??
>
> Yes. First, consistency.
[factoring out the first argument into another email. It's taking me some
effort to get my head around the early/late binding part of the generator
expressions pep, and the way you find an argument in that. As far as I
understand it currently, either you or I do not understand that part of
the pep correctly. I'll try to get this mail out sometime tomorrow]
> Second, a tool can't fix all usages of the old idiom. When things
> break, they can break in subtle or confusing ways. Consider my module
> "greeter":
>
> == begin greeter.py ==
> import sys
> def say_hi(out = sys.stdout):
>     print >> out, "Hi!"
> del sys # don't want to leak greeter.sys to the outside world
> == end greeter.py ==
>
> Nothing I've done here is strange or unidiomatic, and yet your
> proposed change breaks it, and it's unclear how an automated tool
> should fix it.
Sure this can be fixed by a tool:
    import sys
    @caching(out = sys.stdout)
    def say_hi(out):
        print >> out, "Hi!"
    del sys
where the function with the 'caching' wrapper checks to see if an argument
for 'out' is provided, or else provides it itself. The caching(out =
sys.stdout) is actually a function _call_, so its sys.stdout argument gets
evaluated immediately.
A possible implementation of caching:
    def caching(**cachevars):
        def inner(func):
            def wrapper(**argdict):
                for var in cachevars:
                    if var not in argdict:
                        argdict[var] = cachevars[var]
                return func(**argdict)
            return wrapper
        return inner
Defining a decorator unfortunately requires three levels of nested
functions, but apart from that the thing is pretty straightforward, and it
only needs to be defined once to use on every occasion of the caching
idiom.
It doesn't currently handle positional vars, but that can be added.
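
One way positional arguments might be handled is sketched below, in Python 3
syntax. It relies on inspect.signature and Signature.bind_partial, which did
not exist when the decorator above was written, so treat it as an
illustration of the idea rather than the implementation meant here. The
count function and its seen argument are hypothetical.

```python
import functools
import inspect


def caching(**cachevars):
    """Supply definition-time values for arguments the caller omits."""
    def inner(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # bind_partial maps positional and keyword arguments to
            # parameter names without requiring all of them to be given.
            bound = sig.bind_partial(*args, **kwargs)
            for name, value in cachevars.items():
                if name not in bound.arguments:
                    bound.arguments[name] = value
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return inner


@caching(seen={})
def count(key, seen):
    # `seen` is shared between calls unless the caller passes its own dict.
    seen[key] = seen.get(key, 0) + 1
    return seen[key]


print(count("a"), count("a"), count("b"))   # 1 2 1
print(count("a", {}))                       # 1
```

The dict passed to caching() is evaluated once, when the decorator is
called, so it plays the same role as a definition-time default.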
> What's worse about the breakage is that it doesn't
> break when greeter is imported,
That's true of any function with a bug in it. Do you want to abandon
functions altogether?
> or even when greeter.say_hi is called
> with an argument.
Currently, for people using the x=None idiom (if x is None: <calculate
default value>), this difference is a branch in the code. That's why you
need to test _all_ possible branches in your unit tests. Analogously, you
need to test all combinations of arguments if you want to catch as many
bugs as possible.
> It might take a while before getting a very
> surprising error "global name 'sys' is not defined".
However, your greeter module actually has a slight bug. What if I do this:
    import sys, greeter
    sys.stdout = my_output_proxy()
    greeter.say_hi()
Now say_hi() still uses the old sys.stdout, which is most likely not what
you want. If greeter were implemented like this:
    import sys as _sys
    def say_hi(out = _sys.stdout):
        print >> out, "Hi!"
under the proposed semantics, it would all by itself do a late binding of
_sys.stdout, so when I change sys.stdout somewhere else, say_hi uses the
new stdout.
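
The difference can be seen in current python with a small sketch (Python 3
syntax). The proposed call-time semantics are only approximated here with
the =None idiom, and the names say_hi_early and say_hi_late are invented for
the comparison.

```python
import contextlib
import io
import sys


def say_hi_early(out=sys.stdout):
    # `out` was bound to whatever sys.stdout was when `def` ran.
    print("Hi!", file=out)


def say_hi_late(out=None):
    # Looking sys.stdout up at call time approximates what the
    # proposal would do implicitly for `out=sys.stdout`.
    if out is None:
        out = sys.stdout
    print("Hi!", file=out)


proxy = io.StringIO()
with contextlib.redirect_stdout(proxy):   # temporarily rebinds sys.stdout
    say_hi_early()   # still writes to the original stream
    say_hi_late()    # writes to the proxy

captured = proxy.getvalue()
print(repr(captured))   # 'Hi!\n'
```

Only the late-binding version ends up in the proxy; the early-binding one
keeps writing to the stream that was current at definition time.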
Deleting sys in order not to 'leak' it to any other module is really not
useful. Everybody knows that python does not actually enforce
encapsulation, nor does it provide any kind of security barriers between
modules. So if some other module wants to get at sys it can get there
anyway, and if you want to indicate that sys isn't exported and greeter's
sys shouldn't be messed around with, the renaming import above does that
just fine.
> Third, the old idiom is less surprising.
>
> def foo(x=None):
>     if x is None:
>         x=<some_expr>
>
> <some_expr> may take arbitrarily long to complete. It may have side
> effects. It may throw an exception. It is evaluated inside the
> function call, but only evaluated when the default value is used (or
> the function is passed None).
>
> There is nothing surprising about any of that. Now:
>
> def foo(x=<some_expr>):
>     pass
>
> Everything I said before applies. The expression can take a long
> time, have side effects, throw an exception. It is conditionally
> evaluated inside the function call.
>
> Only now, all of that is terribly confusing and surprising (IMO).
Read the "what's new in python 3.0" (assuming the pep gets incorporated,
of course).
Exception tracebacks and profiler stats will point you at the right line,
and you will figure it out. As you said above, all of it is true under the
current =None idiom, so there are no totally new ways in which a program
can break. If you know the ways current python can break (take too long,
unwanted side effects, exceptions) you will figure it out in the new
version.
Anyway, many python newbies consider it confusing and surprising that an
empty list default value doesn't stay empty, and all other pythoneers have
to work around it much of the time. It will be a pretty unique python
programmer whose program breaks in the ways mentioned above because the
default expression is evaluated at call time, whose program wouldn't have
broken under python's current behaviour, and who isn't able to figure out
what happened in a reasonable amount of time. So even if your argument
holds, it will still be a net win to accept the pep.
>
>
> Greg F
- Jan