On Sun, May 10, 2009 at 11:23 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, 10 May 2009 10:19:01 am Tennessee Leeuwenburg wrote:
> Hi Pascal,
> Taking the example of
>
> def foo(bar = []):
>   bar.append(4)
>   print(bar)
>
> I'm totally with you in thinking that what is 'natural' is to expect
> to get a new, empty, list every time.

That's not natural to me. I would be really, really surprised by the
behaviour you claim is "natural":

>>> DEFAULT = 3
>>> def func(a=DEFAULT):
...     return a+1
...
>>> func()
4
>>> DEFAULT = 7
>>> func()
8

Good example! If I may translate that back into the example using a list to make sure I've got it right...

default = []

def func(a=default):
   a.append(5)

func()
func()

default will now be [5,5]
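As a quick runnable check of that translation (same names as the example above):

```python
default = []

def func(a=default):
    a.append(5)   # mutates the one list that was bound at definition time

func()
func()
print(default)  # -> [5, 5]
```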
 
For deterministic functions, the same argument list should return the
same result each time. By having default arguments be evaluated every
time they are required, any function with a default argument becomes
non-deterministic. Late evaluation of defaults is, essentially,
equivalent to making the default value a global variable. Global
variables are rightly Considered Harmful: they should be used with
care, if at all.


If I can just expand on that point somewhat...

In the example I gave originally, I had in mind someone designing a function, whereby it could be called either with some pre-initialised term, or otherwise it would use a default value of []. I imagined a surprised designer finding that the default value of [] was a pointer to a specific list, rather than a new empty list each time.

e.g.

def foo(bar = []):
    bar.append(5)
    return bar

The same argument list (i.e. no arguments) would result in a different result being returned every time. On the first call, bar would be [5], then [5,5], then [5,5,5]; yet the arguments passed (i.e. none, use default) would not have changed.
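Spelled out as a runnable snippet, the surprising accumulation looks like this (this is current Python behaviour, not the proposal):

```python
def foo(bar=[]):
    # `bar` defaults to a single shared list, created when `def` executes.
    bar.append(5)
    return bar

print(foo())  # -> [5]
print(foo())  # -> [5, 5]
print(foo())  # -> [5, 5, 5]
```

Note that all three calls return the *same* list object, so the earlier return values also grow as later calls append to it.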

You have come up with another example. I think it is designed to illustrate that a default argument doesn't need to specify a default value for something, but could be a default reference (such as a relatively-global variable). In that case, it is modifying something above its scope. To me, that is what you would expect under both "ways of doing things". I wonder if I am missing your point...

I'm totally with you on the Global Variables Are Bad principle, however. I don't design them in myself, and where I have worked with them, usually they have just caused confusion.

> However this isn't what
> happens. As far as I'm concerned, that should more or less be the end
> of the discussion in terms of what should ideally happen.

As far as I'm concerned, what Python does now is the ideal behaviour.
Default arguments are part of the function *definition*, not part of
the body of the function. The definition of the function happens
*once* -- the function isn't recreated each time you call it, so
default values shouldn't be recreated either.

I agree that's how you see things, and possibly how many people see things, but I don't accept that it is a more natural way of seeing things. However, what *I* think is more natural is just one person's viewpoint... I totally see the philosophical distinction you are trying to draw, and it certainly does help to clarify why things are the way they are. However, I just don't know that it's the best way they could be.

> The responses to the change in behaviour which I see as more natural
> are, to summarise, as follows:
>   -- For all sorts of technical reasons, it's too hard
>   -- It changes the semantics of the function definition being
> evaluated at compile time
>   -- It's not what people are used to

And it's not what many people want.

You only see the people who complain about this feature. For the
multitude of people who expect it or like it, they have no reason to
say anything (except in response to complaints). When was the last time
you saw somebody write to the list to say "Gosh, I really love that
Python uses + for addition"? Features that *just work* never or rarely
get mentioned.

:) ... well, that's basically true. Of course there are some particular aspects of Python which are frequently mentioned as being wonderful, but I see your point.  However, I'm not sure we really know one way or another about what people want then -- either way.
 
> With regards to the second point, it's not like the value of
> arguments is set at compile time, so I don't really see that this
> stands up.

I don't see what relevance that has. If the arguments are provided at
runtime, then the default value doesn't get used.

I think this is the fundamental difference -- to me this speaks worlds :) ... I think you just have a different internal analogy for programming than I do. That's fine. I don't see why a line of code shouldn't be dynamically evaluated just because it's part of the definition, and I don't see why default values shouldn't be (or be able to be) dynamically evaluated. Personally I think that doing it all the time is more natural, but I certainly don't see why allowing the syntax would be bad. I'd basically do that 100% of the time. I'm not sure I've ever used a default value other than None in a way which I wouldn't want dynamically evaluated.
 
> I don't think it's intuitive,

Why do you think that intuitiveness is more valuable than performance
and consistency?

Because I like Python more than C? I'm pretty sure everyone here would agree that in principle, elegance of design and intuitive syntax are good. Agreeing on what that means might involve some robust discussion, but I think everyone would like the same thing. Well, consistency is pretty hard to do without... :)
 
Besides, intuitiveness is a fickle thing. Given this pair of functions:

def expensive_calculation():
   time.sleep(60)
   return 1

def useful_function(x=expensive_calculation()):
   return x + 1

I think people would be VERY surprised that calling useful_function()
with no arguments would take a minute *every time*, and would complain
that this slowness was "unintuitive".
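The once-only evaluation can be seen directly by counting calls instead of sleeping (a small sketch; the counter is mine, not part of the original example):

```python
calls = 0

def expensive_calculation():
    # Stand-in for the slow computation: count invocations instead of sleeping.
    global calls
    calls += 1
    return 1

def useful_function(x=expensive_calculation()):
    return x + 1

# expensive_calculation ran exactly once, when `def useful_function` executed.
print(calls)   # -> 1
useful_function()
useful_function()
print(calls)   # -> still 1
```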

That seems at first like a good point. It is a good point, but I don't happen to side with you on this issue, although I do see that many people might. The code that I write is not essentially performance-bound. It's a lot more design-bound (by which I mean it's very complicated, and anything I can do to simplify it is well-worth a bit of a performance hit).

However, when the design options are available (setting aside what default behaviour should be), it's almost always possible to design things how you'd like them.

e.g.

def speed_critical_function(x=None):
    if x is None:
        time.sleep(60)
    return 1

def handy_simple_function(foo=5, x=[]):   # or, under the proposed syntax, perhaps: (foo=5, x=new [])
    for i in range(5):
        x.append(i)
    return x

Then, thinking about it a little more (and bringing back a discussion of default behaviour), I don't really see why the implementation of the dynamic function definition would be any slower than using None to indicate it wasn't passed in, followed by explicit default-value setting.

 
> it's just that people become
> accustomed to it. There is indeed, *some sense* in understanding that
> the evaluation occurs at compile-time, but there is also a lot of
> sense (and in my opinion, more sense) in understanding the evaluation
> as happening dynamically when the function is called.

No. The body of the function is executed each time the function is
called. The definition of the function is executed *once*, at compile
time. Default arguments are part of the definition, not the body, so
they too should only be executed once. If you want them executed every
time, put them in the body:

def useful_function(x=SENTINEL):
   if x is SENTINEL:
       x = expensive_calculation()
   return x+1
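For completeness, here is a self-contained version of that work-around, using a private object() as the sentinel (my choice of name; object() is handy when None might be a legitimate argument):

```python
_SENTINEL = object()  # unique marker; no caller can accidentally pass it

def expensive_calculation():
    return 1   # imagine time.sleep(60) here

def useful_function(x=_SENTINEL):
    # The "default" is now computed in the body, once per call.
    if x is _SENTINEL:
        x = expensive_calculation()
    return x + 1

print(useful_function())    # -> 2  (default recomputed on this call)
print(useful_function(10))  # -> 11 (a real argument never matches the sentinel)
```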

I agree that's how things *are* done, but I just don't see why it should be that way, beyond it being what people are used to. It seems like there is no reason why it would be difficult to implement CrazyPython which does things as I suggest. Given that, it also doesn't seem like there is some inherent reason to prefer the design style of RealActualPython over CrazyPython. Except, of course that RealActualPython exists and I can use it right now (thanks developers!), versus CrazyPython which is just an idea.


> With regards to the first point, I'm not sure that this is as
> significant as all of that, although of course I defer to the
> language authors here. However, it seems as though it could be no
> more costly than the lines of code which most frequently follow to
> initialise these variables.
>
> On the final point, that's only true for some people. For a whole lot
> of people, they stumble over it and get it wrong. It's one of the
> most un-Pythonic things which I have to remember about Python when
> programming -- a real gotcha.


I accept that it is a Gotcha. The trouble is, the alternative behaviour
you propose is *also* a Gotcha, but it's a worse Gotcha, because it
leads to degraded performance, surprising introduction of global
variables where no global variables were expected, and a breakdown of
the neat distinction between creating a function and executing a
function.

But as for it being un-Pythonic, I'm afraid that if you really think
that, your understanding of Pythonic is weak. From the Zen:

The Zen of Python, by Tim Peters

Special cases aren't special enough to break the rules.
Although practicality beats purity.
If the implementation is hard to explain, it's a bad idea.

(1) Assignments outside of the body of a function happen once, at
compile time. Default values are outside the body of the function. You
want a special case for default values so that they too happen at
runtime. That's not special enough to warrant breaking the rules.

Your logic is impeccable :) ... yet, if I may continue to push my wheelbarrow uphill for a moment longer, I would argue that is an implementation detail, not a piece of design philosophy.
 
(2) The potential performance degradation of re-evaluating default
arguments at runtime is great. For practical reasons, it's best to
evaluate them once only.

Maybe that's true. I guess I have two things to say on that point. The first is that I'm still not sure that's really true in a problematic way. Anyone wanting efficiency could continue to use sentinel values of None (which obviously don't need to be dynamically evaluated) while other cases would surely be no slower than the initialisation code would be anyway. Is the cost issue really that big a problem?

The other is that while pragmatics is, of course, a critical issue, it's also true that it's well-worth implementing more elegant language features if possible. It's always a balance. The fastest languages are always less 'natural', while the more elegant and higher-level languages are somewhat slower. Where a genuine design improvement is found, I think it's worth genuinely considering including that improvement, even if it is not completely pragmatic.
 
(3) In order to get the behaviour you want, the Python compiler would
need a more complicated implementation which would be hard to explain.

Yes, that's almost certainly true.
 
> I don't see it as changing one way of
> doing things for another equally valid way of doing things, but
> changing something that's confusing and unexpected for something
> which is far more natural and, to me, Pythonic.

I'm sorry, while re-evaluation of default arguments is sometimes useful,
it's more often NOT useful. Most default arguments are simple objects
like small ints or None. What benefit do you gain from re-evaluating
them every single time? Zero benefit. (Not much cost either, for simple
cases, but no benefit.)

But for more complex cases, there is great benefit to evaluating default
arguments once only, and an easy work-around for those rare cases that
you do want re-evaluation.

Small ints and None are global pointers (presumably!) so there is no need to re-evaluate them every time. The list example is particularly relevant (ditto empty dictionary) since I think that would be one of the most common cases for re-evaluation. Presumably a reasonably efficient implementation could be worked out such that dynamic evaluation of the default arguments (and indeed the entire function definition) need only occur where a dynamic default value were included.
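One way to see why simple defaults are harmless while mutable ones bite: the evaluated defaults are stored on the function object itself (as func.__defaults__), so an immutable default can never change, while a list default can. A small illustrative sketch (names mine):

```python
def f(n=3, items=[]):
    items.append(n)
    return items

print(f.__defaults__)   # -> (3, [])      both defaults evaluated once, at def time
f()
f()
print(f.__defaults__)   # -> (3, [3, 3])  the int is untouched, the list has mutated
```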

I agree that the workaround is not that big a deal once you're fully accustomed to How Things Work, but it just seems 'nicer' to allow dynamic defaults. That's all I really wanted to say in the first instance, I didn't think that position would really get anyone's back up.
 
Regards,
-Tennessee