[Python-ideas] fixing mutable default argument values

Sat Jan 27 01:36:22 CET 2007

On Thu, 25 Jan 2007 15:41:54 +0100, Jim Jewett <jimjjewett at gmail.com>  
wrote:

> On 1/24/07, Jan Kanis <jan.kanis at phil.uu.nl> wrote:
>> I don't like new syntax for something like this, but I think the default
>> argument values can be fixed with semantic changes (which should not break
>> the most common current uses):
>>
>> What I think should happen is compile a function like this
>>
>> def popo(x=[]):
>>      x.append(666)
>>      print x
>>
>> as if it had read
>>
>> def popo(x=__default_argument_marker__):
>>      if x == __default_argument_marker__:
>>          x = []
>>      x.append(666)
>>      print x
>
> How is this different from the x=None idiom of today?
>
>     def f(inlist=None):
>         if inlist is None:
>             inlist=[]
>

The __default_argument_marker__ is not really a part of my proposal. You  
can replace it with None everywhere if you want to. The reason I used it  
is because using None can clash when the caller passes None in as an explicit
value, like this:

def foo(x, y=None):
   y = getAppropriateDefaultValue() if y is None else y
   x.insert(y)

foo(bar, None)

Now if you want to have foo do bar.insert(None), calling foo(bar, None)  
won't work.
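
For illustration, the marker idea can be approximated in today's Python with a private sentinel object. This is only a sketch under assumptions: the names _MISSING and get_appropriate_default_value are hypothetical, and a plain list with append() stands in for bar and its insert method.

```python
# Hypothetical sentinel: a unique object that no caller passes by accident.
_MISSING = object()

def get_appropriate_default_value():
    # Stand-in for whatever default the function would really compute.
    return 0

def foo(x, y=_MISSING):
    # Identity check: only the sentinel itself means "argument omitted".
    if y is _MISSING:
        y = get_appropriate_default_value()
    x.append(y)

bar = []
foo(bar)        # argument omitted: the computed default is appended
foo(bar, None)  # explicit None is kept as a real value
print(bar)      # [0, None]
```

With this pattern foo(bar, None) really does store None, at the cost of one extra module-level name.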

However, I guess the risk of running into such a case in real code is
negligible, and my urge to write correct code won out over writing
understandable code. Just pretend I used None everywhere (and that the
compiler can magically distinguish between a default-argument None and a
caller-provided None, if you wish).

> The if (either 2 lines or against PEP8) is a bit ugly, but Calvin
> pointed out that you can now write it as
>
>     def f(inlist=None):
>         inlist = inlist if (inlist is not None) else []
>
> I see  below that you give it slightly different semantics, but I'm
> not entirely sure how to tell when those different semantics should
> apply (always?  when the variable name is marked with __*__?  When a
> specific non-None singleton appears?), or why you would ever want
> them.

Please just ignore the __default_argument_marker__ thing. I hope we agree  
that the problem we're trying to solve is that while

def f(inlist=None):
   inlist = inlist if (inlist is not None) else []

works in the current python, it's non-intuitive and ugly, and it would be  
nice to have python do 'the right thing' if we can find a nice way to make  
it do that.

Oh, and the changed semantics would always be used.

>> When you use variables as default value instead of literals, I think most
>> of the time you intend to have the function do something to the same
>> object the variable is bound to, instead of the function creating its own
>> copy every time it's called. This behaviour still works with these
>> semantics:
>
>> >>> a = []
>> >>> def foo(x=[[],a]):
>> ...     x[0].append(123)
>> ...     x[1].append(123)
>> ...     print x
>> ...
>> >>> foo()
>> [[123], [123]]
>> >>> foo()
>> [[123], [123, 123]]
>> >>> foo()
>> [[123], [123, 123, 123]]
>
> So you're saying that x[1] should be persistent because it (also) has
> a name (as 'a'), but x[0] should be recreated fresh on each call
> because it doesn't?

I think what python does currently can be improved upon. I think that if  
someone defines a function like def f(x=[]): ... he'll most likely want x  
to be an empty list every time the function is called. But just having  
python do an automagic copy(x) or deepcopy(x) is not going to work, because
it can be expensive, isn't always necessary, and is sometimes plainly
impossible, e.g.: def f(x=sys.stdout): ...
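
For reference, the current behaviour that motivates this discussion can be reproduced with a short sketch. It uses Python 3 syntax, so the function returns the list instead of printing it. The variable names first and second are only for illustration.

```python
def f(x=[]):
    # The list is created once, when the 'def' statement runs,
    # and the same object is reused on every call that omits x.
    x.append(666)
    return x

first = f()
second = f()

print(second)            # [666, 666]
print(first is second)   # True: both calls returned the same list
print(f.__defaults__)    # ([666, 666],)
```

The stored default lives in f.__defaults__, which is why mutations made in one call are visible in the next.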

So, sometimes we want a new thing on every call, and sometimes we don't.  
And sometimes we want to specify the default value with a literal, and  
sometimes with a variable. My assumption is that these two differences  
coincide most of the time. I also think my approach is more intuitive to  
people new to python and not familiar with this quirk/wart.

You make it sound as if doing something different with a named variable  
and a literal is strange, but this is exactly what happens in every normal  
python expression:

  >>> a = []
  >>> b = [[], a]
  >>> id(a)
  12976416
  >>> id(b[0])
  12994640
  >>> id(b[1])
  12976416
  >>> # let's execute the b = ... statement again (comparable to 'call a function again')
  >>> b = [[], a]
  >>> id(b[0])
  12934800
  >>> id(b[1])
  12976416

b[0] gets recreated, while b[1] is not.

So, I think my solution of evaluating the default argument on every call  
and letting any variables in the expression be closure variables  
accomplishes everything we want:
* most of the time python does 'the right thing' wrt copying or not
copying, and does this in a way that is very regular wrt the rest of the
language (and therefore easy to learn)
* if we don't want a fresh object where python would create one, that can
easily be accomplished by first creating the object and then using the
variable name that references the object as default
* if we do want a copy, just do an explicit copy.copy(x).
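
The proposed semantics can be roughly emulated in current Python to see how they would behave. The sketch below is not the proposal itself, which would change the compiler. It assumes a hypothetical Fresh wrapper and reevaluate_defaults decorator, and it spells the default expression as a lambda so that it can be re-run on each call.

```python
import functools
import inspect

class Fresh:
    """Hypothetical marker: wraps a zero-argument callable that builds a default."""
    def __init__(self, thunk):
        self.thunk = thunk

def reevaluate_defaults(func):
    """Re-run every Fresh(...) default each time that argument is omitted."""
    sig = inspect.signature(func)
    thunks = {
        name: param.default.thunk
        for name, param in sig.parameters.items()
        if isinstance(param.default, Fresh)
    }

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Find out which parameters the caller actually supplied.
        supplied = sig.bind_partial(*args, **kwargs).arguments
        for name, thunk in thunks.items():
            if name not in supplied:
                kwargs[name] = thunk()  # evaluate the default expression now
        return func(*args, **kwargs)

    return wrapper

a = []

@reevaluate_defaults
def foo(x=Fresh(lambda: [[], a])):
    x[0].append(123)
    x[1].append(123)
    return x

print(foo())  # [[123], [123]]
print(foo())  # [[123], [123, 123]]
print(foo())  # [[123], [123, 123, 123]]
```

The literal [] inside the default expression is rebuilt on every call, while the name a keeps referring to the same list, which matches the session quoted earlier.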

On Fri, 26 Jan 2007 04:36:33 +0100, Chris Rebert <cvrebert at gmail.com>  
wrote:

> So, basically the same as my proposal except without syntax changes and  
> with the <Foo()> default argument value semantics applied to all  
> arguments. I'm okay with this, however some possible performance issues  
> might exist with re-evaluating expensive default arg vals on every call  
> where they're required. This is basically why my proposal required new  
> syntax, so that people could use the old "eval once at definition-time"  
> semantics on expensive default values to get better performance.
> [snip]
> If the performance issues aren't significant, I'm all for your proposal.  
>   It'd be nice to not to have to add new syntax.

Well, I wasn't thinking about it as basically the same as your proposal,  
but thinking about it again I think it is. (I was thinking more along the  
lines of having this do what looks intuitive to me, by applying the normal  
python language rules in the IMO 'right' way.)

On performance issues:

If you're using the x=None ... if x==None: x = [] trick the object gets  
evaluated and recreated on every call anyway, so there's no change.
If you aren't using a literal as default, nothing gets re-evaluated, so no  
problem either.
The only time it could become a problem is with code like this:

def foo(x=createExpensiveThing()):
	return x.bar()

If the function modifies x, you'll probably want to use a fresh x every
call anyway (using x=None). If you do want the x to be persistent, chances
are you already have a reference to it somewhere, but if you don't you'll
have to create one. The reference doesn't get re-evaluated, so there's no
performance issue, only possibly a namespace clutter issue.
If the function doesn't modify x it may give rise to a performance issue,
but that can easily be solved by creating the default before defining the
function and using the variable. Rejecting this proposal because of
this seems like premature optimisation to me.
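
The difference between the two cases can be sketched by counting constructions. Everything here is illustrative: create_expensive_thing, the counter dict, and the two foo variants are hypothetical, and the per-call variant uses the x=None idiom to stand in for what the proposed semantics would do with an expression default.

```python
counter = {"built": 0}

def create_expensive_thing():
    # Stand-in for a costly constructor; counts how often it runs.
    counter["built"] += 1
    return list(range(1000))

# Workaround from the text: build the object once, before the def,
# and use the variable as the default.
shared_thing = create_expensive_thing()

def foo_shared(x=shared_thing):
    return len(x)

# Emulation of "re-evaluate the default expression on every call".
def foo_per_call(x=None):
    if x is None:
        x = create_expensive_thing()
    return len(x)

for _ in range(3):
    foo_shared()
built_after_shared = counter["built"]

for _ in range(3):
    foo_per_call()
built_after_per_call = counter["built"]

print(built_after_shared)    # 1: only the up-front construction
print(built_after_per_call)  # 4: one more construction per call
```

Hoisting the object into a variable keeps the cost at a single construction, at the price of one extra name in the enclosing namespace.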

Another slight performance loss may be the fact that variables used in a  
default value will sometimes become closure variables, which are slightly  
slower than locals. However these variables are not used in the function's
body, and it only is an issue if we're defining a function inside another
function. I think this point is negligible.
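
To make the closure point concrete, here is a small sketch in current Python. It again emulates the per-call default with the x=None idiom, so the variable does show up in the body here, unlike in the actual proposal. The names outer and inner are hypothetical.

```python
def outer():
    a = []

    def inner(x=None):
        if x is None:
            # 'a' is looked up in the enclosing scope on every call,
            # so it becomes a free (closure) variable of inner.
            x = [[], a]
        return x

    return inner

f = outer()
r1 = f()
r2 = f()

print(f.__code__.co_freevars)  # ('a',)
print(r1[1] is r2[1])          # True: the closed-over list is shared
print(r1[0] is r2[0])          # False: the literal is rebuilt per call
```

At module level the same name would simply be a global, so the closure cell only appears for nested function definitions.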

- Jan