Re: [Python-ideas] fixing mutable default argument values

Jan. 27, 2007

On Thu, 25 Jan 2007 15:41:54 +0100, Jim Jewett <jimjjewett@gmail.com> wrote:
...
On 1/24/07, Jan Kanis <jan.kanis@phil.uu.nl> wrote:
...
I don't like new syntax for something like this, but I think the default
argument values can be fixed with semantic changes (which should not break
the most common current uses):
What I think should happen is compile a function like this
def popo(x=[]):
     x.append(666)
     print x
as if it had read
def popo(x=__default_argument_marker__):
     if x == __default_argument_marker__:
         x = []
     x.append(666)
     print x
How is this different from the x=None idiom of today?
def f(inlist=None):
        if inlist is None:
            inlist=[]
The __default_argument_marker__ is not really a part of my proposal. You  
can replace it with None everywhere if you want to. The reason I used it  
is because using None can clash when the caller passes None in as explicit  
value like this:

def foo(x, y=None):
   y = getApropreateDefaultValue() if y is None else y
   x.insert(y)

foo(bar, None)

Now if you want to have foo do bar.insert(None), calling foo(bar, None)  
won't work.

However, I guess the risk of running into such a case in real code is
negligible, and my urge to write correct code won out over writing
understandable code. Just pretend I used None everywhere (and that the
compiler can magically distinguish between a default-argument None and a
caller-provided None, if you wish).
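
For what it's worth, the clash with an explicit None can be avoided today with a private sentinel object. This is only a sketch in modern Python 3; the names _MISSING, get_appropriate_default and foo are made up for illustration, and I use append where the example above used insert so that it runs on a plain list:

```python
# Hypothetical names throughout: _MISSING, get_appropriate_default, foo.
_MISSING = object()  # unique sentinel; a caller cannot pass it by accident


def get_appropriate_default():
    # Stand-in for whatever computes the real default value.
    return 0


def foo(x, y=_MISSING):
    # Identity test: only the sentinel itself means "argument omitted".
    if y is _MISSING:
        y = get_appropriate_default()
    x.append(y)


bar = []
foo(bar)        # y omitted, so the computed default is used
foo(bar, None)  # an explicit None is kept as a real value
print(bar)      # [0, None]
```

The comparison uses `is` rather than `==` so that an argument with an unusual `__eq__` cannot be mistaken for the sentinel.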
...
The if (either 2 lines or against PEP8) is a bit ugly, but Calvin
pointed out that you can now write it as
def f(inlist=None):
        inlist = inlist if (inlist is not None) else []
I see  below that you give it slightly different semantics, but I'm
not entirely sure how to tell when those different semantics should
apply (always?  when the variable name is marked with __*__?  When a
specific non-None singleton appears?), or why you would ever want
them.
Please just ignore the __default_argument_marker__ thing. I hope we agree  
that the problem we're trying to solve is that while

def f(inlist=None):
   inlist = inlist if (inlist is not None) else []

works in the current python, it's non-intuitive and ugly, and it would be  
nice to have python do 'the right thing' if we can find a nice way to make  
it do that.

Oh, and the changed semantics would always be used.
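
To make the wart concrete, here is a small sketch (modern Python 3, function names invented for the example) of what current Python does with a literal default next to the None idiom:

```python
def shared(x=[]):
    # The [] is evaluated once, when the def statement runs,
    # so every call without an argument sees the same list.
    x.append(666)
    return x


def fresh(x=None):
    # The None idiom: a new list is built on each call that omits x.
    if x is None:
        x = []
    x.append(666)
    return x


print(shared())  # [666]
print(shared())  # [666, 666]
print(fresh())   # [666]
print(fresh())   # [666]
```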
...
...
When you use variables as default value instead of literals, I think most
of the time you intend to have the function do something to the same
object the variable is bound to, instead of the function creating its own
copy every time it's called. This behaviour still works with these
semantics:
...
...
...
...
a = []
def foo(x=[[],a]):
  x[0].append(123)
  x[1].append(123)
  print x
foo()
[[123], [123]]
foo()
[[123], [123, 123]]
foo()
[[123], [123, 123, 123]]
So you're saying that x[1] should be persistent because it (also) has
a name (as 'a'), but x[0] should be recreated fresh on each call
because it doesn't?
I think what python does currently can be improved upon. I think that if  
someone defines a function like def f(x=[]): ... he'll most likely want x  
to be an empty list every time the function is called. But just having  
python do an automagic copy(x) or deepcopy(x) is not going to work, because
it can be expensive, isn't always necessary, and is sometimes plainly
impossible, eg: def f(x=sys.stdout): ...

So, sometimes we want a new thing on every call, and sometimes we don't.  
And sometimes we want to specify the default value with a literal, and  
sometimes with a variable. My assumption is that these two differences  
coincide most of the time. I also think my approach is more intuitive to  
people new to python and not familiar with this quirk/wart.

You make it sound as if doing something different with a named variable  
and a literal is strange, but this is exactly what happens in every normal  
python expression:
...
...
...
a = []
b = [[], a]
id(a)
  12976416
id(b[0])
  12994640
id(b[1])
  12976416
# let's execute the b = ... statement again (comparable to 'call a function again')
b = [[], a]
id(b[0])
  12934800
id(b[1])
  12976416
b[0] gets recreated, while b[1] is not.

So, I think my solution of evaluating the default argument on every call  
and letting any variables in the expression be closure variables  
accomplishes everything we want:
* most of the time python does 'the right thing' wrt copying or not
copying, and does this in a way that is very regular wrt the rest of the  
language (and therefore easy to learn)
* if we don't want a copy while python would do this, that can easily be  
accomplished by first creating the object and then using the variable name  
that references the object as default
* if we do want a copy, just do an explicit copy.copy(x).
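
The proposed semantics can be roughly emulated in today's Python by wrapping the default expression in a lambda and calling it whenever the argument is omitted. This is only a sketch under that assumption, not how the compiler change would be implemented; Late and per_call_defaults are hypothetical helpers, and the code is modern Python 3:

```python
import functools
import inspect


class Late:
    """Hypothetical marker: wraps a default expression to re-run per call."""

    def __init__(self, thunk):
        self.thunk = thunk


def per_call_defaults(func):
    """Hypothetical decorator: evaluate Late defaults on every call."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            # Fill in only the arguments the caller left out.
            if name not in bound.arguments and isinstance(param.default, Late):
                bound.arguments[name] = param.default.thunk()
        return func(*bound.args, **bound.kwargs)

    return wrapper


a = []


@per_call_defaults
def foo(x=Late(lambda: [[], a])):
    # The inner [] is rebuilt on each call; a is looked up by name,
    # so the same list object is reused every time.
    x[0].append(123)
    x[1].append(123)
    return x


print(foo())  # [[123], [123]]
print(foo())  # [[123], [123, 123]]
print(foo())  # [[123], [123, 123, 123]]
```

The lambda plays the role of the re-evaluated default expression, and the reference to a inside it is an ordinary name lookup, which is why a persists while the literal [] does not.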

On Fri, 26 Jan 2007 04:36:33 +0100, Chris Rebert <cvrebert@gmail.com> wrote:
...
So, basically the same as my proposal except without syntax changes and  
with the <Foo()> default argument value semantics applied to all  
arguments. I'm okay with this, however some possible performance issues  
might exist with re-evaluating expensive default arg vals on every call  
where they're required. This is basically why my proposal required new  
syntax, so that people could use the old "eval once at definition-time"  
semantics on expensive default values to get better performance.
[snip]
If the performance issues aren't significant, I'm all for your proposal.
It'd be nice not to have to add new syntax.
Well, I wasn't thinking about it as basically the same as your proposal,  
but thinking about it again I think it is. (I was thinking more along the  
lines of having this do what looks intuitive to me, by applying the normal  
python language rules in the IMO 'right' way.)

on performance issues:

If you're using the x=None ... if x is None: x = [] trick, the default gets
evaluated and recreated on every call anyway, so there's no change.
If you aren't using a literal as default, nothing gets re-evaluated, so no  
problem either.
The only time it could become a problem is with code like this:

def foo(x=createExpensiveThing()):
	return x.bar()

If the function modifies x, you'll probably want to use a fresh x every  
call anyway (using x=None). If you do want the x to be persistent, chances
are you already have a reference to it somewhere, but if you don't you'll  
have to create one. The reference doesn't get re-evaluated, so there's no  
performance issue, only possibly a namespace clutter issue.
If the function doesn't modify x it may give rise to a performance issue,  
but that can easily be solved by creating the default before defining the  
function and using the variable. Rejecting this proposal because of
this seems like premature optimisation to me.

Another slight performance loss may be the fact that variables used in a  
default value will sometimes become closure variables, which are slightly  
slower than locals. However these variables are not used in the function's
body, and it only is an issue if we're defining a function inside another
function. I think this point is negligible.

- Jan