Appending to []

Sun Apr 22 04:29:32 EDT 2012

On 2012-04-22, Steven D'Aprano wrote:
> On Sat, 21 Apr 2012 14:48:44 +0200, Bernd Nawothnig wrote:
>
>> On 2012-04-20, Rotwang wrote:
>>> since a method doesn't assign the value it returns to the instance on
>>> which it is called; what it does to the instance and what it returns
>>> are two completely different things.
>> 
>> Returning a None-value is pretty useless. Why not return self, which
>> would be the resulting list in this case? Returning self would make the
>> language a little bit more functional, without any drawback.
>
> It is a deliberate design choice, and there would be a drawback.
>
> A method like append could have three obvious designs:
>
> 1) Functional, no side-effects: return a new list with the item appended.
>
> 2) Functional, with side-effect: return the same list, after appending 
> the item.
>
> 3) Procedural, with side-effect: append the item, don't return anything 
> (like a procedure in Pascal, or void in C).

Correct.
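
To make the three designs concrete, here is a minimal sketch in Python. The helper names are made up for illustration; only design 3 corresponds to what list.append actually does.

```python
# Hypothetical helper functions, purely for illustration.

def append_pure(lst, item):
    """Design 1: no side effect, returns a new list."""
    return lst + [item]

def append_returning_self(lst, item):
    """Design 2: mutates lst and returns the very same object."""
    lst.append(item)
    return lst

def append_procedural(lst, item):
    """Design 3: mutates lst and implicitly returns None."""
    lst.append(item)

original = [1, 2, 3]
new = append_pure(original, 4)             # original is unchanged here
same = append_returning_self(original, 5)  # same is original
nothing = append_procedural(original, 6)   # nothing is None
```

After these calls, new is an independent list, while same is just another name for original.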

> Python chooses 3) as the design, as it is the cleanest, most pure choice 
> for a method designed to operate by side-effect. Unfortunately, since 
> Python doesn't have procedures, that clean design is slightly spoilt due 
> to the need for append to return None (instead of not returning anything 
> at all).
>
> How about 1), the pure functional design? The downside of that is the 
> usual downside of functional programming -- it is inefficient to 
> duplicate a list of 100 million items just to add one more item to that 
> list.

In general I always prefer the pure functional approach. But you are
right: if it is too costly, one has to weigh the pros and cons.

> Besides, if you want a pure functional append operation, you can 
> simply use mylist + [item] instead.

That is true. I will keep that in mind :-)
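
A small sketch of that suggestion (the variable names are arbitrary): the + operator builds a new list and leaves the original alone, at the cost of copying every element.

```python
mylist = [1, 2, 3, 4]
longer = mylist + [5]   # builds a new list; mylist itself is untouched

# The two names now refer to different objects with different contents.
print(mylist)
print(longer)
```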

> But what about 2), the mixed (impure) functional design? Unfortunately, 
> it too has a failure mode: by returning a list, it encourages the error 
> of assuming the list is a copy rather than the original:
>
> mylist = [1, 2, 3, 4]
> another_list = mylist.append(5)
> # many pages of code later...
> do_something_with(mylist)

Yes, but mutable data is in general a candidate for unexpected
behaviour, regardless of whether you use an impure functional notation
or not:

mylist = [1, 2, 3, 4]
mylist.append(5)
another_list = mylist
# many pages of code later...
do_something_with(mylist)

avoids the impure function call but can just as well lead to the same
unexpected behaviour. The real problem here is your "many pages of code
later": it is simply difficult or impossible to keep all these possible
state changes of variables in mind.
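
As a sketch of what I mean, plain assignment already creates the alias, with no method return value involved. The list(...) call at the end is one way to get an independent (shallow) copy; the extra values are arbitrary.

```python
mylist = [1, 2, 3, 4]
mylist.append(5)
another_list = mylist        # a second name for the same object, not a copy

# many pages of code later...
another_list.append(99)      # this also changes what mylist refers to

independent = list(mylist)   # explicit shallow copy
independent.append(-1)       # mylist is not affected by this
```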

> This is an especially pernicious error because instead of raising an 
> exception, your program will silently do the wrong thing. 
>
>     "I find it amusing when novice programmers believe their main
>     job is preventing programs from crashing. More experienced
>     programmers realize that correct code is great, code that 
>     crashes could use improvement, but incorrect code that doesn’t 
>     crash is a horrible nightmare."
>     -- Chris Smith

Absolutely correct!

> Debugging these sorts of bugs can become very difficult, and design 2) is 
> an attractive nuisance: it looks good because you can chain appends:
>
> mylist.append(17).append(23).append(42)
> # but why not use mylist.extend([17, 23, 42]) instead?
>
> but the disadvantage in practice far outweighs the advantage in theory.
>
> This is the same reason why list.sort, reverse and others also return 
> None.

Yeah, understood.
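
For reference, a short sketch of how this looks in practice with the built-in list: sort works in place and returns None, sorted returns a new list, and chaining append fails because the second call is made on None.

```python
mylist = [3, 1, 2]
result = mylist.sort()          # sorts in place, returns None
fresh = sorted([3, 1, 2])       # returns a new sorted list instead

try:
    [].append('x').append('y')  # the second append is called on None
    chained_ok = True
except AttributeError:
    chained_ok = False          # None has no attribute 'append'
```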

>> Then nested calls like
>> 
>> a = [].append('x').append('y').append('z')
>> 
>> would be possible with a containing the resulting list
>> 
>> ['x', 'y', 'z'].
>> 
>> That is the way I expect any append to behave.
>
> That would be possible, but pointless. Why not use:
>
> a = ['x', 'y', 'z'] 
>
> directly instead of constructing an empty list and then making three 
> separate method calls? Methods which operate by side-effect but return 
> self are an attractive nuisance: they seem like a good idea but actually 
> aren't, because they encourage the user to write inefficient, or worse, 
> incorrect, code.

In the past I often wrote methods that returned self instead of void,
None, or Nil, depending on the language used.

But your arguments against that are not bad.
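
For illustration, the style I used to write looks roughly like this. ChainList is a made-up subclass, not anything from the standard library, and the last line shows the trap Steven describes.

```python
class ChainList(list):
    """Hypothetical list subclass whose append returns self."""

    def append(self, item):
        super().append(item)
        return self  # enables chaining, but hands back the same object

a = ChainList().append('x').append('y').append('z')
b = a.append('!')   # looks like it produces a new list, but b is a
```

Anyone who assumes b is a copy will be surprised to find that a has grown as well.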

Thanks!

Instead of thinking about impure designs I should dig deeper into
Haskell :-)

Bernd
