[Python-Dev] Fwd: summing a bunch of numbers (or "whatevers")

Alex Martelli aleax@aleax.it
Mon, 21 Apr 2003 17:03:24 +0200


On Monday 21 April 2003 02:48 pm, Guido van Rossum wrote:
> OK, let me summarize and pronounce.
>
> sum(sequence_of_strings) is out.  *If* "".join() is really too ugly (I
> still think it's a matter of getting used to, like indentation), we

I entirely agree on this.  Unlike reduce(operator.add, XX),
''.join(XX) *CAN* be taught quite reasonably to bright beginners
without any special math/CS background, in my experience.  The
noise against ''.join IMHO comes mostly from a crowd of "OO
purists" who just don't see WHY it's RIGHT for it to be that way!-)

> could add join(seq, delim) as a built-in.  VB has one. :-)

VB has lots of stuff, but we don't need this one.  Please.  One
obvious way to do it (at least if you are Dutch...!).


> sum([]) could either return 0 or raise ValueError.  I lean towards 0
> because that is occasionally useful and reinforces the numeric
> intention.  I think making it return 0 will prevent end-case bugs
> where a newbie sums a list that is occasionally empty.  If we made it
> an error, I expect that in 99% of the cases the response to that error
> would be to change the program to make it return 0 if the list is
> empty, and I can't imagine many bugs caused by choosing 0 over some
> other numerical zero.  Having to teach the idiom sum(S or [0]) is
> ugly, and this doesn't work if S is an iterator.

You're right that S or [0] doesn't work for iterators, AND that bright
beginners expect 0 rather than an error (fortunately I have some of
those at hand to check with;-).  So, sum([])==0 it is.
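
To make the iterator point concrete, here is a minimal sketch.  It assumes
a hypothetical strict_sum() that raises ValueError on empty input (the
alternative that was NOT chosen); strict_sum and guarded_sum are names
invented for this illustration and written in present-day Python.

```python
from functools import reduce
import operator


def strict_sum(seq):
    """Hypothetical sum that rejects empty input (illustration only)."""
    items = iter(seq)
    try:
        first = next(items)
    except StopIteration:
        raise ValueError("strict_sum() of empty sequence") from None
    return reduce(operator.add, items, first)


def guarded_sum(values):
    # The idiom under discussion: substitute [0] when `values` is falsy.
    return strict_sum(values or [0])


# An empty list is falsy, so the guard kicks in and the result is 0.
list_result = guarded_sum([])

# An iterator object is truthy even when it will yield nothing,
# so the guard is skipped and the error still surfaces.
try:
    guarded_sum(iter([]))
    iterator_failed = False
except ValueError:
    iterator_failed = True
```

With a sum that returns 0 for empty input, neither the guard nor the
special case for iterators is needed.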


> I appreciate Tim's point of wanting to sum "number-like" objects that
> can't be added to 0.  OTOH if we provide *any* way of providing a
> different starting point, some creative newbie is going to use
> sum(list_of_strings, "") instead of "".join(), and be hurt by the
> performance months later.

Yes yes yes!
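
A rough sketch of why the performance hurts, in present-day Python.  It
assumes every addition builds a fresh string holding everything so far
(an interpreter may optimize some cases); concat_by_addition and
chars_copied_by_addition are names made up for this illustration, and
the count is a simplified cost model rather than a measurement.

```python
from functools import reduce
import operator


def concat_by_addition(strings):
    # Repeated addition: each step produces a new, longer string.
    return reduce(operator.add, strings, "")


def chars_copied_by_addition(strings):
    # Simplified cost model: each intermediate result is a full copy,
    # so the work is the sum of all the running lengths.
    total = 0
    running_length = 0
    for s in strings:
        running_length += len(s)
        total += running_length
    return total


pieces = ["ab"] * 1000

# Both approaches give the same text...
same_result = concat_by_addition(pieces) == "".join(pieces)

# ...but under this model addition copies 2 + 4 + ... + 2000 characters,
# while join only has to write the final 2000.
naive_cost = chars_copied_by_addition(pieces)
join_cost = len("".join(pieces))
```

The gap grows quadratically with the number of pieces, which is why it
tends to show up only months later on larger inputs.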


> If we add an optional argument for Tim's use case, it could be used in
> two different ways: (1) only when the sequence is empty, (2) always
> used as a starting point.  IMO (2) is more useful and more consistent.
>
> Here's one suggestion to deal with the sequence_of_strings issue
> (though maybe too pedantic): explicitly check whether the second
> argument is a string or unicode object, and in that case raise a
> TypeError indicating that a numeric value is required and suggesting
> to use "".join() for summing a sequence of strings.

I like this!!!


> So here's a strawman implementation:
>
>   def sum(seq, start=0):
>     if isinstance(start, basestring):
>       raise TypeError, "can't sum strings; use ''.join(seq) instead"
>     return reduce(operator.add, seq, start)
>
> Alex, go ahead and implement this!

Coming right up!
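
For readers on a current Python, here is a sketch of the same strawman
translated to Python 3 syntax.  It is not the eventual built-in, just the
quoted logic with str standing in for basestring; the name strawman_sum
is chosen here to avoid shadowing the built-in sum.

```python
from functools import reduce
import operator


def strawman_sum(seq, start=0):
    # Reject a string starting value and point at the preferred idiom.
    if isinstance(start, str):
        raise TypeError("can't sum strings; use ''.join(seq) instead")
    # `start` is always used as the starting point (option 2 above),
    # so an empty sequence simply returns `start`.
    return reduce(operator.add, seq, start)


empty_total = strawman_sum([])                # start is returned unchanged
int_total = strawman_sum([1, 2, 3])           # plain numeric use
list_total = strawman_sum([[1], [2]], [])     # non-numeric start still works

try:
    strawman_sum(["a", "b"], "")
    string_start_rejected = False
except TypeError:
    string_start_rejected = True
```

Note that only a string *start* is checked; a sequence of strings with
the default start fails anyway, because 0 + "a" is itself a TypeError.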


Alex