restriction on sum: intentional misfeature?

Tim Chase python.list at tim.thechases.com
Sat Oct 17 05:15:55 EDT 2009


Carl Banks wrote:
> On Oct 16, 12:40 pm, Tim Chase <python.l... at tim.thechases.com> wrote:
>> Then I'm fine with sum() being smart enough to recognize this
>> horrid case and do the "right" thing by returning ''.join()
>> instead.
> 
> You don't want Python to get into this business.  Trust me.  Just
> don't go there.

Well, Python is already in the business of special cases -- it 
tries to be smart about a dumb operation by raising an error. 
Just call __add__ ... if it's slow, that's my problem as a 
programmer.  Python doesn't complain about lists, which Steven 
points out

   Steven D'Aprano wrote:
   >And indeed, if you pass a list-of-lists to sum(), it
   >does:
   >
   >>>> sum([[1,2], ['a',None], [1,'b']], [])
   >[1, 2, 'a', None, 1, 'b']
   >
   >(For the record, summing lists is O(N**2), and unlike
   >strings, there's no optimization in CPython to avoid the
   >slow behaviour.)

which is also slow.  By your own words (from a subsequent email)

> If, say, you were to accept that Python is going to guard against a
> small number of especially bad cases, this has got to be one of the
> top candidates.

In guarding, you can do the intended thing (for strings, that's 
concatenation, as the "+" operator does, which can be optimized 
with ''.join()), or you can raise an error.  I don't see how 
using ''.join() is much different from being smart enough to 
raise an error, except that it doesn't break user expectations.
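
A minimal sketch of the "do the intended thing" option, assuming Python 3 and a hypothetical wrapper named smart_sum (not part of Python and not a patch to the builtin). It only shows the shape of the guard. It does not settle whether ''.join() and repeated "+" agree for every possible element type.

```python
def smart_sum(items, start=0):
    """Like sum(), but hand string concatenation to ''.join().

    Hypothetical helper for illustration only.
    """
    if isinstance(start, str):
        # Guard by delegating instead of raising.
        # join() raises TypeError if an element is not a str.
        return start + ''.join(items)
    # Everything else goes through the builtin unchanged.
    return sum(items, start)


print(smart_sum([1, 2, 3]))
print(smart_sum(['spam', 'eggs'], ''))
```

The guard keys off the type of start, which is the same thing the builtin inspects when it decides to refuse strings.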

> If you want sum to call ''.join transparently, then "".join would have
> to produce identical results to what sum() would have produced in all
> cases.  That does not happen.  If an object within the list defines
> both __str__ and __add__ methods, then "".join will call __str__,
> whereas sum would call __add__, leading to potentially different
> results.  Therefore, transparently substituting a call to "".join is
> not an option.

AFAICT, "".join() does not call __str__ on its elements:

   >>> ''.join(['hello', 42, 'world'])
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: sequence item 1: expected string, int found
   >>> '__str__' in dir(42)
   True

which is exactly what I'd expect from

   >>> 'hello' + 42 + "world"
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: cannot concatenate 'str' and 'int' objects

This is under 2.x (I don't have 3.x on hand to see if that 
changed unexpectedly).
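
The same check can be written as a script for an object that defines both __str__ and __add__. This is a sketch assuming Python 3 semantics, and the class name Both and its call counter are made up for the illustration. If join() called __str__ on its elements, the counter would be nonzero afterwards.

```python
class Both:
    """Hypothetical element type defining both __str__ and __add__."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        # Count every call so we can tell whether join() used __str__.
        self.str_calls += 1
        return 'via __str__'

    def __add__(self, other):
        return 'via __add__'


obj = Both()
try:
    ''.join(['hello', obj])
except TypeError as exc:
    join_error = exc      # join() refused the non-str element
else:
    join_error = None

print(type(join_error).__name__)
print(obj.str_calls)
```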

> It'd be better to just remove the special case.

I'd be happy with either solution for summing strings.  Be slow 
or be fast, but don't be erroneous.
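
For comparison, the "be slow" option is just a left fold over "+", with no type checks at all. A minimal sketch, where plain_sum is a hypothetical name and not a proposal for the builtin:

```python
import functools
import operator


def plain_sum(items, start=0):
    """Left fold over +, with no special case for any type (hypothetical)."""
    return functools.reduce(operator.add, items, start)


print(plain_sum([1, 2, 3]))                    # numbers
print(plain_sum(['a', 'b', 'c'], ''))          # strings: repeated concatenation
print(plain_sum([[1, 2], ['a', None]], []))    # lists: repeated concatenation
```

Strings and lists take the same path here, so whatever it costs is the caller's problem in both cases.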

-tkc








