
On Sat, Dec 5, 2009 at 10:23, Vitor Bosshard <algorias@gmail.com> wrote:
And in that case the special string handling could also be dropped?
>>> sum(["a","b"], "start")
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    sum(["a","b"], "start")
TypeError: sum() can't sum strings [use ''.join(seq) instead]
This behaviour is quite bothersome. Sum can handle arbitrary objects in theory (as long as they define the correct special methods, etc.), but it gratuitously raises an exception on strings. This behaviour is also inconsistent with the following:
>>> sum(["a","b"])
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    sum(["a","b"])
TypeError: unsupported operand type(s) for +: 'int' and 'str'
Here sum actually tries to add "a" to the default start value of 0.
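For reference, the two failures quoted above can be reproduced side by side with a small sketch. The helper name try_sum is made up for illustration, and the exact error wording is whatever the running CPython version prints. The list case is included only to show that sum does accept a non-numeric start value when it is not a string.

```python
def try_sum(items, *start):
    """Return sum(items, *start), or the TypeError text if the call fails."""
    try:
        return sum(items, *start)
    except TypeError as exc:
        return f"TypeError: {exc}"

# No start value: sum begins at 0, so 0 + "a" fails inside the addition.
no_start = try_sum(["a", "b"])

# String start value: sum refuses up front and points at ''.join.
str_start = try_sum(["a", "b"], "")

# A list start value is accepted; the lists are added together.
list_sum = try_sum([[1], [2]], [])

# The spelling the error message recommends for strings.
joined = "".join(["a", "b"])
```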
sum is defined by repeatedly adding each number in a sequence. As each number is usually of constant size, and the size of the total grows logarithmically, this is O(n log n) (but due to implementation coarseness it usually isn't distinguished from O(n)).

Concatenation, however, grows the total's size very quickly, so you instead get O(n**2) performance. Same result, wrong algorithm.

It would be possible to special-case strings, but why? The programmer should know what algorithm they're using and what complexity class it has, so they can pick the right one (''.join(seq) in this case). IOW, handling arbitrary objects is an illusion.

For another example of why the programmer needs to understand the algorithmic complexity of the operations they're using, and why the language should value performance consistency and not just correct output, see ABC's usage of rational numbers:
http://python-history.blogspot.com/2009/03/problem-with-integer-division.htm...

--
Adam Olsen, aka Rhamphoryncus
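To make the O(n**2) versus O(n) point concrete, here is a rough sketch that counts character copies instead of timing anything. It assumes a simple cost model in which every `total + part` builds a brand-new string. The function names are hypothetical, and a real interpreter may do better or worse than this model.

```python
def copies_repeated_concat(parts):
    """Characters copied if every `total + part` allocates a fresh string."""
    copied = 0
    total_len = 0
    for part in parts:
        total_len += len(part)
        copied += total_len  # the new string holds the old total plus part
    return copied

def copies_join(parts):
    """Characters copied by a single-pass join: each character exactly once."""
    return sum(len(part) for part in parts)

parts = ["x"] * 1000
quadratic = copies_repeated_concat(parts)  # 1 + 2 + ... + 1000
linear = copies_join(parts)                # 1000
```

Under this model, a thousand one-character strings cost 500500 character copies when added one at a time, against 1000 for a single join.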