restriction on sum: intentional bug?
Mark Dickinson
dickinsm at gmail.com
Sat Oct 17 16:32:32 EDT 2009
On Oct 17, 3:27 pm, a... at pythoncraft.com (Aahz) wrote:
> In article <0062f568$0$26941$c3e8... at news.astraweb.com>,
> Steven D'Aprano <st... at REMOVE-THIS-cybersource.com.au> wrote:
> > (For the record, summing lists is O(N**2), and unlike strings, there's no
> > optimization in CPython to avoid the slow behaviour.)
>
> Are you sure?
The O(N**2) claim surprised me too, but it certainly looks that
way. Here's a script to produce some timings:
def concat1(list_of_lists):
return sum(list_of_lists, [])
def concat2(list_of_lists):
acc = []
for l in list_of_lists:
acc = acc + l
return acc
def concat3(list_of_lists):
acc = []
for l in list_of_lists:
acc += l
return acc
def concat4(list_of_lists):
acc = []
for l in list_of_lists:
acc.extend(l)
return acc
test_list = [[i] for i in xrange(100000)]
from timeit import Timer
for fn in ["concat1", "concat2", "concat3", "concat4"]:
t = Timer(fn + "(test_list)", "from __main__ import test_list, " +
fn)
print fn, t.timeit(number=3)/3.0
On my machine (OS X 10.5/Core 2 Duo), under Python 2.7 svn I get:
newton:trunk dickinsm$ ./python.exe ~/time_list_sum.py
concat1 48.1459733645
concat2 48.4200883706
concat3 0.0146766503652
concat4 0.0184679826101
For some reason that I don't really understand, the CPython source
does
the equivalent of concat2 instead of concat3. See the builtin_sum
function in
http://svn.python.org/view/python/trunk/Python/bltinmodule.c?view=markup
and scroll past the special cases for ints and floats. After a one-
line
source change, replacing the PyNumber_Add call with
PyNumber_InPlaceAdd,
I get the following results:
newton:trunk dickinsm$ ./python.exe ~/time_list_sum.py
concat1 0.0106019973755
concat2 48.0212899844
concat3 0.0138022899628
concat4 0.0179653167725
--
Mark
More information about the Python-list
mailing list