sum for sequences?
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Sun Mar 28 10:57:08 EDT 2010
On Sun, 28 Mar 2010 07:16:10 -0700, Steve Howell wrote:
> The mildly surprising part of sum() is that is does add vs. add-in-
> place, which leads to O(N) vs. O(1) for the inner loop calls, for
> certain data structures, notably lists, even though none of the
> intermediate results get used by the caller. For lists, you could make
> a more efficient variant of sum() that clones the start value and does
> add-in-place.
I have no doubt that you could make a version of sum for lists which is
more efficient than the existing one. After all, join more or less does
the same sort of thing, and it's very efficient. But don't think that add-
in-place is necessarily cheap. List appends are amortized O(1) each; if
you are adding M lists of N items each, that gives you O(M*N).
It's possible to improve the performance a tad if you can make multiple
appends in roughly constant time, which is what list.extend (probably?)
does, but only up to a point. Lists are over-allocated, but if you try to
add more items than there is room for, you need to make the list bigger,
and that means reallocating memory, which could easily be O(N**2) or
worse, depending on how good your OS's memory management is. Under Linux,
at least by default, malloc will never fail, but there's no guarantee how
long it will take to return. If the OOM killer has to start shutting down
other applications, and paging more and more memory to disk, eventually
malloc will return (or the system will core dump), but it could take a
while...
--
Steven
More information about the Python-list
mailing list