O(n^2) is bad - can it be fixed?
Alex Martelli
aleaxit at yahoo.com
Tue May 22 04:45:53 EDT 2001
"E. Mark Ping" <emarkp at CSUA.Berkeley.EDU> wrote in message
news:9ed20b$jfj$1 at agate.berkeley.edu...
> In article <3B09FBF9.8691EBFE at accessone.com>,
> Helen Dawson <helend at accessone.com> wrote:
> >List does much better. Why? It probably does the same thing that the
> >STL vector typically does - the allocabe space doubles whenever it
> >runs out of room, and then the newly allocated space is filled in
> >as needed.
>
> Doubtful. It most likely does what std::list<> does--it allocates
No need to wonder in a void about such issues... listobject.c is
open for inspection in the Objects directory of any Python source
distribution! A python list is nothing like a C++ std::list --
the latter is a doubly-linked-list (fast insertion/removal at
any point, but O(N) to get the Nth item), the former is actually
an array or vector.
> space each time a new entry is added. That's linear, not n^2 time.
> std::vector<> uses geometric allocation to get amortized constant
> allocation time.
>
> If you need to add on character at a time, choose the data structure
> that does what you want. Perhaps a list of char?
Testing is better than wondering:
import time, array
def AppendTest(listorarray):
head = repr(listorarray)
start = time.clock()
for i in range(0, 100):
for x in range(0, 2000):
listorarray.append(' ')
# print i
stend = time.clock()
print "%s: %.2f" % (head, stend-start)
AppendTest([])
AppendTest(array.array('c'))
D:\py21>python oaat.py
[]: 1.97
array('c'): 1.55
D:\py21>python oaat.py
[]: 1.97
array('c'): 1.56
So, an array.array of characters turns out to be
systematically and repeatably 20% faster than a
list for this task on my box (an old clunky P3-266
NT4SP5 at this time:-). Unsurprising:-) -- it
has less overhead since it keeps 1 byte per char
rather than 4 bytes per item (a reference) plus
space for the char itself.
By the way, once your built-one-char-at-a-time
array.array('c') object is ready, just call its
method .tostring() to get the string equivalent
(if you choose to use a list instead, then the
best way to get the string is ''.join(thelist)).
The dominance of array.array for this shines
out clearly from Guido's essay/anecdote on
optimization, btw...
Alex
More information about the Python-list
mailing list