[Tutor] Question about O(N**2)

Steven D'Aprano steve at pearwood.info
Sun May 4 06:13:12 CEST 2014


On Sat, May 03, 2014 at 03:59:40PM -0700, Danny Yoo wrote:
> Following up on this.  Let's make sure that we're talking about the same thing.
> 
> 
> The assertion is that the following:
> 
>     fullPath += [...]
> 
> where fullPath is a list of strings, exhibits O(n^2) time.  I don't
> think this is true.  Semantically, the statement above should be
> equivalent to:
> 
>    fullPath.append(...)
> 
> and so I agree with David: this is not O(n^2), but rather O(n).

Danny is correct. For built-in lists, += augmented assignment is 
implemented using the extend method, not the + operator. Consequently, 
repeatedly calling += modifies the list in place, growing it as needed, 
which is amortized O(N) rather than O(N**2).

Had += been implemented as + instead, as I initially mis-remembered, the 
behaviour would have been O(N**2). The process would have gone:

fullPath = []
fullPath = [] + [a]  # add a new path component
fullPath = [a] + [b]
fullPath = [a, b] + [c]
fullPath = [a, b, c] + [d]
fullPath = [a, b, c, d] + [e]
...

Each individual + is an O(N+M) operation (if there are ten items in the 
first list, and five items in the second list, adding them needs to copy 
fifteen items). With M a constant 1, we can take each addition as O(N). 
And there are N additions, so we get O(N*N) or O(N**2).

We can test this by seeing how long it takes to perform a bunch of 
additions. For timing, I'm going to use the recipe shown here:

http://code.activestate.com/recipes/577896-benchmark-code-with-the-with-statement/

First, using augmented assignment for lists:

py> size = 1000000
py> with Timer():
...     data = []
...     for i in range(size):
...             data += [i]
...
time taken: 0.548546 seconds
py> with Timer():
...     data = []
...     for i in range(3*size):
...             data += [i]
...
time taken: 1.672449 seconds


By increasing the amount of data by a factor of three, the time taken 
also increases by about a factor of three. That is O(N).

Now let's try list addition. It's so much slower than += that I had to 
reduce the size by a lot just to have it complete in any reasonable 
amount of time:

py> size = 10000
py> with Timer():
...     data = []
...     for i in range(size):
...             data = data + [i]
...
time taken: 0.331893 seconds
py> with Timer():
...     data = []
...     for i in range(3*size):
...             data = data + [i]
...
time taken: 3.203508 seconds


Here, increasing N by a factor of 3 takes about 10 times longer, which 
is consistent with O(N**2).



-- 
Steven


More information about the Tutor mailing list