Speed revisited

Andrea Griffini agriff at tin.it
Mon Jan 10 21:51:48 CET 2005


On Mon, 10 Jan 2005 17:52:42 +0100, Bulba! <bulba at bulba.com> wrote:

>I don't see why should deleting element from a list be O(n), while
>saying L[0]='spam' when L[0] previously were, say, 's', not have the
>O(n) cost, if a list in Python is just an array containing the 
>objects itself?
>
>Why should JUST deletion have an O(n) cost?

Because after deletion L[1] moved to L[0], L[2] moved to L[1],
L[3] moved to L[2] and so on. To delete the first element you
have to move n-1 pointers and this is where O(n) comes from.
When you reassign any element there is no need to move the
others around, so that's why you have O(1) complexity.

With a data structure slightly more complex than an array
you can have random access in O(1), deletion of elements
O(1) at *both ends* and insertion in amortized O(1) at
*both ends*. This data structure is called doubly-ended
queque (nickname "deque") and is available in python.

The decision was that for the basic list object the overhead
added by deques for element access (it's still O(1), but a
bit more complex that just bare pointer arithmetic) and, I
guess, the hassle of changing a lot of working code and
breaking compatibility with extensions manipulating directly
lists (no idea if such a thing exists) was not worth the gain.

The gain would have been that who doesn't know what O(n)
means and that uses lists for long FIFOs would get fast
programs anyway without understanding why. With current
solution they just have to use deques instead of lists.

After thinking to it for a while I agree that this is a
reasonable choice. The gain is anyway IMO very little
because if a programmer desn't understand what O(n) is
then the probability that any reasonably complex program
written is going to be fast is anyway zero... time would
just be wasted somewhere else for no reason.

Andrea



More information about the Python-list mailing list