Linked lists
Lulu of the Lotus-Eaters
mertz at gnosis.cx
Wed Aug 8 07:04:08 CEST 2001
|You can't? It's clear enough. The average person doesn't dump thousands
|of items into a link list. That's what it takes in order for Python to "see" the
|O(1) complexity benefits that linked lists offer.
My admiration of Python lists has increased even futher today. Like
most naive programmers (naive still after a couple decades of
programming), I figured that Python lists couldn't possibly be a very
good data structure if "insert in the middle" is a common operation.
After all, one has to copy all the tail elements over a position for
every insert, which makes the insertion O(N). A linked list can save
all the copy operations.
With that in mind, I tried a little test:
--------------
import time, sys
class LinkedList: ... # below
if __name__ == '__main__':
LOOPS = int(sys.argv[1])
start = time.time()
lst = []
for i in xrange(LOOPS):
mid = i/2
lst = lst[:mid] + [None] + lst[mid:]
end = time.time()
print "%d Python concatenations in %5.2f seconds" % (LOOPS, end-start)
start = time.time()
lst = []
for i in xrange(LOOPS):
mid = i/2
new = lst[:mid]
new.append(None)
new.extend(lst[mid:])
lst = new
end = time.time()
print "%d Python List inserts in %5.2f seconds" % (LOOPS, end-start)
start = time.time()
lst = LinkedList(1)
for i in xrange(LOOPS):
mid = i/2
lst[mid] = None
end = time.time()
print "%d LinkedList inserts in %5.2f seconds" % (LOOPS, end-start)
--------------
The first two loops are basically the same, and just use Python lists and
slices in some obvious ways. The first approach seemed like the "obvious" way
to do it, but it turns out to be noticably worse than the second way. But how
much worse seems close to a constant multiple, or maybe a very low complexity
order difference.
The third loop does absolutely terrible for the test case. The problem, of
course, is that insertion is O(1), but walking to the correct position for
insertion is still O(N). Of course, if finding the insertion position is
cheap, so is the overall loop. For example, substituting 'mid=0' or 'mid=i' in
the above loops has little effect on the first two loops, but makes the
LinkedList loop almost free.
With that, my quick LinkedList code. It is pure Python, which is most of what
makes it very slow. But the overall complexity issues would apply to an
extension module. If there is anything I am missing in making my class
efficient, I would not be surprised, and would be interested to learn it.
Obviously, only a few of the interesting methods are implemented (__getslice__
and others would be worth adding if the class was worth having in the first
place).
--------------
class LinkedList:
def __init__(self, *items):
self.links = []
self.pos = 0
self.current = self.links
for item in items:
self.current.extend([item,[]])
self.current = self.current[1]
self.current = self.links
self.val = self.links[0]
def next(self):
self.pos += 1
self.current = self.current[1]
self.val = self.current[0]
return self.val
def start(self):
self.pos = 0
self.current = self.links
def __getitem__(self, index):
if self.pos > index:
self.pos = 0
self.current = self.links
for i in xrange(self.pos, index):
self.next()
return self.val
def __setitem__(self, index, val):
self.__getitem__(index)
self.current[1] = self.current[:]
self.current[0] = val
self.pos += 1
--------------
Yours, Lulu...
More information about the Python-list
mailing list