timing puzzle
Neil Cerutti
horpner at yahoo.com
Fri Nov 16 14:18:10 EST 2007
On 2007-11-16, Robin Becker <robin at reportlab.com> wrote:
> I'm trying to get my head round a timing puzzle I find whilst
> optimizing A Kuchling's version of the Knuth line splitting
> algorithm from the oedipus project. The puzzle is as follows in
> highly abstract form (here active_nodes is a list of objects
> kept in a sorted order, but there are no special methods on the
> objects for comparison etc)
>
> for i in xrange(m):
> .......
> for A in active_nodes[:]:
> ......
> if cond:
> active_nodes.remove(A)
> ......
> .....
>
> is slow my test takes 2.7 seconds; profiling reveals we're
> calling A.remove thousands of times and it's taking a lot of
> time. There appear to be no other usages of active_nodes in the
> for A loop.
>
> On a hunch I changed the above to
>
> removed = []
> aremoved = removed.append
> for i in xrange(m):
> .......
> for iA,A in enumerate(active_nodes):
> ......
> if cond:
> aremoved(iA)
> ......
> if removed:
'if removed' is reduntant, as the loop body below will not be
executed if removed is empty.
> for iA in reversed(removed):
> del active_nodes[iA]
> del removed[:]
> ......
>
> this latter is 5 times faster, but I can't really see why.
You are no longer making m copies of active_nodes.
> My only guess is that appending to the removed list is
> relatively easy compared with deletes and the reverse order
> delete is somehow moving less data about. I think in the
> original Knuth the deletion would have been from some real kind
> of list and O(1) whereas here deletions are O(n/2) on average.
> On the other hand we have to keep this list in sorted order.
> What data structure should I be using?
When you have to make many deletions from the middle of a
sequence, you would normally choose a linked list. Python doesn't
provide much support for linked lists, unfortunately.
Instead, filter your list. It looks like you can't use filter
directly, so just do it manually.
for i in xrange(m):
.......
saved_nodes = []
for A in active_nodes[:]:
......
if not cond:
saved_nodes.append(A)
......
active_nodes = saved_nodes
.....
--
Neil Cerutti
Sermon Outline: I. Delineate your fear II. Disown your fear III. Displace your
rear --Church Bulletin Blooper
More information about the Python-list
mailing list