Efficient Running Median

Raymond Hettinger python at rcn.com
Fri Jan 15 20:15:03 CET 2010


I've updated the running median recipe to use a new algorithm with O
(log n) updates for a large sliding window traversing a data stream.
See http://code.activestate.com/recipes/576930/

The engine is a new collection class called IndexableSkiplist.  It is
like a regular skiplist as detailed at http://en.wikipedia.org/wiki/Skip_list,
but it also records the width of each link field.  That allows values
to be retrieved by their position index in O(log n) time.

The key  operations are:
O(log n) --   sl.insert(value) # add a value to the skiplist,
maintaining sorted order
O(log n) --   s.remove(value)  # remove a value from the skiplist,
maintaining sorted order
O(log n) --   s[i]             # retrieve the i-th item
O(n)     --   list(sl)         # iterate over the skiplist in sorted
order
O(1)     --   len(sl)          # number of items in the skiplist

The performance of an IndexableSkiplist is similar to a B+tree but the
implementation in pure python is much simpler.


Raymond




More information about the Python-list mailing list