efficient running median

Janto Dreijer jantod at gmail.com
Tue Oct 13 20:30:53 EDT 2009


On Oct 13, 6:12 pm, Peter Otten <__pete... at web.de> wrote:
> Janto Dreijer wrote:
> > I'm looking for code that will calculate the running median of a
> > sequence, efficiently. (I'm trying to subtract the running median from
> > a signal to correct for gradual drift).
>
> > My naive attempt (taking the median of a sliding window) is
> > unfortunately too slow as my sliding windows are quite large (~1k) and
> > so are my sequences (~50k). On my PC it takes about 18 seconds per
> > sequence. 17 of those seconds is spent in sorting the sliding windows.
>
> > I've googled around and it looks like there are some recent journal
> > articles on it, but no code. Any suggestions?
>
> If you aren't using numpy, try that. Showing your code might also be a good
> idea...
>
> Peter

I placed the test code and its output here:
http://bitbucket.org/janto/snippets/src/tip/running_median.py

I also have a version that uses numpy. On random data it seems to be
about twice as fast as the pure python one. It spends half the time
sorting and the other half converting the windows from python lists to
numpy arrays.
If the data is already sorted, the version using python's builtin sort
outperforms the numpy convert-and-sort by about 5 times. Strangely
satisfying :)

I'm hoping to find a trick to do better than scipy.signal.medfilt.
Looking at its code (PyArray_OrderFilterND in
http://svn.scipy.org/svn/scipy/trunk/scipy/signal/sigtoolsmodule.c)
there may be hope. It looks like they're sorting the entire window
instead of doing a QuickSelect (which I could do with STL's
nth_element).

Janto



More information about the Python-list mailing list