Programmatically finding "significant" data points

Roberto Bonvallet Roberto.Bonvallet at cern.ch
Tue Nov 14 09:38:37 EST 2006


erikcw wrote:
> I have a collection of ordered numerical data in a list.  The numbers
> when plotted on a line chart make a low-high-low-high-high-low (random)
> pattern.  I need an algorithm to extract the "significant" high and low
> points from this data.

In calculus, you identify high and low points by looking where the
derivative changes its sign.  When working with discrete samples, you can
look at the sign changes in finite differences:

>>> data = [...]
>>> # forward differences; there is one fewer of them than data points
>>> diff = [data[i + 1] - data[i] for i in range(len(data) - 1)]
>>> [str(d) for d in diff]
['0.4', '0.1', '-0.2', '-0.01', '0.11', '0.5', '-0.2', '-0.2', '0.6',
'-0.1', '0.2', '0.1', '0.1', '-0.45', '0.15', '-0.3', '-0.2', '0.1',
'-0.4', '0.05', '-0.1', '-0.25']

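The high points are those where diff changes from + to -, and the low
points are those where diff changes from - to +.  In terms of indices,
data[i] is a high point when diff[i - 1] > 0 and diff[i] < 0, and a low
point when diff[i - 1] < 0 and diff[i] > 0.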
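
As a rough sketch, the sign-change test could be written as a small
function.  The name turning_points and the sample list are made up for
illustration, and this version simply ignores flat stretches (zero
differences), which may or may not be what you want:

```python
def turning_points(data):
    """Return (highs, lows): indices into data where the slope changes sign."""
    # forward differences between consecutive samples
    diff = [data[i + 1] - data[i] for i in range(len(data) - 1)]
    highs, lows = [], []
    for i in range(1, len(diff)):
        if diff[i - 1] > 0 and diff[i] < 0:
            highs.append(i)  # rising, then falling: data[i] is a local high
        elif diff[i - 1] < 0 and diff[i] > 0:
            lows.append(i)   # falling, then rising: data[i] is a local low
    return highs, lows

sample = [1, 3, 2, 5, 4, 4, 6]  # hypothetical data, not the original poster's
highs, lows = turning_points(sample)
print(highs, lows)
```

Note that this reports every local high and low, however small.  Deciding
which of them count as "significant" would need an extra criterion on top.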

HTH,
-- 
Roberto Bonvallet


