![](https://secure.gravatar.com/avatar/2d36e13a258f69ed8e17441a33cd5b18.jpg?s=120&d=mm&r=g)
i was just wondering if there is a simple way to define a step function in scipy, like an ECDF (empirical cumulative distribution function) in statistics.... i have hacked my own version, but it is quite slow.... thanks, jonathan -- ------------------------------------------------------------------------ I'm part of the Team in Training: please support our efforts for the Leukemia and Lymphoma Society! http://www.active.com/donate/tntsvmb/tntsvmbJTaylor GO TEAM !!! ------------------------------------------------------------------------ Jonathan Taylor Tel: 650.723.9230 Dept. of Statistics Fax: 650.725.8977 Sequoia Hall, 137 www-stat.stanford.edu/~jtaylo 390 Serra Mall Stanford, CA 94305
![](https://secure.gravatar.com/avatar/764323a14e554c97ab74177e0bce51d4.jpg?s=120&d=mm&r=g)
Jonathan Taylor wrote:
i was just wondering if there is a simple way to define a step function in scipy, like an ECDF (empirical cumulative distribution function) in statistics....
i have hacked my own version, but it is quite slow....
scipy.interpolate.interp1d has an option for nearest-neighbor interpolation which is almost what you want except that the provided data points are in the center of the steps instead of at the left or right. What does your version look like? Let's see if we can't improve upon it. -- Robert Kern robert.kern@gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
![](https://secure.gravatar.com/avatar/2d36e13a258f69ed8e17441a33cd5b18.jpg?s=120&d=mm&r=g)
believe it or not, i just discovered "searchsorted"... my earlier hack was brutal, using "bisect".... this is my current version and it seems to work fine. class StepFunction: '''A basic step function: values at the ends are handled in the simplest way possible: everything to the left of x[0] is set to ival; everything to the right of x[-1] is set to y[-1]. >>> >>> from numpy import * >>> >>> x = arange(20) >>> y = arange(20) >>> >>> f = StepFunction(x, y) >>> >>> print f(3.2) 3 >>> print f([[3.2,4.5],[24,-3.1]]) [[ 3 4] [19 0]] >>> ''' def __init__(self, x, y, ival=0., sorted=False): _x = N.array(x) _y = N.array(y) if _x.shape != _y.shape: raise ValueError, 'in StepFunction: x and y do not have the same shape' if len(_x.shape) != 1: raise ValueError, 'in StepFunction: x and y must be 1-dimensional' self.x = N.array([-N.inf] + list(x)) self.y = N.array([ival] + list(y)) if not sorted: asort = N.argsort(self.x) self.x = N.take(self.x, asort) self.y = N.take(self.y, asort) self.n = self.x.shape[0] def __call__(self, time): tind = scipy.searchsorted(self.x, time) - 1 _shape = tind.shape if tind.shape is (): return self.y[tind] else: tmp = N.take(self.y, tind) tmp.shape = _shape return tmp Robert Kern wrote:
Jonathan Taylor wrote:
i was just wondering if there is a simple way to define a step function in scipy, like an ECDF (empirical cumulative distribution function) in statistics....
i have hacked my own version, but it is quite slow....
scipy.interpolate.interp1d has an option for nearest-neighbor interpolation which is almost what you want except that the provided data points are in the center of the steps instead of at the left or right.
What does your version look like? Let's see if we can't improve upon it.
-- ------------------------------------------------------------------------ I'm part of the Team in Training: please support our efforts for the Leukemia and Lymphoma Society! http://www.active.com/donate/tntsvmb/tntsvmbJTaylor GO TEAM !!! ------------------------------------------------------------------------ Jonathan Taylor Tel: 650.723.9230 Dept. of Statistics Fax: 650.725.8977 Sequoia Hall, 137 www-stat.stanford.edu/~jtaylo 390 Serra Mall Stanford, CA 94305
![](https://secure.gravatar.com/avatar/764323a14e554c97ab74177e0bce51d4.jpg?s=120&d=mm&r=g)
Jonathan Taylor wrote:
believe it or not, i just discovered "searchsorted"... my earlier hack was brutal, using "bisect"....
Ouch! Yeah, that may be why it was slow. :-)
this is my current version and it seems to work fine.
It looks good. I'll just make a few comments inline.
class StepFunction: '''A basic step function: values at the ends are handled in the simplest way possible: everything to the left of x[0] is set to ival; everything to the right of x[-1] is set to y[-1].
>>> >>> from numpy import * >>> >>> x = arange(20) >>> y = arange(20) >>> >>> f = StepFunction(x, y) >>> >>> print f(3.2) 3 >>> print f([[3.2,4.5],[24,-3.1]]) [[ 3 4] [19 0]] >>>
'''
def __init__(self, x, y, ival=0., sorted=False):
_x = N.array(x) _y = N.array(y)
N.asarray() will be a tad more efficient.
if _x.shape != _y.shape: raise ValueError, 'in StepFunction: x and y do not have the same shape' if len(_x.shape) != 1: raise ValueError, 'in StepFunction: x and y must be 1-dimensional'
self.x = N.array([-N.inf] + list(x)) self.y = N.array([ival] + list(y))
You can do this more efficiently by using numpy.concatenate() (or numpy.hstack() which happens to be equivalent in this particular case). self.x = N.hstack(([-N.inf], _x))
if not sorted: asort = N.argsort(self.x) self.x = N.take(self.x, asort) self.y = N.take(self.y, asort) self.n = self.x.shape[0]
def __call__(self, time):
tind = scipy.searchsorted(self.x, time) - 1 _shape = tind.shape if tind.shape is (): return self.y[tind] else: tmp = N.take(self.y, tind) tmp.shape = _shape return tmp
self.y[tind] will give the correct answer regardless. In numpy, you can index into an array using another array. The shape of the returned array will have the shape of the index array. -- Robert Kern robert.kern@gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
participants (2)
-
Jonathan Taylor
-
Robert Kern