Hi all, I'd like to be able to do spline regression in patsy[1], which means that I need to be able to compute b-spline basis functions. I am not an initiate into the mysteries of practical spline computations, but I *think* the stuff in scipy.signal is not quite usable as is, because it's focused on doing interpolation directly rather than exposing the basis functions themselves? Specifically, to achieve feature parity with R [2], I need to be able to take - an arbitrary order - an arbitrary collection of knot positions (which may be irregularly spaced) - a vector x of points at which to evaluate the basis functions and spit out the value of each spline basis function evaluated at each point in the x vector. It looks like scipy.signal.bspline *might* be useful, but I can't quite tell? Or alternatively someone might have some code lying around to do this already? Basically I have a copy of Schumaker here and I'm hoping someone will save me from having to read it :-). -n [1] https://github.com/pydata/patsy/ [2] http://stat.ethz.ch/R-manual/R-devel/library/splines/html/bs.html http://stat.ethz.ch/R-manual/R-devel/library/splines/html/ns.html
On Tue, Jul 31, 2012 at 9:37 AM, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I'd like to be able to do spline regression in patsy[1], which means that I need to be able to compute b-spline basis functions. I am not an initiate into the mysteries of practical spline computations, but I *think* the stuff in scipy.signal is not quite usable as is, because it's focused on doing interpolation directly rather than exposing the basis functions themselves?
Specifically, to achieve feature parity with R [2], I need to be able to take - an arbitrary order - an arbitrary collection of knot positions (which may be irregularly spaced) - a vector x of points at which to evaluate the basis functions and spit out the value of each spline basis function evaluated at each point in the x vector.
It looks like scipy.signal.bspline *might* be useful, but I can't quite tell? Or alternatively someone might have some code lying around to do this already?
Basically I have a copy of Schumaker here and I'm hoping someone will save me from having to read it :-).
I have this floating around def splvander(x, deg, knots): """Vandermonde type matrix for splines. Returns a matrix whose columns are the values of the b-splines of deg `deg` associated with the knot sequence `knots` evaluated at the points `x`. Parameters ---------- x : array_like Points at which to evaluate the b-splines. deg : int Degree of the splines. knots : array_like List of knots. The convention here is that the interior knots have been extended at both ends by ``deg + 1`` extra knots. Returns ------- vander : ndarray Vandermonde like matrix of shape (m,n), where ``m = len(x)`` and ``m = len(knots) - deg - 1`` Notes ----- The knots exending the interior points are usually taken to be the same as the endpoints of the interval on which the spline will be evaluated. """ m = len(knots) - deg - 1 v = np.zeros((m, len(x))) d = np.eye(m, len(knots)) for i in range(m): v[i] = spl.splev(x, (knots, d[i], deg)) return v.T
On Tue, Jul 31, 2012 at 9:46 AM, Charles R Harris <charlesr.harris@gmail.com
wrote:
On Tue, Jul 31, 2012 at 9:37 AM, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I'd like to be able to do spline regression in patsy[1], which means that I need to be able to compute b-spline basis functions. I am not an initiate into the mysteries of practical spline computations, but I *think* the stuff in scipy.signal is not quite usable as is, because it's focused on doing interpolation directly rather than exposing the basis functions themselves?
Specifically, to achieve feature parity with R [2], I need to be able to take - an arbitrary order - an arbitrary collection of knot positions (which may be irregularly spaced) - a vector x of points at which to evaluate the basis functions and spit out the value of each spline basis function evaluated at each point in the x vector.
It looks like scipy.signal.bspline *might* be useful, but I can't quite tell? Or alternatively someone might have some code lying around to do this already?
Basically I have a copy of Schumaker here and I'm hoping someone will save me from having to read it :-).
I have this floating around
def splvander(x, deg, knots): """Vandermonde type matrix for splines.
Returns a matrix whose columns are the values of the b-splines of deg `deg` associated with the knot sequence `knots` evaluated at the points `x`.
Parameters ---------- x : array_like Points at which to evaluate the b-splines. deg : int Degree of the splines. knots : array_like List of knots. The convention here is that the interior knots have been extended at both ends by ``deg + 1`` extra knots.
Returns ------- vander : ndarray Vandermonde like matrix of shape (m,n), where ``m = len(x)`` and ``m = len(knots) - deg - 1``
Notes ----- The knots exending the interior points are usually taken to be the same as the endpoints of the interval on which the spline will be evaluated.
""" m = len(knots) - deg - 1 v = np.zeros((m, len(x))) d = np.eye(m, len(knots)) for i in range(m): v[i] = spl.splev(x, (knots, d[i], deg)) return v.T
With this import from scipy.interpolate import fitpack as spl Chuck
On Tue, Jul 31, 2012 at 4:46 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Tue, Jul 31, 2012 at 9:37 AM, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I'd like to be able to do spline regression in patsy[1], which means that I need to be able to compute b-spline basis functions. I am not an initiate into the mysteries of practical spline computations, but I *think* the stuff in scipy.signal is not quite usable as is, because it's focused on doing interpolation directly rather than exposing the basis functions themselves?
Specifically, to achieve feature parity with R [2], I need to be able to take - an arbitrary order - an arbitrary collection of knot positions (which may be irregularly spaced) - a vector x of points at which to evaluate the basis functions and spit out the value of each spline basis function evaluated at each point in the x vector.
It looks like scipy.signal.bspline *might* be useful, but I can't quite tell? Or alternatively someone might have some code lying around to do this already?
Basically I have a copy of Schumaker here and I'm hoping someone will save me from having to read it :-).
I have this floating around [...] v[i] = spl.splev(x, (knots, d[i], deg))
This looks fabulous, thank you! From a quick look it seems to be producing numerically identical results to R's spline.des(). The bit with the coefficient vectors like [0, 0, 0, 1, 0] worries me a bit -- do you know if it's producing the entire spline basis internally on every call, and then throwing away most of them when forming the inner product? If so then in practice it's probably not a show-stopper, but it would be a pointless factor-of-len(knots) increase in running time for no reason. -n
On Wed, Aug 1, 2012 at 10:53 AM, Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Jul 31, 2012 at 4:46 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Tue, Jul 31, 2012 at 9:37 AM, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I'd like to be able to do spline regression in patsy[1], which means that I need to be able to compute b-spline basis functions. I am not an initiate into the mysteries of practical spline computations, but I *think* the stuff in scipy.signal is not quite usable as is, because it's focused on doing interpolation directly rather than exposing the basis functions themselves?
Specifically, to achieve feature parity with R [2], I need to be able to take - an arbitrary order - an arbitrary collection of knot positions (which may be irregularly spaced) - a vector x of points at which to evaluate the basis functions and spit out the value of each spline basis function evaluated at each point in the x vector.
It looks like scipy.signal.bspline *might* be useful, but I can't quite tell? Or alternatively someone might have some code lying around to do this already?
Basically I have a copy of Schumaker here and I'm hoping someone will save me from having to read it :-).
I have this floating around [...] v[i] = spl.splev(x, (knots, d[i], deg))
This looks fabulous, thank you! From a quick look it seems to be producing numerically identical results to R's spline.des().
The bit with the coefficient vectors like [0, 0, 0, 1, 0] worries me a bit -- do you know if it's producing the entire spline basis internally on every call, and then throwing away most of them when forming the inner product? If so then in practice it's probably not a show-stopper, but it would be a pointless factor-of-len(knots) increase in running time for no reason.
Yes it is. It was a quick and dirty routine and producing the array wasn't the bottleneck. Also, since the b-splines themselves have small support, a lot of extra zero values are produced. It wouldn't be difficult to improve things quite a bit using a hacked version of the splev.f routine (De Boor's algorithm <http://en.wikipedia.org/wiki/De_Boor%27s_algorithm>). IIRC, the b-splines in signal are periodic b-splines with reflected boundary conditions, so aren't as general. I could be wrong about that, however. Chuck
On Tue, Jul 31, 2012 at 11:37 AM, Nathaniel Smith <njs@pobox.com> wrote:
Hi all,
I'd like to be able to do spline regression in patsy[1], which means that I need to be able to compute b-spline basis functions. I am not an initiate into the mysteries of practical spline computations, but I *think* the stuff in scipy.signal is not quite usable as is, because it's focused on doing interpolation directly rather than exposing the basis functions themselves?
Specifically, to achieve feature parity with R [2], I need to be able to take - an arbitrary order - an arbitrary collection of knot positions (which may be irregularly spaced) - a vector x of points at which to evaluate the basis functions and spit out the value of each spline basis function evaluated at each point in the x vector.
It looks like scipy.signal.bspline *might* be useful, but I can't quite tell? Or alternatively someone might have some code lying around to do this already?
Josef will know more about this. I think he cleaned it up to work with scipy instead of the segfaulting C code we had. We've been carrying it around for a while, but I haven't had a chance to brush up yet. https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/b... Skipper
participants (4)
-
Charles R Harris -
Nathaniel Smith -
Neal Becker -
Skipper Seabold