
Hi,

In brain imaging (my world) we need to do 1D interpolation of large datasets. Specifically, we have data on the order of (X, Y) = (200, 4000), where we are interpolating at the same ~200 X locations for each of the 4000 rows.

We need to do this with extrapolation at the boundaries, so I think we can't use scipy.interpolate.interp1d, which is otherwise well suited for the task. scipy.interpolate.InterpolatedUnivariateSpline does deal with extrapolation, but only works with vectors.

I think, in order to make speed reasonable, I will need to loop over the rows in compiled code. My questions are:

a) how easy would it be to try and link some external Cython code against the fitpack code in scipy?
b) if that is not easy, what can I do that will also be of most use to other scipy users?

Cheers,
Matthew
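For readers following along, a minimal sketch of the situation being described (shapes and data are made up for illustration): InterpolatedUnivariateSpline extrapolates, but accepts only a 1-D y, so the obvious approach is a Python-level loop over the 4000 rows, which is exactly the loop Matthew would like to push into compiled code.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

# Toy data in the shapes described: ~200 sample points, 4000 rows of y.
rng = np.random.default_rng(0)
x = np.sort(rng.random(200))
y = rng.random((4000, 200))          # one spline fit needed per row
xp = np.linspace(-0.05, 1.05, 200)   # includes points outside [x[0], x[-1]]

# InterpolatedUnivariateSpline extrapolates by default, but only accepts
# a 1-D y -- hence the per-row Python loop this thread is trying to avoid.
yp = np.empty((y.shape[0], xp.size))
for i, row in enumerate(y):
    yp[i] = InterpolatedUnivariateSpline(x, row, k=3)(xp)

print(yp.shape)  # (4000, 200)
```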

What's the interpolation order?

* Warren recently added nd functionality to splev: https://github.com/scipy/scipy/pull/4059
* Have you tried polynomial interpolators? PPoly, BPoly.from_derivatives might be what you are after

On Tue, Apr 14, 2015 at 11:12 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev

Hi,

On Tue, Apr 14, 2015 at 3:24 PM, Evgeni Burovski <evgeny.burovskiy@gmail.com> wrote:
> What's the interpolation order?

Cubic spline, linear, nearest neighbor, in that order of frequency.

> * Warren recently added nd functionality to splev: https://github.com/scipy/scipy/pull/4059

A quick scan suggested this allows an N-D array of x input values, whereas I want to evaluate a 1D array of x input values for a 2- or N-D array of y values - is that right?

> * Have you tried polynomial interpolators? PPoly, BPoly.from_derivatives might be what you are after

Forgive my terrible ignorance, but is it easy to do cubic spline interpolation using these classes?

Thanks for the reply,
Matthew

On Tue, Apr 14, 2015 at 6:39 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
> A quick scan suggested this allows an N-D array of x input values, whereas I want to evaluate a 1D array of x input values for 2- or N-D array of y values - is that right?
If you have the same 200 x values for all the y interpolation, then you can reuse the projection matrix, which is only 200x200. Similar idea as in Savitzky-Golay.

It would be just one big dot product, 200x200 dot 200x4000, after creating the projection matrix.

(not a how-to-use-scipy answer)

Josef
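As a concrete sketch of Josef's suggestion: for piecewise-linear interpolation the projection matrix is easy to write down explicitly, since each output point is a weighted sum of its two neighbours and the weights depend only on x and xp, not on y. (Linear weights are an assumption here for simplicity; Josef's idea is stated more generally.)

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.random(200))
xp = np.linspace(x[0], x[-1], 200)
Y = rng.random((200, 4000))          # all 4000 columns share the same x grid

# Build the (len(xp), len(x)) projection matrix P for piecewise-linear
# interpolation: each row of P has two nonzero barycentric weights.
idx = np.clip(np.searchsorted(x, xp) - 1, 0, len(x) - 2)
w = (xp - x[idx]) / (x[idx + 1] - x[idx])
P = np.zeros((len(xp), len(x)))
P[np.arange(len(xp)), idx] = 1 - w
P[np.arange(len(xp)), idx + 1] = w

# One big dot product interpolates every column at once; points outside
# [x[0], x[-1]] would be linearly extrapolated from the edge segments.
Yp = P @ Y                            # shape (200, 4000)
```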

> * Have you tried polynomial interpolators? PPoly, BPoly.from_derivatives might be what you are after
> Forgive my terrible ignorance, but is it easy to do cubic spline interpolation using these classes?
If you are prepared to estimate the derivatives somehow, then yes. Akima1DInterpolator and PchipInterpolator do it (pchip is a mess because of the axis argument). Otherwise, it's probably easier to have the b-splines PR merged than to add a Cython shim to the fitpack Fortran code, or to add the functionality directly in Fortran (which is also doable).
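A sketch of the kind of thing Evgeni is pointing at: PchipInterpolator accepts an N-D y via the axis argument and extrapolates, which covers Matthew's shapes directly. (Caveat: pchip is C1 monotone Hermite interpolation, not the interpolating cubic spline Matthew asked about, and this assumes a scipy with the extrapolate keyword.)

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

rng = np.random.default_rng(2)
x = np.sort(rng.random(200))
y = rng.random((4000, 200))           # 4000 rows sharing the same x grid

# axis=1 says the interpolation axis of y is the one matching len(x);
# extrapolate=True evaluates the edge polynomials outside the data range.
f = PchipInterpolator(x, y, axis=1, extrapolate=True)

xp = np.linspace(-0.05, 1.05, 250)
yp = f(xp)                            # interpolation axis replaced by xp's shape
print(yp.shape)  # (4000, 250)
```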

On 4/14/15 16:12, Matthew Brett wrote:
> In brain imaging (my world) - we need to do 1D interpolation of large datasets, specifically, we have data in the order of (X, Y) = (200, 4000), where we are interpolating at the same ~200 X locations for each of the 4000 rows.
I am not sure if I understand your problem statement. Most interpolation problems are posed as: given data (x, y), determine yp at xp, where xp are different from x. This is fine for cases where you have a single y and possibly multiple sets xp.

I had a case a while ago where I had a single xp and multiple y (at the same x). In this case, the problem statement is: given known locations x and interpolant locations xp, what are the values yp (corresponding to xp) given y (corresponding to x)? To solve this problem efficiently, I wrote the code I posted here:

http://scipy-central.org/item/21/1/simple-piecewise-polynomial-interpolation

Although it is not true spline interpolation (rather a simpler local polynomial interpolation), it is reasonably fast and accurate. Extrapolation is allowed. However, I later found that scipy.interpolate.splrep/splev are so fast that they could be used instead with equivalent speed for my problem. By comparison, scipy.interpolate.interp1d is quite slow.

HTH,
Jonathan
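The splrep/splev approach Jonathan mentions can be sketched as a plain per-row loop (row count shrunk here for illustration); splev's default ext=0 extrapolates outside the data range, and s=0 makes splrep interpolate rather than smooth.

```python
import numpy as np
from scipy.interpolate import splrep, splev

rng = np.random.default_rng(3)
x = np.sort(rng.random(200))
Y = rng.random((400, 200))            # a few hundred rows for illustration
xp = np.linspace(-0.05, 1.05, 200)    # includes points outside [x[0], x[-1]]

# Per-row splrep/splev loop: s=0 gives interpolation (not smoothing),
# and splev's default ext=0 extrapolates beyond the data range.
Yp = np.array([splev(xp, splrep(x, row, k=3, s=0)) for row in Y])

print(Yp.shape)  # (400, 200)
```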

On Wed, Apr 15, 2015 at 10:33 AM, Jonathan Stickel <jjstickel@gmail.com> wrote:
> To solve this problem efficiently, I wrote the code I posted here:
> http://scipy-central.org/item/21/1/simple-piecewise-polynomial-interpolation
nice,

3 questions:

Can you construct P = Xp*inv(X)*H directly from the block diagonal parts instead of going through sparse block-diagonal matrices?

Why do you need monotonically increasing x? I would have thought it works for any x. (I'm not clear about this.)

My guess is that replacing inv by pinv would be more stable. But would pinv also be able to handle the case when there are some identical x (by implicitly taking an average)?

Josef

On Wed, Apr 15, 2015 at 11:04 AM, <josef.pktd@gmail.com> wrote:
> My guess is that replacing inv by pinv would be more stable. But, would pinv also be able to handle the case when there are some identical x (by implicitly taking an average)?
Something is wrong in the way I think about this. (I might be thinking about a different use case where I was recently using Kernel Ridge Regression.) Sorry for any noise.

Josef

On Wed, Apr 15, 2015 at 11:24 AM, <josef.pktd@gmail.com> wrote:
> Something is wrong in the way I think about this. (I might be thinking about a different use case where I was recently using Kernel Ridge Regression.)
The projection matrix for interpolation would need to have shape (number of new points to be interpolated, number of sample points).

Josef

On 4/15/15 09:24, josef.pktd@gmail.com wrote:
> Can you construct P = Xp*inv(X)*H directly from the block diagonal parts instead of going through sparse block-diagonal matrices?
Perhaps. It has been 4 years since I worked on this, so I don't remember all the details. What I did was use a loop (via list comprehension for speed) to calculate the inverses of all the small block matrices, then constructed the sparse Xinv. There may very well be a more elegant approach.
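What Jonathan describes might look roughly like this sketch (block count and size are hypothetical, and the diagonally dominant blocks are an assumption to keep them well conditioned): invert each small block directly, then assemble the sparse block-diagonal inverse.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(4)

# Hypothetical setup: 50 small 4x4 blocks, as in a local-polynomial fit.
blocks = [rng.random((4, 4)) + 4 * np.eye(4) for _ in range(50)]

# Invert each small block directly (cheap, and stable for these
# diagonally dominant blocks), then assemble the sparse inverse.
Xinv = sp.block_diag([np.linalg.inv(b) for b in blocks], format="csr")

# The block-diagonal X itself, for reference: Xinv is its exact inverse.
X = sp.block_diag(blocks, format="csr")
```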
> Why do you need monotonically increasing x? I would have thought it works for any x
> (I'm not clear about this)
I don't remember why x needs to be monotonic. I am usually careful to consider these things, so I am sure that there were at least nontrivial implementation problems if x is scattered (or repeated). Or maybe I just assumed that it would be easy for users (me!) to sort the data first and didn't want to take the time to worry about it.
> My guess is that replacing inv by pinv would be more stable. But, would pinv also be able to handle the case when there are some identical x (by implicitly taking an average)?
I am not so knowledgeable here. By taking the direct inverse of the individual small block matrices, I think the inverse method I used is quite stable.
> Something is wrong in the way I think about this. (I might be thinking about a different use case where I was recently using Kernel Ridge Regression.)
> Sorry for any noise
No worries. Nice to see some interest.

Jonathan

Hi,

On Wed, Apr 15, 2015 at 7:33 AM, Jonathan Stickel <jjstickel@gmail.com> wrote:
> I am not sure if I understand your problem statement. Most interpolation problems are posed as: given data (x, y), determine yp at xp, where xp are different from x. This is fine for cases where you have a single y and possibly multiple sets xp. I had a case a while ago where I had a single xp and multiple y (at the same x). In this case, the problem statement is: given known locations x and interpolant locations xp, what are the values yp (corresponding to xp) given y (corresponding to x)?
Sorry not to be clear; that is my case too. (1D x, xp, ND y, yp).
> To solve this problem efficiently, I wrote the code I posted here:
> http://scipy-central.org/item/21/1/simple-piecewise-polynomial-interpolation
Thanks for the pointer.
> Although it is not true spline interpolation (rather a simpler local polynomial interpolation), it is reasonably fast and accurate. Extrapolation is allowed. However, I later found that scipy.interpolate.splrep/splev are so fast that they could be used instead with equivalent speed for my problem. By comparison, scipy.interpolate.interp1d is quite slow.
Just to check - are you looping over vectors in y, and using splrep/splev on each vector?

Cheers,
Matthew

On 4/15/15 11:57, Matthew Brett wrote:
> Just to check - are you looping over vectors in y, and using splrep/splev on each vector?
Essentially, yes. My application was inside a system of ODEs, which was called many times by an ODE solver.

Jonathan

On Wed, Apr 15, 2015 at 6:57 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
> Sorry not to be clear; that is my case too. (1D x, xp, ND y, yp).
With https://github.com/scipy/scipy/pull/3174:
>>> import numpy as np
>>> from scipy.interpolate import BSpline, make_interp_spline
>>> n, k = 22, 3
>>> x = np.sort(np.random.random(n))
>>> y = np.random.random((n, 4, 5, 6))
>>> tck = make_interp_spline(x, y)
>>> tck[-1]
3
>>> spl = BSpline(*tck)
>>> spl.extrapolate
True
>>> xp = np.random.random((51, 3))
>>> spl(xp).shape
(51, 3, 4, 5, 6)
tck format is consistent with splev:
>>> from scipy.interpolate import splev
>>> from numpy.testing import assert_allclose
>>> cc = spl.c.reshape((22, -1))
>>> for j in range(cc.shape[1]):
...     tck_ = (spl.t, cc[:, j], spl.k)
...     y0 = splev(0.5, tck_)
...     y1 = BSpline(*tck_)(0.5)
...     assert_allclose(y0, y1)
participants (4):
- Evgeni Burovski
- Jonathan Stickel
- josef.pktd@gmail.com
- Matthew Brett