[Numpy-discussion] polyfit with fixed points

Charles R Harris charlesr.harris at gmail.com
Wed Mar 6 19:29:24 EST 2013


On Wed, Mar 6, 2013 at 4:52 PM, Jaime Fernández del Río <
jaime.frio at gmail.com> wrote:

> On Tue, Mar 5, 2013 at 5:23 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>>
>>
>> On Tue, Mar 5, 2013 at 12:41 AM, Jaime Fernández del Río <
>> jaime.frio at gmail.com> wrote:
>>
>>> On Mon, Mar 4, 2013 at 8:37 PM, Charles R Harris <
>>> charlesr.harris at gmail.com> wrote:
>>>
>>>>
>>>> There are actually seven versions of polynomial fit, two for the usual
>>>> polynomial basis, and one each for Legendre, Chebyshev, Hermite, Hermite_e,
>>>> and Laguerre ;)
>>>>
>>>
>>> Correct me if I am wrong, but the fitted function is the same regardless
>>> of the polynomial basis used. I don't know if there can be numerical
>>> stability issues, but chebfit(x, y, n) returns the same as
>>> poly2cheb(polyfit(x, y, n)).
>>>
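>>> A quick sanity check of that equivalence, using the np.polynomial
>>> versions (which all take and return coefficients low order first) -- just
>>> a sketch with made-up data:
>>>
>>>     import numpy as np
>>>     from numpy.polynomial import polynomial as P, chebyshev as C
>>>
>>>     x = np.linspace(-1, 1, 20)
>>>     y = np.exp(x)
>>>     c1 = C.chebfit(x, y, 5)
>>>     c2 = C.poly2cheb(P.polyfit(x, y, 5))
>>>     print(np.allclose(c1, c2))    # True, up to roundoff
>>>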
>>> In any case, with all the already existing support for these special
>>> polynomials, it wouldn't be too hard to set the problem up to calculate the
>>> right coefficients directly for each case.
>>>
>>>
>>>> How do you propose to implement it? I think Lagrange multipliers are
>>>> overkill; I'd rather use the weights (approximate) or a change of
>>>> variable -- a permutation in this case -- followed by qr and lstsq.
>>>>
>>>
>>> The weights method is already in place, but I find it rather inelegant
>>> and unsatisfactory as a solution to this problem. But if it is deemed
>>> sufficient, then there is of course no need to go any further.
>>>
>>> I hadn't thought of any other way than using Lagrange multipliers, but
>>> looking at it in more detail, I am not sure it will be possible to
>>> formulate it in a manner that can be fed to lstsq, as polyfit does today.
>>> And if it can't, it probably wouldn't make much sense to have two different
>>> methods which cannot produce the same full output running under the same
>>> hood.
>>>
>>> I can't figure out your "change of variable" method from the succinct
>>> description, could you elaborate a little more?
>>>
>>
>> I think the place to add this is to lstsq as linear constraints. That is,
>> the coefficients must satisfy B * c = y_c for some set of equations B. In
>> the polynomial case the rows of B would be the powers of x at the points
>> you want to constrain. Then do an svd on B, B = u * d * v. Apply v to the
>> design matrix of the unconstrained points, A' = A * v^T, so that B' = B *
>> v^T becomes u * d. The coefficients are now replaced by new variables c',
>> with the constraints involving only the first few components. If there
>> are, say, 2 constraints, u * d will be 2x2. Solve that equation for the
>> first two transformed coefficients, then multiply the first two columns
>> of the design matrix A' by the result and move them to the rhs, i.e.,
>>
>>     y = y - A'[:, :2] * c'[:2]
>>
>> then solve the usual least squares problem
>>
>>     A'[:, 2:] * c'[2:] = y
>>
>> to get the rest of the transformed coefficients c'. Put the coefficients
>> back together and multiply with v^T to get
>>
>>     c = v^T * c'
>>
>
> Very nice, and works beautifully! I have tried the method you describe,
> and there are a few relevant observations:
>
>  1. It gives the exact same result as the Lagrange multiplier approach,
> which is probably expected, but I wasn't all that sure it would be the case.
>

It's equivalent, but I'm thinking in algorithmic terms, which is somewhat
more specific than the mathematical formulation.
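
In code, a rough sketch of what I mean -- assuming distinct fixed points
and fewer constraints than coefficients, with the function name and
signature made up for illustration:

    import numpy as np
    from numpy.polynomial import polynomial as P

    def polyfit_fixed(x, y, deg, xf, yf):
        # Design matrix for the free points and constraint matrix for
        # the fixed ones (rows are the powers of xf).
        A = P.polyvander(x, deg)
        B = P.polyvander(xf, deg)
        k = len(xf)                        # number of constraints

        # B = u * diag(d) * vt; change variables with c = vt.T @ cp, so
        # that the constraints involve only the first k entries of cp.
        u, d, vt = np.linalg.svd(B)
        Ap = A @ vt.T

        # The constraints reduce to the k x k system (u * d) @ cp[:k] = yf.
        cp = np.empty(deg + 1)
        cp[:k] = np.linalg.solve(u * d, yf)

        # Move the known part to the rhs and least-squares the rest.
        rhs = y - Ap[:, :k] @ cp[:k]
        cp[k:] = np.linalg.lstsq(Ap[:, k:], rhs, rcond=None)[0]

        # Back to ordinary coefficients, low order first.
        return vt.T @ cp

    # e.g. a cubic fit forced through (0, 0) and (1, 1):
    # c = polyfit_fixed(x, y, 3, np.array([0., 1.]), np.array([0., 1.]))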


>  2. The result also seems to be what the sequence of fits with
> increasing weights on the fixed points converges to. This image
> http://i1092.photobucket.com/albums/i412/jfrio/image.png is an example.
> In there:
>      * blue crosses are the data points to fit to
>      * red points are the fixed points
>      * blue line is the standard polyfit
>      * red line is the constrained polyfit
>      * cyan, magenta, yellow and black are polyfits with weights of 2, 4,
> 8, 16 for the fixed points, 1 for the rest
>
> Given this last point, probably the cleanest, least disruptive
> implementation would be to allow np.inf values in the weights parameter;
> these would get filtered out and dealt with in the above manner.
>
>
Interesting idea, I like it. It is less general, but probably all that is
needed for polynomial fits. I suppose that after you pull out the relevant
rows you can set their weights to zero, so that they have no effect (beyond
roundoff) on the remaining fit and you don't need to rewrite the design
matrix.
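
Something like this for the bookkeeping, say (a sketch with made-up data):

    import numpy as np

    x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
    y = np.array([1.0, 1.2, 2.1, 2.9, 4.2])
    w = np.array([1.0, 1.0, np.inf, 1.0, np.inf])   # inf marks fixed points

    fixed = np.isinf(w)
    xf, yf = x[fixed], y[fixed]    # these become the rows of B
    w = np.where(fixed, 0.0, w)    # zero weight: no effect on the lstsq part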


> So I have two questions:
>
>  1. Does this make sense? Or would it be better to make it more explicit,
> with a 'fixed_points' keyword argument defaulting to None?
>  2. Once I have this implemented, documented and tested... How do I go
> about submitting it for consideration? Would a patch be the way to go, or
> should I fork?
>
>
A fork is definitely the way to go. That makes it easy for folks to review
the code and tell you everything you did wrong ;)

I think adding linear constraints to lstsq would be good, then the upper
level routines can make use of them. Something like a new argument
constraints=(B, y), with None the default.
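
Hypothetically -- this is not an existing signature, just what the call
might look like:

    # B has one row of powers per constrained point, y_c the values there.
    c, res, rank, sv = np.linalg.lstsq(A, y, constraints=(B, y_c))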

Chuck