[SciPy-User] Orthogonal distance regression in 3D

Владимир draft2008 at bk.ru
Tue Mar 6 03:22:15 EST 2012




05 March 2012, 14:59, from Robert Kern <robert.kern at gmail.com>:
> On Mon, Mar 5, 2012 at 10:26, Владимир <draft2008 at bk.ru> wrote:
> > 02 March 2012, 15:49, from Robert Kern <robert.kern at gmail.com>:
> >> On Fri, Mar 2, 2012 at 06:02, Владимир <draft2008 at bk.ru> wrote:
> >> > Hello!
> >> > I'm working with orthogonal distance regression (scipy.odr).
> >> > I am trying to fit a curve to a point cloud (3D), but it doesn't work
> >> > properly; it returns wrong results.
> >> >
> >> > For example, I want to fit the simple curve y = a*x + b*z + c to some point
> >> > cloud (y_data, x_data, z_data)
> >> >
> >> >
> >> > import numpy as np
> >> > from scipy.odr import Model, Data, ODR
> >> >
> >> > def func(p, input):
> >> >     x, z = input
> >> >     x = np.array(x)
> >> >     z = np.array(z)
> >> >     return p[0]*x + p[1]*z + p[2]
> >> >
> >> > initialGuess = [1, 1, 1]
> >> > myModel = Model(func)
> >> > myData = Data([x_data, z_data], y_data)
> >> > myOdr = ODR(myData, myModel, beta0=initialGuess)
> >> > myOdr.set_job(fit_type=0)
> >> > out = myOdr.run()
> >> > print out.beta
> >> >
> >> > It works perfectly in two dimensions (two axes), but in three dimensions the
> >> > results are not even close to the real ones; moreover, it is very sensitive
> >> > to the initial guess, so it returns a different result even if I change
> >> > initialGuess from [1, 1, 1] to [0.99, 1, 1].
> >> >
> >> > What do I do wrong?
> >>
> >> Can you provide a complete runnable example including some data? Note
> >> that if you do not specify any errors on your data, they are assumed
> >> to correspond to a standard deviation of 1 for all dimensions. If that
> >> is wildly different from the actual variance around the "true"
> >> surface, then it might lead the optimizer astray.
> >>
> >> --
> >> Robert Kern
> >>
> >
> > I wonder why the results change when I change the initial guess. As it stands, the result depends directly on the initial guess. This is wrong.
> >
> > Here is an example (sorry for the huge array of data, but it's important to show what happens with it):
> >
> > import numpy as np
> > from scipy.odr import *
> > from math import *
> 
> [snip]
> 
> > def funcReturner(p, input):
> >     input = np.array(input)
> >     x = input[0]
> >     z = input[1]
> >     return 10**(p[0]*x + p[1]*z + p[2])
> 
> Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear
> problem you initially asked about. Setting the uncertainties
> accurately on all axes of your data is essential. Do you really know
> what they are? It's possible that you want to try fitting a plane to
> np.log10(y_data) instead.
> 
> > myModel = Model(funcReturner)
> > myData = Data([x_data,z_data], y_data)
> > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02,  1.75])
> > myOdr.set_job(fit_type=0)
> > out = myOdr.run()
> > result = out.beta
> >
> > print "Optimal coefficients: ", result
> >
> > I tried to specify sx, sy, we, wd, delta, everything: I get better results, but they are still not what I need, and they still depend directly on the initial guess as well.
> > If I set the initial guess to [1, 1, 1], it fails to find any close solution and returns a totally wrong result with a huge residual variance like 3.21014784829e+209.
> 
> For such a nonlinear problem, finding reasonable initial guesses is
> useful. There is also a maximum iteration limit defaulting to a fairly
> low 50. Check out.stopreason to see if it actually converged or just
> ran into the iteration limit. You can keep calling myOdr.restart()
> until it converges. If I start with beta0=[1,1,1], it converges
> somewhere between 300 and 400 iterations.
> 
> --
> Robert Kern
> 

Yeah, increasing the number of iterations (the maxit parameter) makes the results slightly more accurate, but not really better. I mean, even when I get the stop reason to be "sum square convergence", the results are even worse. But I tried to fit the converted function, like you recommended - np.log10(y_data) - and it gave me the proper results. Why does that happen, and is it possible to achieve these results without the conversion?
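For reference, a minimal sketch of the log10-transformed fit I mean (assuming x_data, z_data and y_data are the arrays from my previous message; the beta0 values are just the ones I used above, and the function name is only an illustration):

import numpy as np
from scipy.odr import Model, Data, ODR

def plane(p, inp):
    # After taking log10 of y, the model is linear in the parameters:
    #     log10(y) = p[0]*x + p[1]*z + p[2]
    x = np.asarray(inp[0])
    z = np.asarray(inp[1])
    return p[0]*x + p[1]*z + p[2]

myModel = Model(plane)
myData = Data([x_data, z_data], np.log10(y_data))
myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, 1.75])
myOdr.set_job(fit_type=0)
out = myOdr.run()
print out.beta   # compare with the coefficients of the original 10**(...) model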

I could keep using the converted function, but the problem is that I have a whole list of different functions to fit, and I'd like to create a universal fitter for all of them.
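Roughly, what I have in mind for the universal fitter is something like the sketch below. The function name and arguments are made up; the sx/sy uncertainties are optional, following your earlier point about specifying the errors, and the restart loop follows your suggestion to keep restarting until convergence:

import numpy as np
from scipy.odr import Model, RealData, ODR

def fit_odr(func, x, y, beta0, sx=None, sy=None, maxit=50, max_restarts=10):
    # func is any model callable of the ODR form func(beta, x).
    # sx/sy are optional standard deviations of the inputs and responses;
    # RealData converts them to the weights that ODRPACK expects.
    data = RealData(x, y, sx=sx, sy=sy)
    odr = ODR(data, Model(func), beta0=beta0, maxit=maxit)
    odr.set_job(fit_type=0)   # full orthogonal distance regression
    out = odr.run()
    # Keep restarting while ODRPACK stopped only because of the iteration limit.
    restarts = 0
    while 'Iteration limit reached' in out.stopreason and restarts < max_restarts:
        out = odr.restart()
        restarts += 1
    return out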

