[Tutor] least squares

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Fri Dec 17 20:22:50 CET 2004



On Thu, 16 Dec 2004, mdcooper wrote:


> I am trying to get a corrolation between a large number of variables and for
> many similar equations :
>    (Ca1 * xa^2) + (Ca2 * ya^2) + (Ca3 * za^2) + ... = ta
>    (Cb1 * xb^2) + (Cb2 * yb^2) + (Cb3 * zb^2) + ... = tb
>
> which is solved to get:
>    (C1 * x^2) + (C2 * y^2) + (C3 * z^2) + ... = t
>
>     where the submitted values of x,y, and z should give the correct t
>
> and I am using the code to get C1, C2, C3, .... These constants are not
> allowed to be negative and the submitted equations are such that there
> is no reason for the values to be negative, and although a good fit may
> allow them to be negative, another good fit will be found if they are
> all positive.


Hi Matt,


Ok.  I have to admit that my mathematical maturity is actually a little
low, but I'll try to follow along.  But this really doesn't sounds like a
problem specific to Python though, but more of a math problem.

So you may actually want to talk with folks with a real math background; I
would not be mathematically mature enough to know if your code is correct.
I'd strongly recommend talking to folks on the 'sci.math' or
'sci.math.num-analysis' newsgroups.


I haven't seen your code, so I'm still in the dark about how it works or
what its capabilities are.  I'll make a few assumptions that might not be
right, but I have to start somewhere.  *grin*

Does your code provide a way at getting at all possible solutions, or only
one particular solution?  If so, then you can just filter out for
solutions that satisfy the property you want.

For example, we can produce a function that generates all squares in the
world:

###
>>> import itertools
>>> def allSquares():
...     for i in itertools.count():
...         yield i*i
...
>>> squares = allSquares()
>>> squares.next()
0
>>> squares.next()
1
>>> squares.next()
4
>>> squares.next()
9
>>> squares.next()
16
>>> squares.next()
25
###


I can keep calling 'next()' on my 'squares' iteration, and keep getting
squares.  Even though this can produce an infinite sequence of answers, we
can still apply a filter on this sequence, and pick out the ones that are
"palindromic" in terms of their digits:

###
>>> def palindromic(n):
...    "Returns true if n is 'palindromic'."
...    return str(n) == str(n)[::-1]
...
>>> filteredSquares = itertools.ifilter(palindromic, allSquares())
>>> filteredSquares.next()
0
>>> filteredSquares.next()
1
>>> filteredSquares.next()
4
>>> filteredSquares.next()
9
>>> filteredSquares.next()
121
>>> filteredSquares.next()
484
>>> filteredSquares.next()
676
###


This sequence-filtering approach takes advantage of Python's ability to
work on a iteration of answers.  If your program can be written to produce
an infinite stream of answers, and if a solution set with all positive
coefficients is inevitable in that stream, then you can take this
filtering approach, and just capture the first solution that matches your
constraints.



Similarly, if your program only produces a single solution, does it do so
through an "iterative" algorithm?  By iterative, I mean: does it start off
with an initial guess and apply a process to improve that guess until the
solution is satisfactory?

For others on the Tutor list, here is an "iterative" way to produce the
square root of a number:

###
def mysqrt(x):
    guess = 1.0  ## initial guess
    while not goodEnough(guess, x):
        guess = improve(guess, x)
    return guess

def improve(guess, x):
    """Improves the guess of the square root of x."""
    return average(guess, x / guess)

def average(x, y):
    return (x + y) / 2.0

def goodEnough(guess, x):
    """Returns true if guess is close enough to the square root of x."""
    return abs(guess**2 - x) < 0.00001
###

(adapted/ripped off from material in SICP:
http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-10.html#%_sec_1.1.7)



If your program tries to solve the problem through an iterative process,
then, again, does it inevitably produce a solution where all the constant
coefficients 'Cn' are positive?  If so, maybe you can just continue to
produce better and better solutions until its satisfactory, until all the
coefficients are positive.


Otherwise, without looking at your code, I'm stuck.  *grin* And even if I
do see your code, I might be stuck still.  If your code is short, feel
free to post it up, and we'll see how far we can get.  But you really may
want to talk with someone who has a stronger math background.

Good luck to you!



More information about the Tutor mailing list