[Matrix-SIG] Bug in LinearAlgebra.linear_least_squares()

Greg Kochanski gpk@bell-labs.com
Mon, 31 Jan 2000 12:30:55 -0500


The documentation string states:
	If the rank of A is less than the number of columns of A
	or greater than the numer of rows, then residuals will be
	returned as an empty array otherwise resids = sum((b-dot(A,x)**2).

This is not quite correct. 
One can have a fit that has a low rank,
(i.e., the basis functions are degenerate) that still
doesn't completely explain the data.

For instance, fitting the sequence [1.1, 2, 3, 4]
with basis functions [1, 2, 3, 4] and [0, 0, 0, 0]
will yield a rank of 1,
and ought to yield a nonzero residual.
Granted, LAPACK is the lazy one, failing to calculate
the residual,
and the LinearAlgebra module is just passing it on,
but the absence of the residual
ought to be fixed or at least noted as a bug in the
documentation.


>>> 
>>> a
array([[ 1.,  0.],
       [ 2.,  0.],
       [ 3.,  0.],
       [ 4.,  0.]])
>>> b
array([ 1.1,  2. ,  3. ,  4. ])
>>> 
>>> LinearAlgebra.linear_least_squares(a, b)
(array([ 1.00333333,  0.        ]),
	zeros((0,), 'd'),
	1,
	array([ 5.47722558,  0.        ]))
>>>
>>> Numeric.matrixmultiply(a, [ 1.00333333,  0.        ])
array([ 1.00333333,  2.00666666,  3.00999999,  4.01333332])
>>> # Note that we don't quite recover the original sequence,
# so the residuals are nonzero.