[Numpy-discussion] Inversion of near singular matrices.
Sturla Molden
sturla at molden.no
Sat Jan 29 17:10:30 EST 2011
On 29.01.2011 12:40, Algis Kabaila wrote:
> So my question is: how can one reliably detect singularity (or
> near singularity) and raise an exception?
Use an SVD and examine the singular values. One or more small singular
values indicate ill-conditioning. (What constitutes a small singular
value is a debate I'll avoid.) The SVD will also let you fix the problem
when inverting the matrix, simply by truncating the small singular
values to zero, which gives the pseudoinverse. SVD is slow, but the
advantage is that it cannot fail. If SVD is deemed too slow, you might
look for a more specialized solution. But always try to understand the
problem; don't just reach for SVD as a universal answer to any
ill-conditioned matrix problem, however tempting that is.
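Here is a minimal sketch of the detect-and-raise idea, using only
numpy.linalg (the function name and the rcond threshold of 1e-12 are my
own choices, so adjust them for your problem):

import numpy as np

def solve_svd(A, b, rcond=1e-12):
    # A = U @ diag(s) @ Vt; the singular values s (sorted in
    # descending order) expose the conditioning of A.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    if s[-1] <= rcond * s[0]:
        raise np.linalg.LinAlgError("matrix is singular to working precision")
    # Invert through the SVD: x = V @ diag(1/s) @ U.T @ b
    return Vt.T @ ((U.T @ b) / s)

If you would rather truncate than raise, np.linalg.pinv(A, rcond)
implements exactly the truncate-to-zero pseudoinverse mentioned above.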
In statistics we sometimes see ill-conditioning of covariance matrices.
Another way to deal with multicollinearity, besides SVD/PCA, is
regularisation: simply adding a small bias k*I to the diagonal might fix
the problem (cf. ridge regression, and the sketch below). In the
Levenberg-Marquardt algorithm used to fit non-linear least squares
models (cf. scipy.optimize.leastsq), the bias k added to the diagonal of
J^T J is adjusted adaptively.

One might also know in advance that a covariance matrix could be
ill-conditioned (the number of samples is small compared to the number
of dimensions) or singular (fewer samples than parameters). That is,
sometimes we don't even need to look at the matrix to give the correct
diagnosis.
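To illustrate the ridge idea above, a minimal sketch (the function name,
the default k, and the choice to scale the bias by the mean of the
diagonal are all my own conventions, not a standard API):

import numpy as np

def regularised_inverse(C, k=1e-8):
    # Add a small ridge bias*I to the diagonal before inverting; this
    # shifts every eigenvalue of a symmetric C up by the bias and so
    # bounds the condition number. The bias is scaled to the size of C.
    n = C.shape[0]
    bias = k * np.trace(C) / n
    return np.linalg.inv(C + bias * np.eye(n))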
Another widely used strategy is Cholesky factorization of covariance
matrices. It is stable, and it fails precisely when the matrix is not
positive definite, i.e. when the covariance matrix is (numerically)
singular; NumPy then raises a LinAlgError exception. Cholesky is
therefore safer than LU for inverting covariance matrices, as well as
faster. If Cholesky fails, one might fall back to SVD or regularisation
to correct the problem.
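A sketch of that fallback pattern, again with names of my own choosing
(note that np.linalg.solve does not exploit the triangular structure of
the Cholesky factor; scipy.linalg.cho_solve would be faster):

import numpy as np

def invert_covariance(C):
    try:
        # Cholesky succeeds iff C is positive definite.
        L = np.linalg.cholesky(C)
    except np.linalg.LinAlgError:
        # Singular covariance: fall back to the SVD-based
        # pseudoinverse, which truncates small singular values to zero.
        return np.linalg.pinv(C)
    # C = L @ L.T, so C^-1 = solve(L.T, solve(L, I)).
    eye = np.eye(C.shape[0])
    return np.linalg.solve(L.T, np.linalg.solve(L, eye))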
Sturla