[Numpy-discussion] Efficient orthogonalisation with scipy/numpy
Charles R Harris
charlesr.harris at gmail.com
Tue Jan 19 22:13:37 EST 2010
On Tue, Jan 19, 2010 at 8:02 PM, <josef.pktd at gmail.com> wrote:
> On Tue, Jan 19, 2010 at 9:47 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> >
> >
> > On Tue, Jan 19, 2010 at 6:08 PM, <josef.pktd at gmail.com> wrote:
> >>
> >> On Tue, Jan 19, 2010 at 6:48 PM, Charles R Harris
> >> <charlesr.harris at gmail.com> wrote:
> >> >
> >> >
> >> > On Tue, Jan 19, 2010 at 2:34 PM, <josef.pktd at gmail.com> wrote:
> >> >>
> >> >> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris
> >> >> <charlesr.harris at gmail.com> wrote:
> >> >> >
> >> >> >
> >> >> > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux
> >> >> > <gael.varoquaux at normalesup.org> wrote:
> >> >> >>
> >> >> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote:
> >> >> >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0])
> >> >> >>
> >> >> >> > > with np = numpy and linalg = scipy.linalg where scipy calls
> >> >> >> > > ATLAS.
> >> >> >>
> >> >> >> > For clarification, are you trying to find the components of the
> y
> >> >> >> > vectors that are perpendicular to the space spanned by the 10
> >> >> >> > orthonormal vectors in confounds?
> >> >> >>
> >> >> >> Yes. Actually, what I am doing is calculating partial correlation
> >> >> >> between
> >> >> >> x and y conditionally to confounds, with the following code:
> >> >> >>
> >> >> >> def cond_partial_cor(y, x, confounds=[]):
> >> >> >> """ Returns the partial correlation of y and x, conditionning
> on
> >> >> >> confounds.
> >> >> >> """
> >> >> >> # First orthogonalise y and x relative to confounds
> >> >> >> if len(confounds):
> >> >> >> y = y - np.dot(confounds.T, linalg.lstsq(confounds.T,
> y)[0])
> >> >> >> x = x - np.dot(confounds.T, linalg.lstsq(confounds.T,
> x)[0])
> >> >> >> return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x))
> >> >> >>
> >> >> >> I am not sure that what I am doing is optimal.
> >> >> >>
> >> >> >> > > Most of the time is spent in linalg.lstsq. The length of the
> >> >> >> > > vectors
> >> >> >> > > is
> >> >> >> > > 810, and there are about 10 confounds.
> >> >> >>
> >> >> >> > Exactly what are the shapes? y.shape = (810, N); confounds.shape
> =
> >> >> >> > (810,
> >> >> >> > 10)?
> >> >> >>
> >> >> >> Sorry, I should have been more precise:
> >> >> >>
> >> >> >> y.shape = (810, )
> >> >> >> confounds.shape = (10, 810)
> >> >> >>
> >> >> >
> >> >> > Column stack the bunch so that the last column is y, then do a qr
> >> >> > decomposition. The last column of q is the (normalized) orthogonal
> >> >> > vector
> >> >> > and its amplitude is the last (bottom right) component of r.
> >> >>
> >> >> do you have to do qr twice, once with x and once with y in the last
> >> >> column or can this be combined?
> >> >>
> >> >> I was trying to do something similar for partial autocorrelation for
> >> >> timeseries but didn't manage or try anything better than repeated
> >> >> leastsq or a variant.
> >> >>
> >> >
> >> > Depends on what you want to do. The QR decomposition is essentially
> >> > Gram-Schmidt on the columns. So if you just want an orthonormal basis
> >> > for
> >> > the subspace spanned by a bunch of columns, the columns of Q are they.
> >> > To
> >> > get the part of y orthogonal to that subspace you can do y - Q*Q.T*y,
> >> > which
> >> > is probably what you want if the x's are fixed and the y's vary. If
> >> > there is
> >> > just one y, then putting it as the last column lets the QR algorithm
> do
> >> > that
> >> > last bit of projection.
> >>
> >> Gram-Schmidt (looking at it for the first time) looks a lot like
> >> sequential least squares projection. So, I'm trying to figure out if I
> >> can use the partial results up to a specific column as partial least
> >> squares and then work my way to the end by including/looking at more
> >> columns.
> >>
> >
> > I don't the QR factorization would work for normal PLS. IIRC, one of the
> > algorithms does a svd of the cross correlation matrix. The difference is
> > that in some sense the svd picks out the best linear combination of
> columns,
> > while the qr factorization without column pivoting just takes them in
> order.
> > The QR factorization used to be the method of choice for least squares
> > because it is straight forward to compute, no iterating needed as in svd,
> > but these days that advantage is pretty much gone. It is still a common
> > first step in the svd, however. The matrix is factored to Q*R, then the
> svd
> > of R is computed.
>
> I (finally) figured out svd and eigenvalue decomposition for this purpose.
>
> But from your description of QR, I thought specifically of the case
> where we have a "natural" ordering of the regressors, similar to the
> polynomial case of you and Anne. In the timeseries case it would be by
> increasing lags
>
> yt on y_{t-1}
> yt on y_{t-1}, y_{t-2}
> ...
> ...
> yt on y_{t-k} for k= 1,...,K
>
> or yt on xt and the lags of xt
>
> This is really sequential LS with a predefined sequence, not PLS or
> PCA/PCR or similar orthogonalization by "importance".
> The usual procedure for deciding on the appropriate number of lags
> usually loops over OLS with increasing number of regressors.
> >From the discussion, I thought there might be a way to "cheat" in this
> using QR and Gram-Schmidt
>
>
Ah, then I think your idea would work. The norms of the residuals at each
step would be along the diagonal of the R matrix. They won't necessarily
decrease monotonically, however.
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20100119/5136672c/attachment.html>
More information about the NumPy-Discussion
mailing list