# [Numpy-discussion] Efficient orthogonalisation with scipy/numpy

josef.pktd at gmail.com josef.pktd at gmail.com
Tue Jan 19 16:34:09 EST 2010

```On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux
> <gael.varoquaux at normalesup.org> wrote:
>>
>> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote:
>> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y))
>>
>> > > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS.
>>
>> > For clarification, are you trying to find the components of the y
>> > vectors that are perpendicular to the space spanned by the 10
>> > orthonormal vectors in confounds?
>>
>> Yes. Actually, what I am doing is calculating partial correlation between
>> x and y conditionally to confounds, with the following code:
>>
>> def cond_partial_cor(y, x, confounds=[]):
>>    """ Returns the partial correlation of y and x, conditionning on
>>        confounds.
>>    """
>>    # First orthogonalise y and x relative to confounds
>>    if len(confounds):
>>        y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y))
>>        x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x))
>>    return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x))
>>
>> I am not sure that what I am doing is optimal.
>>
>> > > Most of the time is spent in linalg.lstsq. The length of the vectors
>> > > is
>> > > 810, and there are about 10 confounds.
>>
>> > Exactly what are the shapes? y.shape = (810, N); confounds.shape = (810,
>> > 10)?
>>
>> Sorry, I should have been more precise:
>>
>> y.shape = (810, )
>> confounds.shape = (10, 810)
>>
>
> Column stack the bunch so that the last column is y, then do a qr
> decomposition. The last column of q is the (normalized) orthogonal vector
> and its amplitude is the last (bottom right) component of r.

do you have to do qr twice, once with x and once with y in the last
column or can this be combined?

I was trying to do something similar for partial autocorrelation for
timeseries but didn't manage or try anything better than repeated
leastsq or a variant.

>
> Chuck
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>

```