[SciPy-User] StdErr Problem with Gary Strangman's linregress function

josef.pktd at gmail.com
Sun Jan 10 20:41:29 EST 2010


On Sun, Jan 10, 2010 at 8:21 PM, Bruce Southey <bsouthey at gmail.com> wrote:

>
>
> On Sun, Jan 10, 2010 at 3:35 PM, <totalbull at mac.com> wrote:
>
>>
>> Hello, Excel and scipy.stats.linregress are disagreeing on the standard
>> error of a regression.
>>
>> I need to find the standard errors of a bunch of regressions, and would
>> rather use pure Python than RPy. So I am using scipy.stats.linregress, as
>> advised at:
>>
>> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress
>>
>> from scipy import stats
>> x = [5.05, 6.75, 3.21, 2.66]
>> y = [1.65, 26.5, -5.93, 7.96]
>> gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y)
>>
>> gradient
>> 5.3935773611970186
>>
>> intercept
>> -16.281127993087829
>>
>> r_value
>> 0.72443514211849758
>>
>> r_value**2
>> 0.52480627513624778
>>
>> std_err
>> 3.6290901222878866
>>
>>
>> The problem is that the standard error does not agree with what
>> Microsoft Excel's STEYX function returns (whereas all the other output
>> does). From Excel:
>>
>>
>> Does anybody know what's going on? Is there an alternative way of getting
>> the standard error without going to R?
>>
>>
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>
> The Excel help is rather cryptic: "Returns the standard error of the
> predicted y-value for each x in the regression. The standard error is a
> measure of the amount of error in the prediction of y for an individual x."
> But clearly this is not the same as the standard error of the 'gradient'
> (slope) returned by linregress. Without checking the formula, STEYX appears
> to return the square root of what most people call the mean square error
> (MSE).
>
> Bruce
>
>
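>>> import numpy as np
>>> gradient, intercept, r_value, p_value, std_err = stats.linregress(x, y)
>>> # residual sum of squares divided by the degrees of freedom (n - 2)
>>> residuals = np.asarray(y) - intercept - gradient * np.asarray(x)
>>> (residuals**2).sum() / (len(x) - 2)
136.80611125682617
>>> np.sqrt(_)
11.6964144615701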

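I think this should be the estimate of the standard deviation of the
noise/error term, i.e. the square root of the MSE that Bruce mentions. It
is a different quantity from the std_err returned by linregress, which is
the standard error of the slope.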
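
To make the relation between the two numbers explicit, here is a minimal
pure-Python sketch. It assumes that STEYX is indeed the square root of the
MSE, as suggested above; the function name regression_errors and its return
layout are made up for illustration and are not part of scipy.

```python
import math

def regression_errors(x, y):
    """Simple least-squares fit of y on x (hypothetical helper).

    Returns (slope, intercept, resid_std, slope_se), where resid_std is
    sqrt(SSE / (n - 2)) and slope_se is resid_std / sqrt(Sxx).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # centered sums of squares and cross products
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # residual sum of squares around the fitted line
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_std = math.sqrt(sse / (n - 2))   # estimate of the noise std. dev.
    slope_se = resid_std / math.sqrt(sxx)  # standard error of the slope
    return slope, intercept, resid_std, slope_se

x = [5.05, 6.75, 3.21, 2.66]
y = [1.65, 26.5, -5.93, 7.96]
slope, intercept, resid_std, slope_se = regression_errors(x, y)
print(resid_std)  # about 11.696, the value computed above
print(slope_se)   # about 3.629, the std_err reported by linregress
```

So the two values differ only by the factor sqrt(Sxx), the spread of the x
values around their mean.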

Josef
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.tiff
Type: image/tiff
Size: 33948 bytes
Desc: not available
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20100110/ced2e698/attachment.tiff>

