[Python-checkins] Eliminate duplicated calculations and unnecessary work for linear regression (GH-25922) (GH-25945)
rhettinger
webhook-mailer at python.org
Thu May 6 11:27:21 EDT 2021
https://github.com/python/cpython/commit/8e3cb61da9981847d5ac846f32f817c8dbfbeef3
commit: 8e3cb61da9981847d5ac846f32f817c8dbfbeef3
branch: 3.10
author: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com>
committer: rhettinger <rhettinger at users.noreply.github.com>
date: 2021-05-06T08:26:55-07:00
summary:
Eliminate duplicated calculations and unnecessary work for linear regression (GH-25922) (GH-25945)
files:
M Lib/statistics.py
diff --git a/Lib/statistics.py b/Lib/statistics.py
index edb11c868c1c8..db8c581068b7d 100644
--- a/Lib/statistics.py
+++ b/Lib/statistics.py
@@ -952,11 +952,16 @@ def linear_regression(regressor, dependent_variable, /):
         raise StatisticsError('linear regression requires that both inputs have same number of data points')
     if n < 2:
         raise StatisticsError('linear regression requires at least two data points')
+    x, y = regressor, dependent_variable
+    xbar = fsum(x) / n
+    ybar = fsum(y) / n
+    sxy = fsum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
+    s2x = fsum((xi - xbar) ** 2.0 for xi in x)
     try:
-        slope = covariance(regressor, dependent_variable) / variance(regressor)
+        slope = sxy / s2x
     except ZeroDivisionError:
         raise StatisticsError('regressor is constant')
-    intercept = fmean(dependent_variable) - slope * fmean(regressor)
+    intercept = ybar - slope * xbar
     return LinearRegression(intercept=intercept, slope=slope)
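
The change computes each mean once and reuses it. The sample covariance and sample variance both divide by n - 1, so that factor cancels in the ratio, and the slope reduces to the ratio of the two centered sums. The following is a minimal standalone sketch of that equivalence, not the code in Lib/statistics.py. The function names slope_intercept_new and slope_intercept_old are hypothetical, and the "old" path spells out the n - 1 divisors by hand as a stand-in for the covariance() and variance() calls removed by the diff.

```python
from math import fsum, isclose
from statistics import StatisticsError


def slope_intercept_new(x, y):
    """Hypothetical helper mirroring the '+' lines of the diff."""
    n = len(x)
    if len(y) != n:
        raise StatisticsError('inputs must have the same number of data points')
    if n < 2:
        raise StatisticsError('at least two data points are required')
    # Each mean is computed exactly once and reused below.
    xbar = fsum(x) / n
    ybar = fsum(y) / n
    sxy = fsum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    s2x = fsum((xi - xbar) ** 2.0 for xi in x)
    try:
        slope = sxy / s2x
    except ZeroDivisionError:
        raise StatisticsError('regressor is constant')
    return slope, ybar - slope * xbar


def slope_intercept_old(x, y):
    """Stand-in for the '-' lines: sample covariance over sample variance.

    Assumption: the explicit n - 1 divisors below imitate what the removed
    covariance() and variance() calls compute; each mean is recomputed to
    show the duplicated work.
    """
    n = len(x)
    cov = fsum((xi - fsum(x) / n) * (yi - fsum(y) / n)
               for xi, yi in zip(x, y)) / (n - 1)
    var = fsum((xi - fsum(x) / n) ** 2.0 for xi in x) / (n - 1)
    slope = cov / var
    return slope, fsum(y) / n - slope * (fsum(x) / n)


# Points on the line y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
new_slope, new_intercept = slope_intercept_new(xs, ys)
old_slope, old_intercept = slope_intercept_old(xs, ys)
```

On this data both paths give a slope of 2 and an intercept of 1. A constant regressor such as [4, 4, 4] makes s2x zero, so the new path raises StatisticsError('regressor is constant'), as in the diff.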