[New-bugs-announce] [issue44151] Improve parameter names and return value ordering for linear_regression

Raymond Hettinger report at bugs.python.org
Sun May 16 15:48:29 EDT 2021

New submission from Raymond Hettinger <raymond.hettinger at gmail.com>:

The current signature is:

    linear_regression(regressor, dependent_variable)

While the term "regressor" is used in some problem domains, it isn't well known outside of those domains.   The term "independent_variable" would be better because it is common to all domains and because it is the natural counterpart to "dependent_variable".

Another issue is that the return value is a named tuple in the form:

    LinearRegression(intercept, slope)

While that order is seen in multiple linear regression, most people first learn it in algebra as the slope/intercept form:  y = mx + b.   That will be the natural order for a majority of users, especially given that we aren't supporting multiple linear regression.

The named tuple is called LinearRegression which describes how the result was obtained rather than the result itself.  The output of any process that fits data to a line is a line.  The named tuple should be called Line because that is what it describes.  Also, a Line class would be reusable for purposes other than linear regression.

Proposed signature:

    linear_regression(independent_variable, dependent_variable) -> Line(slope, intercept)
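
To make the proposal concrete, here is a minimal sketch of what the proposed signature could look like. It is not the statistics module's implementation: it assumes an ordinary least squares fit, and the error checks and messages are illustrative only. Only the names Line, slope, intercept, independent_variable and dependent_variable come from the proposal above.

```python
from statistics import fmean
from typing import NamedTuple


class Line(NamedTuple):
    """A line in slope/intercept form: y = slope * x + intercept."""
    slope: float
    intercept: float


def linear_regression(independent_variable, dependent_variable):
    """Fit a line by ordinary least squares (illustrative sketch only)."""
    x = list(independent_variable)
    y = list(dependent_variable)
    if len(x) != len(y):
        raise ValueError("both inputs must have the same number of data points")
    if len(x) < 2:
        raise ValueError("at least two data points are required")
    xbar = fmean(x)
    ybar = fmean(y)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("independent_variable must not be constant")
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return Line(slope, intercept)


# The return order matches y = mx + b, so unpacking reads naturally.
m, b = linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

Because the result is a named tuple, callers can either unpack it positionally as above or use result.slope and result.intercept by name.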

components: Library (Lib)
messages: 393754
nosy: pablogsal, rhettinger, steven.daprano
priority: normal
severity: normal
status: open
title: Improve parameter names and return value ordering for linear_regression
type: behavior
versions: Python 3.10, Python 3.11

Python tracker <report at bugs.python.org>
