Dear scipy community,

I would like to propose adding isotonic regression to scipy [1], and I am very interested in your opinion.

What is isotonic regression?
"In monotone (or isotone) regression we fit an increasing or decreasing function to a set of points in the plane." [2]
Isotonic regression is a nonparametric curve fitting technique consisting of 2 steps:
1. Solve the isotonic regression problem, see Eq. (2) in [2] or the problem statement in [3]. Given a set of points y, it results in a set of points x of the same length that is monotonically increasing. This step can be formulated as a constrained convex optimization problem and is usually solved by the pool adjacent violators algorithm (PAVA).
2. Linearly interpolate the isotonic solution points x to get a monotonically increasing nonparametric function approximation, i.e. a fit curve.

What is it used for?
As an old method (Brunk 1955, Ayer et al. 1955, van Eeden 1958), isotonic regression has applications in many fields like operations research, signal processing, and statistics. To be more specific, it is used in dose-response curve fitting and in the calibration of ML models, in particular for probabilistic classifiers, see also [4].

Situation of the ecosystem?
While isotonic regression is available in base R (isoreg in the stats package), there are at least 2 further high quality implementations available as R packages [5] and many further packages that depend on them. The only Python implementation I am aware of is the one in scikit-learn [6]. Since it is a univariate regression technique, I personally would never have searched for it in scikit-learn, and I guess that scikit-learn would happily use an implementation from scipy if available.

Place in scipy?
This brings me back to the start and the proposal to include it in scipy. Given a positive answer, the next question is: where in scipy? I guess the options are scipy.interpolate and scipy.optimize (or, very bold, a new module scipy.curve_fitting).
- As PAVA, in the end, solves a convex optimisation problem, "pava" or a base "isotonic_regression" function could have its place in scipy.optimize.
- The functionality for curve fitting, say a class "IsotonicRegression", could be placed in either scipy.optimize or scipy.interpolate, as both have curve fitting functionality.

I am looking forward to your insights and thoughts.

All the best
Christian

References
[1] https://github.com/scipy/scipy/issues/17706
[2] de Leeuw, J., Hornik, K., & Mair, P. (2009). Isotone Optimization in R: Pool-Adjacent-Violators Algorithm (PAVA) and Active Set Methods. Journal of Statistical Software, 32(5), 1–24. https://doi.org/10.18637/jss.v032.i05
[3] https://en.wikipedia.org/wiki/Isotonic_regression
[4] Busing, F. M. T. A. (2022). Monotone Regression: A Simple and Fast O(n) PAVA Implementation. Journal of Statistical Software, Code Snippets, 102(1), 1–25. https://doi.org/10.18637/jss.v102.c01
[5] - https://cran.r-project.org/package=monotone
    - https://cran.r-project.org/package=isotone
    - https://cran.r-project.org/package=cir
[6] https://scikit-learn.org/stable/modules/generated/sklearn.isotonic.isotonic_... and https://scikit-learn.org/stable/modules/generated/sklearn.isotonic.IsotonicR...
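
For readers who want to see the two steps from the message above in concrete form, here is a minimal pure-Python sketch. It assumes the usual weighted least-squares form of the problem: minimize sum_i w_i * (x_i - y_i)^2 subject to x_1 <= x_2 <= ... <= x_n. The function names isotonic_regression and interpolate, the optional weights argument, the abscissa name t, and the choice to clip outside the data range are all assumptions made for illustration. This is not the scikit-learn implementation and not a proposed scipy API.

```python
from bisect import bisect_right


def isotonic_regression(y, weights=None):
    """Step 1 (sketch): non-decreasing least-squares fit to y via PAVA."""
    if weights is None:
        weights = [1.0] * len(y)
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, weights):
        blocks.append([float(yi), float(wi), 1])
        # Pool adjacent blocks for as long as they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            w = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / w, w, c1 + c2])
    # Expand the block means back to one value per input point.
    x = []
    for mean, _, count in blocks:
        x.extend([mean] * count)
    return x


def interpolate(t_points, x_points, t):
    """Step 2 (sketch): piecewise-linear interpolation of the fitted points.

    Assumes t_points is strictly increasing. Values outside the data
    range are clipped to the end values (an illustrative choice).
    """
    if t <= t_points[0]:
        return x_points[0]
    if t >= t_points[-1]:
        return x_points[-1]
    i = bisect_right(t_points, t)  # t_points[i - 1] <= t < t_points[i]
    t0, t1 = t_points[i - 1], t_points[i]
    x0, x1 = x_points[i - 1], x_points[i]
    return x0 + (x1 - x0) * (t - t0) / (t1 - t0)


t_points = [0, 1, 2, 3, 4, 5]   # hypothetical abscissas, already sorted
y = [1, 3, 2, 4, 3, 5]          # observations with two monotonicity violations
x = isotonic_regression(y)
print(x)                               # [1.0, 2.5, 2.5, 3.5, 3.5, 5.0]
print(interpolate(t_points, x, 0.5))   # 1.75
```

Each violating pair (3, 2) and (4, 3) is pooled into its mean, which is the core of PAVA. Because every point is appended once and every pooling step removes a block, the loop does a linear amount of work in the number of points.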
Hi there,

This is a kind reminder that any feedback on the proposal to add nonparametric univariate monotonic regression/curve fitting to scipy, i.e. to add the pool adjacent violators algorithm (PAVA), is still warmly welcome.

For information, scipy already has several parametric curve fitting capabilities:
- scipy.interpolate.make_lsq_spline
- scipy.interpolate.make_smoothing_spline
- scipy.interpolate.UnivariateSpline
- scipy.interpolate.LSQUnivariateSpline
- ...
- scipy.optimize.curve_fit

All the best
Christian
participants (1)
-
Christian Lorentzen