optimize: what should happen if objective functions return non-finite numbers?
Consider the following example which raises an AssertionError:

    import numpy as np
    from scipy.optimize import minimize

    def func1(x):
        return np.nan

    x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
    res = minimize(func1, x0, method='l-bfgs-b')
    assert res.success == False

minimize simply returns the starting values: res.x == x0. The reason I came up with this example is that unsanitised datasets sometimes contain nan or inf. Thus, if func1 was calculating chi2 and you were using minimize, then the entire fit would appear to succeed (res.success is True), but the output would be garbage. Ok, so res.message is CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL, but it's not a clear indicator that something went wrong.

A second example is:

    import numpy as np
    from scipy.optimize import curve_fit

    def func2(x, a, b, c):
        return a * np.exp(-b * x) + c

    def func3(x, a, b, c):
        return np.nan

    xdata = np.linspace(0, 4, 50)
    y = func2(xdata, 2.5, 1.3, 0.5)
    ydata = y + 0.2 * np.random.normal(size=len(xdata))

    popt, pcov = curve_fit(func3, xdata, ydata)
    print(popt)

Whilst there is a warning (OptimizeWarning: Covariance of the parameters could not be estimated), it's not a clear indicator that something has gone wrong. The behaviour one might expect in both examples could be to see a ValueError raised if there are np.nan values returned from the objective function. I'm not totally sure of what to do if +/- np.inf is returned (-inf would be a very good global minimum).

--
_____________________________________
Dr. Andrew Nelson
_____________________________________
On Tue, Jun 14, 2016 at 7:19 PM, Andrew Nelson <andyfaff@gmail.com> wrote:
Consider the following example which raises an AssertionError:
    import numpy as np
    from scipy.optimize import minimize

    def func1(x):
        return np.nan

    x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
    res = minimize(func1, x0, method='l-bfgs-b')
    assert res.success == False
minimize simply returns the starting values: res.x == x0. The reason I came up with this example is that unsanitised datasets sometimes contain nan or inf. Thus, if func1 was calculating chi2 and you were using minimize, then the entire fit would appear to succeed (res.success is True), but the output would be garbage. Ok, so res.message is CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL, but it's not a clear indicator that something went wrong. A second example is:
    import numpy as np
    from scipy.optimize import curve_fit

    def func2(x, a, b, c):
        return a * np.exp(-b * x) + c

    def func3(x, a, b, c):
        return np.nan

    xdata = np.linspace(0, 4, 50)
    y = func2(xdata, 2.5, 1.3, 0.5)
    ydata = y + 0.2 * np.random.normal(size=len(xdata))

    popt, pcov = curve_fit(func3, xdata, ydata)
    print(popt)
Whilst there is a warning (OptimizeWarning: Covariance of the parameters could not be estimated), it's not a clear indicator that something has gone wrong. The behaviour one might expect in both examples could be to see a ValueError raised if there are np.nan values returned from the objective function. I'm not totally sure of what to do if +/- np.inf is returned (-inf would be a very good global minimum).
In my opinion optimizers should not (never?) raise exceptions; they should warn and return whatever is available so the user can investigate. I'm seeing nans every once in a while, but even if the objective function returns nan, we often have finite parameters that can be used to investigate, for example, gradients and similar. In statsmodels we just got a bug report for NegativeBinomial/Poisson, similar to the exp example, that had nans because of overflow. I was surprised that converged=True showed up in that case (but disp and our summary show the nans).

About nan in the objective function: a few years ago I played with several examples where I put a segment in the parameter space where the objective function returned nan. Several of the optimizers managed to avoid that region, AFAIR.

In the case of optimizers, the user can always put additional checks into the objective function, and raise there if desired. I tried to convert nans to some proper values, but, AFAIR, the behaviour of different optimizers varies widely and I didn't find a solution that would work in general.

Aside: bfgs was recently changed so it should have fewer problems with extreme step sizes as in the exp example. I still haven't tried out the trust-region Newton methods that were added a while ago with the statsmodels optimization problems.
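To illustrate that last point, a minimal sketch of such a user-side check (the wrapper name is made up, not anything in scipy):

    import numpy as np
    from scipy.optimize import minimize

    def raise_on_nonfinite(func):
        # Wrap an objective so that a non-finite return value raises
        # immediately instead of being fed back to the optimizer.
        def wrapped(x, *args):
            value = func(x, *args)
            if not np.all(np.isfinite(value)):
                raise ValueError("objective returned a non-finite value")
            return value
        return wrapped

    def chi2(x):
        return np.nan  # stand-in for an objective fed unsanitised data

    x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
    minimize(raise_on_nonfinite(chi2), x0, method='l-bfgs-b')
    # raises ValueError at the first evaluation, instead of "succeeding"

Josef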
On Tue, Jun 14, 2016 at 4:19 PM, Andrew Nelson <andyfaff@gmail.com> wrote:
The behaviour one might expect in both examples could be to see a ValueError raised if there are np.nan values returned from the objective function. I'm not totally sure of what to do if +/- np.inf is returned (-inf would be a very good global minimum).
It's not uncommon to get pathological objective function values when you optimize complex models (e.g., ones that integrate an ODE). I agree that the sane default behavior is to raise a ValueError when NaN is encountered, but sometimes it's better to treat such data points as invalid. The exact strategy would depend on the optimizer, but I think +np.inf is a perfectly reasonable sentinel value to use to indicate that a function value is "bad" in a generic way. It would be nice if the scipy optimizers handled this consistently.
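A toy sketch of the sentinel approach (the objective here is artificial; a real case would be, e.g., an ODE integration failing):

    import numpy as np
    from scipy.optimize import minimize

    def objective(x):
        # Outside the model's domain, return +inf as a generic
        # "bad value" sentinel instead of letting nan propagate.
        if x[0] <= 0:
            return np.inf
        return (np.log(x[0]) - 1.0) ** 2

    res = minimize(objective, np.array([2.0]), method='nelder-mead')
    print(res.success, res.x)  # converges to x ~ e; nan never appears

Nelder-Mead is comparison-based, so it simply treats +inf points as very bad and moves away from them; gradient-based methods would need more care.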
Wed, 15 Jun 2016 09:19:45 +1000, Andrew Nelson wrote:
Consider the following example which raises an AssertionError:
    import numpy as np
    from scipy.optimize import minimize

    def func1(x):
        return np.nan

    x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
    res = minimize(func1, x0, method='l-bfgs-b')
    assert res.success == False
The design decision so far has been not to communicate this type of failure with exceptions. Rather, the optimizer should assume the point is out of the domain of the objective function. Whether the optimizer can continue depends on the method. If it cannot, it should return with success=False and an appropriate error message set.
minimize simply returns the starting values: res.x == x0. The reason I came up with this example is that unsanitised datasets sometimes contain nan or inf. Thus, if func1 was calculating chi2 and you were using minimize, then the entire fit would appear to succeed (res.success is True), but the output would be garbage.
Why would res.success be true? If it is true when it is clear that a local minimum has not been reached, that is a bug.
    import numpy as np
    from scipy.optimize import curve_fit

    def func2(x, a, b, c):
        return a * np.exp(-b * x) + c

    def func3(x, a, b, c):
        return np.nan

    xdata = np.linspace(0, 4, 50)
    y = func2(xdata, 2.5, 1.3, 0.5)
    ydata = y + 0.2 * np.random.normal(size=len(xdata))

    popt, pcov = curve_fit(func3, xdata, ydata)
    print(popt)
Whilst there is a warning (OptimizeWarning: Covariance of the parameters could not be estimated), it's not a clear indicator that something has gone wrong. The behaviour one might expect in both examples could be to see a ValueError raised if there are np.nan values returned from the objective function. I'm not totally sure of what to do if +/- np.inf is returned (-inf would be a very good global minimum).
I think this is a different issue --- the only way curve_fit communicates fit failures in general is via setting the estimated covariances to infinity. This is not so convenient, but that's how it works currently.

--
Pauli Virtanen
Wed, 15 Jun 2016 18:53:22 +0000, Pauli Virtanen wrote: [clip]
I think this is a different issue --- the only way curve_fit communicates fit failures in general is via setting the estimated covariances to infinity. This is not so convenient, but that's how it works currently.
Sorry, this was wrong --- it's supposed to raise a RuntimeError; in other words, it works differently from minimize(). However, nan values confuse leastsq() into thinking it converged --- this can and should be fixed.
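A minimal sketch of the leastsq confusion (the exact ier and message will depend on the scipy version; the point is that no error is raised):

    import numpy as np
    from scipy.optimize import leastsq

    def residuals(p):
        return np.full(10, np.nan)   # every residual is nan

    p, cov, info, mesg, ier = leastsq(residuals, np.array([1.0, 2.0]),
                                      full_output=True)
    print(ier, mesg)  # a successful-looking status despite the nans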
I think this is a different issue --- the only way curve_fit communicates fit failures in general is via setting the estimated covariances to infinity. This is not so convenient, but that's how it works currently.
Sorry, this was wrong --- it's supposed to raise a RuntimeError; in other words, it works differently from minimize().
However, nan values confuse leastsq() into thinking it converged --- this can and should be fixed.
To clarify here - is the expected behaviour that curve_fit and leastsq raise a RuntimeError if the user function returns NaN? What about np.inf, -np.inf? Note: least_squares raises a ValueError if the "Residuals are not finite in the initial point".
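For reference, a minimal reproduction of that least_squares behaviour:

    import numpy as np
    from scipy.optimize import least_squares

    def residuals(p):
        return np.full(10, np.nan)

    least_squares(residuals, x0=np.array([1.0, 2.0]))
    # ValueError: Residuals are not finite in the initial point.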
Thu, 16 Jun 2016 13:36:39 +1000, Andrew Nelson wrote:
To clarify here - is the expected behaviour that curve_fit and leastsq raise a RuntimeError if the user function returns NaN? What about np.inf, -np.inf? Note: least_squares raises a ValueError if the "Residuals are not finite in the initial point".
Returning NaN in some part of the parameter space does not necessarily mean that the algorithm cannot find a local minimum somewhere else. This is why raising an error immediately if NaN is returned is not necessarily correct. The same for infinities.

However, returning NaN may mean that the algorithm cannot continue. This can immediately lead to a convergence failure (the algorithm doesn't know how to continue), but it should be reported in the same way as the algorithm reports all other types of convergence failure.

Specifically, you should look at how the different functions handle other error conditions. This is rather varying due to historical reasons:

* curve_fit raises a RuntimeError
* minimize() returns with success==False
* leastsq() raises warnings or errors if full_output==False, or returns with an error code set if full_output==True
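To make the contrast concrete, the caller-side checks look roughly like this (a sketch; the toy problems here succeed, it's the handling pattern that differs):

    import numpy as np
    from scipy.optimize import minimize, curve_fit

    # minimize(): inspect the returned result object
    res = minimize(lambda x: (x[0] - 1.0) ** 2, np.array([0.0]))
    if not res.success:
        print("minimize failed:", res.message)

    # curve_fit: catch the exception
    def model(x, a):
        return a * x

    try:
        popt, pcov = curve_fit(model, np.arange(5.0), 2.0 * np.arange(5.0))
    except RuntimeError as err:
        print("curve_fit failed:", err)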
On 16 June 2016 at 04:53, Pauli Virtanen <pav@iki.fi> wrote:
Wed, 15 Jun 2016 09:19:45 +1000, Andrew Nelson wrote:
Consider the following example which raises an AssertionError:
    import numpy as np
    from scipy.optimize import minimize

    def func1(x):
        return np.nan

    x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
    res = minimize(func1, x0, method='l-bfgs-b')
    assert res.success == False
<SNIP>
Whether the optimizer can continue depends on the method. If it cannot, it should return with success=False and an appropriate error message set.
minimize simply returns the starting values: res.x == x0. The reason I came up with this example is that unsanitised datasets sometimes contain nan or inf. Thus, if func1 was calculating chi2 and you were using minimize, then the entire fit would appear to succeed (res.success is True), but the output would be garbage.
Why would res.success be true? If it is true when it is clear that a local minimum has not been reached, that is a bug.
In the example above res.success is True. On further inspection both 'L-BFGS-B' and 'CG' give res.success is True; the rest give res.success is False. It's interesting that when I try this with 'COBYLA', the minimize function never returns.
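A loop along the following lines reproduces the check (a sketch; COBYLA is left out because, as above, it never returns):

    import numpy as np
    from scipy.optimize import minimize

    def func1(x):
        return np.nan

    x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
    for method in ['nelder-mead', 'powell', 'cg', 'bfgs',
                   'l-bfgs-b', 'tnc', 'slsqp']:
        res = minimize(func1, x0, method=method)
        print(method, res.success)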
participants (4)

- Andrew Nelson
- josef.pktd@gmail.com
- Pauli Virtanen
- Stephan Hoyer