minimize reporting successful but not searching (with default method)
I have a situation where scipy.optimize.minimize, using the default method, returns the initial starting value as the answer without really iterating, and still reports a "successful" status. It works fine if I specify Nelder-Mead, but my question is: why does minimize report success when it hasn't succeeded, and what can I do (if anything) to flag this behavior?

I originally posted in scipy-user but decided it may be more appropriate here. The question is also at http://stackoverflow.com/q/36110998, and if someone can leave an answer there as well, that would be great, but it's not necessary.

One particular: I'm using calls to Skyfield (http://rhodesmill.org/skyfield/), which may or may not have something to do with it.

FULL OUTPUT using DEFAULT METHOD:

      status: 0
     success: True
        njev: 1
        nfev: 3
    hess_inv: array([[1]])
         fun: 1694.98753895812
           x: array([ 10000.])
     message: 'Optimization terminated successfully.'
         jac: array([ 0.])
         nit: 0

FULL OUTPUT using Nelder-Mead METHOD:

      status: 0
        nfev: 63
     success: True
         fun: 3.2179306044608054
           x: array([ 13053.81011963])
     message: 'Optimization terminated successfully.'
         nit: 28
Mon, 21 Mar 2016 23:54:13 +0800, David Mikolas wrote:
> It works fine if I specify Nelder-Mead, but my question is "why does minimize return successful when it isn't"
Your function is piecewise constant ("staircase" pattern on a small scale). The derivative at the initial point is zero. The optimization method is a local optimizer. Therefore: the initial point is a local optimum (in the sense understood by the optimizer, i.e. satisfying the KKT conditions).

***

As your function is not differentiable, the KKT conditions don't mean much.

Note that the fact that it works with Nelder-Mead is sort of a coincidence --- it works because the "staircase" pattern of the function is on a scale smaller than the initial simplex size.

On the other hand, it fails for BFGS (the default method for an unconstrained problem), because the default step size used for numerical differentiation happens to be smaller than the width of the "staircase" steps.
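The effect described above can be reproduced without Skyfield. What follows is only a minimal sketch under stated assumptions: `staircase`, `QUANTUM` and `TARGET` are hypothetical stand-ins (a quadratic evaluated on a grid about 0.001 wide), not the function from the original question.

```python
import numpy as np
from scipy.optimize import minimize

QUANTUM = 2.0 ** -10   # hypothetical stair width (about 0.001)
TARGET = 13053.81      # hypothetical location of the true minimum


def staircase(x):
    """Quadratic bowl evaluated on a grid, so it is piecewise constant in x."""
    xq = np.round(np.atleast_1d(x)[0] / QUANTUM) * QUANTUM
    return (xq - TARGET) ** 2


x0 = np.array([10000.0])

# Default finite-difference step is far smaller than QUANTUM, so the
# estimated gradient at x0 is exactly zero and BFGS stops immediately.
res_bfgs = minimize(staircase, x0, method="BFGS")

# The initial Nelder-Mead simplex is much wider than QUANTUM, so it
# sees the underlying bowl and walks toward TARGET.
res_nm = minimize(staircase, x0, method="Nelder-Mead")

print("BFGS:       ", res_bfgs.nit, res_bfgs.x, res_bfgs.success)
print("Nelder-Mead:", res_nm.nit, res_nm.x, res_nm.success)
```

In this toy case BFGS reports success with zero iterations, which matches the symptom in the original post.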
> and what can I do (if anything) to flag this behavior?
Know whether your function is differentiable or not, and which optimization methods are expected to work.

There's no cheap general way of detecting --- based on the numerical floating-point values output by a function --- whether a function is everywhere differentiable or continuous. This information must come from the author of the function, and optimization methods can only assume it.

You can, however, make sanity checks, e.g. using different initial values to see if you're stuck at a local minimum.

If the function is not differentiable, optimizers that use derivatives, even numerically approximated ones, can be unreliable.

If the non-differentiability is on a small scale, you can cheat and choose a numerical differentiation step size large enough --- and hope for the best.

--
Pauli Virtanen
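Two of the suggestions above can be sketched in code: a cheap check that the optimizer never left its starting point, and a larger finite-difference step passed through the `eps` option of BFGS. This is an illustration only. `stalled_at_start` is a hypothetical helper, not part of SciPy, and `staircase`, `QUANTUM` and `TARGET` are made-up stand-ins for the real objective.

```python
import numpy as np
from scipy.optimize import minimize

QUANTUM = 2.0 ** -10   # hypothetical stair width (about 0.001)
TARGET = 13053.81      # hypothetical location of the true minimum


def staircase(x):
    """Quadratic bowl evaluated on a grid, so it is piecewise constant in x."""
    xq = np.round(np.atleast_1d(x)[0] / QUANTUM) * QUANTUM
    return (xq - TARGET) ** 2


def stalled_at_start(res, x0):
    """Heuristic flag: the optimizer stopped without taking a single step."""
    return res.nit == 0 and np.array_equal(res.x, x0)


x0 = np.array([10000.0])

# Default differentiation step: smaller than one stair, gradient looks like zero.
res_default = minimize(staircase, x0, method="BFGS")

# "Cheat": differentiate over 64 stairs so the slope of the bowl is visible.
res_coarse = minimize(staircase, x0, method="BFGS",
                      options={"eps": 64 * QUANTUM})

print(stalled_at_start(res_default, x0))
print(stalled_at_start(res_coarse, x0), res_coarse.x, res_coarse.message)
```

The flag is a heuristic, since a run that starts exactly at a true minimum would trip it too. With the coarse step the solver may still end with a precision-loss message rather than success, because the function stays non-differentiable near the minimum, so the result needs checking either way.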
If you have a rough idea of lower/upper bounds for the parameters, then you could use differential_evolution; it uses a stochastic rather than a gradient-based approach. It will require many more function evaluations, though.
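A minimal sketch of that suggestion, again on a made-up staircase objective rather than the real Skyfield function. The bounds and the names `staircase`, `QUANTUM` and `TARGET` are assumptions chosen for illustration. `polish=False` skips the final gradient-based refinement, which would add little on a piecewise-constant function.

```python
import numpy as np
from scipy.optimize import differential_evolution

QUANTUM = 2.0 ** -10   # hypothetical stair width (about 0.001)
TARGET = 13053.81      # hypothetical location of the true minimum


def staircase(x):
    """Quadratic bowl evaluated on a grid, so it is piecewise constant in x."""
    xq = np.round(np.atleast_1d(x)[0] / QUANTUM) * QUANTUM
    return (xq - TARGET) ** 2


# Rough lower/upper bounds on the single parameter (assumed values).
bounds = [(10000.0, 15000.0)]

res_de = differential_evolution(staircase, bounds, polish=False)

# nfev shows the cost: typically far more evaluations than Nelder-Mead.
print(res_de.x, res_de.fun, res_de.nfev)
```

The search is stochastic, so the evaluation count and the last digits of the result vary from run to run.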
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org https://mail.scipy.org/mailman/listinfo/scipy-dev
--
Dr. Andrew Nelson
Andrew,

Differential evolution is new to me and therefore by definition interesting, thanks! In this particular case I definitely want the local minimum (this eclipse), and calls are expensive: there's a millisecond delay for each new instance of Julian Date arrays (http://stackoverflow.com/q/35358401). But if I access the database directly, then this could be a very interesting way to search for new events!
Pauli, thank you for taking the time to explain clearly and thoroughly. I've updated the question on Stack Overflow by removing words like "wrong" and "fail", and I've added a supplementary answer with a plot of the staircase behavior of the JulianDate method:

http://stackoverflow.com/a/36144582
http://i.stack.imgur.com/CUuia.png

The step width is 40 microseconds on a span of thousands of years.
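The 40 microsecond figure is what one would expect if the staircase comes from storing a Julian date, in days, as a single 64-bit float. The thread does not state that this is the cause, so the following is only a sketch under that assumption, and the date value is an illustrative one.

```python
import numpy as np

# Illustrative Julian date in March 2016 (assumed value, not from the thread).
jd = 2457470.0

# Distance from jd to the next representable float64, in days and in seconds.
step_days = np.spacing(jd)
step_seconds = step_days * 86400.0

print(step_seconds * 1e6, "microseconds per representable step")

# float64 spacing is constant between consecutive powers of two, so the same
# step applies to every Julian date from 2**21 up to (not including) 2**22.
same_step = np.spacing(2.0 ** 21) == step_days
span_years = (2.0 ** 22 - 2.0 ** 21) / 365.25
print(same_step, round(span_years), "years with the same step")
```

Under this assumption the step is 2**-31 days, about 40.2 microseconds, and it is the same across a span of 2**21 days, which is roughly 5,700 years.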
participants (3): Andrew Nelson, David Mikolas, Pauli Virtanen