
Hi Andrea,

> Those benchmarks also relied on the global optimum as a stopping criterion plus a maximum number of objective function evaluations (2,000), whichever is reached first. Of course, reaching the maximum number of function evaluations without getting to the global optimum (plus a pre-specified tolerance) means a failure in the context of my original benchmarks.

These criteria can be set with the following arguments:

```
from scipy.optimize import shgo
options = {'maxfev': 2000, 'f_min': f_min, 'f_tol': f_tol}
result = shgo(obj_fun, bounds, n=30, sampling_method='sobol', options=options)
```

The algorithm should iterate (one iteration is: check stopping criteria --> sample `n` points --> triangulate --> find minimisers) until a global minimum within the specified tolerance is found or the budget of function evaluations runs out. A lower number of sampling points per iteration tends to show higher performance, but I know there is a strange bug where the algorithm keeps running within the same attractor even though it is supposed to be added to the triangulation, so a higher `n` will work better on some test suites until this is fixed. Most people tend to use the non-iterative version of the algorithm, which is the most stable, so there might be a few bugs when running it iteratively in general.

> I also have a slightly different approach to bounds/global optima locations so that algorithms that rely on guessing global optima by running to the center of the domain (or on the bounds) will have a less easy life this time.

I would also recommend looking into GKLS generators, which, to my understanding, are standard practice for avoiding this kind of bias when benchmarking GO functions ( https://dl.acm.org/doi/10.1145/962437.962444 ).

> Bounds-shifting is what I used to do but you have to be careful as some of the benchmark functions can be undefined outside those bounds (i.e., returning NaNs) or they can have lower global optima outside the bounds.

At least for SHGO, NaN values (and other non-floating-point objects) should not be a problem; it was partially developed to deal with discontinuities in the objective function. Returning a lower value might be an issue, though. SHGO is not supposed to be able to escape the bounds, but if the bounds fed to it extend beyond the actual bounds, I think a strong penalty function would need to be added to the objective function, or the algorithm will terminate early at the lower optimum (see the sketch below). Another reason why I would recommend adding randomness to the objective functions (using something like GKLS) rather than to the hyperparameters/bounds is that the sampling sequences might lose their low-discrepancy properties with different seeds, but I am not an expert on how adjusting those sequences affects performance.

> I didn’t know that the sampling process of SHGO relied on random numbers.

The Sobol sequence is technically quasi-random, but a true random number generator can also be used by passing a random sampling function to the `sampling_method` argument; the main difference is that most RNGs are biased towards the centre of the hypercube. The default sampling behaviour, on the other hand, relies on sub-triangulations of the hyperrectangle (similar to DIRECT), but the performance is almost identical to the Sobol sequence. Sub-triangulations are biased towards the boundaries or the centres depending on whether the vertices or the centroids are used in the actual sampling, so I would not recommend it for your benchmarks.
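A rough sketch of the kind of penalty wrapper I mean; the penalty constant, the distance scaling and the helper name are purely illustrative, not something SHGO provides:

```
import numpy as np

PENALTY = 1e10  # illustrative constant, anything far above the benchmark optima would do

def penalised(obj_fun, true_bounds):
    """Wrap obj_fun so points outside the true benchmark bounds are heavily penalised."""
    lower, upper = np.array(true_bounds, dtype=float).T
    def wrapped(x):
        x = np.asarray(x, dtype=float)
        overshoot = np.maximum(lower - x, 0.0) + np.maximum(x - upper, 0.0)
        if overshoot.any():
            # distance-scaled penalty keeps the wrapped surface informative
            return PENALTY * (1.0 + overshoot.sum())
        return obj_fun(x)
    return wrapped

# e.g. result = shgo(penalised(obj_fun, true_bounds), shifted_bounds, n=100, sampling_method='sobol')
```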
Best regards,
Stefan Endres

On Sun, Jul 26, 2020 at 5:49 PM Andrea Gavana <andrea.gavana@gmail.com> wrote:

Hi Stefan,
On Sun, 26 Jul 2020 at 17.19, Stefan Endres <stefan.c.endres@gmail.com> wrote:
Dear Andrea,
SHGO does not use an initial starting point, only the bounds (which may also be specified as None or infinite). The benchmarks that I ran for the publication used the global minimum as a stopping criterion (together with performance profiles that demonstrate the final results). For this particular benchmarking framework I would propose simply using a single iteration ((dim)^2 + 1 points) or specifying 100 starting points.
A script to use 100 sampling points in a single iteration with the sobol sampling method:
```
from scipy.optimize import shgo
result = shgo(obj_fun, bounds, n=100, sampling_method='sobol')
```
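For the single-iteration alternative mentioned above ((dim)^2 + 1 points), the call would presumably look something like this, with `dim` simply standing for the problem dimensionality:

```
dim = len(bounds)  # problem dimensionality
result = shgo(obj_fun, bounds, n=dim**2 + 1, iters=1, sampling_method='sobol')
```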
If you would like to add a more stochastic element to this, I think the best approach would be to use a different seed for the sampling method (in my experience this does not make much of a difference to the performance in low-dimensional problems); otherwise, run shgo only once and/or with increasing numbers of iterations. Another possibility is to add a stochastic element to the bounds, as in the sketch below.
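Something along these lines is what I have in mind for the bounds; the 5% shift fraction and the helper name are arbitrary choices, and `obj_fun`/`bounds` are as in the script above:

```
import numpy as np
from scipy.optimize import shgo

def perturbed_bounds(bounds, rng, frac=0.05):
    """Shift each (lower, upper) pair by a random fraction of its width."""
    shifted = []
    for lo, hi in bounds:
        shift = rng.uniform(-frac, frac) * (hi - lo)
        shifted.append((lo + shift, hi + shift))
    return shifted

for seed in range(100):
    rng = np.random.RandomState(seed)          # a different seed per run
    result = shgo(obj_fun, perturbed_bounds(bounds, rng),
                  n=100, sampling_method='sobol')
```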
Please let me know if you need any help.
Thank you for your answer. The approach I had several years ago - and that I’d like to keep - was to generate 100 random starting points for each benchmark and run all the global optimizers from each of those points: see http://infinity77.net/global_optimization/
Those benchmarks also relied on the global optimum as a stopping criterion plus a maximum number of objective function evaluations (2,000), whichever is reached first. Of course, reaching the maximum number of function evaluations without getting to the global optimum (plus a pre-specified tolerance) means a failure in the context of my original benchmarks.
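Roughly, the bookkeeping looked something like the sketch below; the names (`run_optimizer`, the `benchmark` attributes) and the tolerance value are just for illustration, not the actual benchmark code:

```
import numpy as np

MAX_FEV = 2000   # maximum objective function evaluations per run
F_TOL = 1e-6     # illustrative tolerance on the global optimum
N_STARTS = 100   # random starting points per benchmark

def success_rate(run_optimizer, benchmark, seed=0):
    """Fraction of runs that reach the global optimum within the evaluation budget."""
    rng = np.random.RandomState(seed)
    lower, upper = np.array(benchmark.bounds, dtype=float).T
    successes = 0
    for _ in range(N_STARTS):
        x0 = rng.uniform(lower, upper)  # random starting point inside the bounds
        f_best, nfev = run_optimizer(benchmark.fun, x0, benchmark.bounds, MAX_FEV)
        if nfev <= MAX_FEV and abs(f_best - benchmark.f_glob) <= F_TOL:
            successes += 1
    return successes / N_STARTS
```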
I have now a few more benchmark functions plus a couple of new algorithms and I’d like to take the same steps. I also have a slightly different approach to bounds/global optima locations, so that algorithms that rely on guessing global optima by running to the center of the domain (or on the bounds) will have a less easy life this time. Bounds-shifting is what I used to do, but you have to be careful as some of the benchmark functions can be undefined outside those bounds (i.e., returning NaNs) or they can have lower global optima outside the bounds. Shrinking the bounds is of course always a possibility, but it makes life easier for the algorithms and it will fail as a strategy if a benchmark has a global optimum exactly at (one or more of) the original bounds.
That said, I didn’t know that the sampling process of SHGO relied on random numbers: that is good to know. As an alternative I can do as you suggested and vary the seed 100 times - one of the new algorithms I have also does not use an initial point, so it was already my strategy to change the seed for that one. I can simply do the same for SHGO.
I’m still running with an old Python/NumPy/SciPy combination (for legacy reasons), so I’ll have to see whether differential_evolution and dual_annealing can simply be copied over locally and run - I tested SHGO and it runs with no problem.
Andrea.
Best regards, Stefan Endres
On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana <andrea.gavana@gmail.com> wrote:
Dear SciPy developers & users,
I have a couple of new derivative-free, global optimization algorithms I’ve been working on lately - plus some improvements to AMPGO and a few more benchmark functions - and I’d like to rerun the benchmarks as I did back in 2013 (!!!).
In doing so, I’d like to remove some of the least interesting/worst performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace them with the ones currently available in SciPy - differential_evolution, SHGO and dual_annealing.
Everything seems good and dandy, but it appears to me that SHGO does not accept an initial point for the optimization process - which makes the whole “run the optimization from 100 different starting points for each benchmark” a bit moot.
I am no expert on SHGO, so maybe there is an alternative way to “simulate” the changing of the starting point for the optimization? Or maybe some other approach to make it consistent across optimizers?
Any suggestion is more than welcome.
Andrea.
--
Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
Research associate (Wissenschaftlicher Mitarbeiter): Leibniz Institute for Materials Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
Work phone (DE): +49 (0) 421 218 51238
Cellphone (DE): +49 (0) 160 949 86417
Cellphone (ZA): +27 (0) 82 972 42 89
E-mail (work): s.endres@iwt.uni-bremen.de
Website: https://stefan-endres.github.io/