Added Shuffled Complex Evolution (SCE) to SciPy.optimize
Dear all,

Andrea Gavana reported results for global optimisation algorithms in 2013 (http://infinity77.net/global_optimization/) and in 2021 (http://infinity77.net/go_2021/). I had provided him with a version of the Shuffled Complex Evolution (SCE) optimiser (Duan et al., Water Resour Res 1992, doi: 10.1029/91WR02985). It performed exceptionally well in the 2013 exercise and there was interest on the scipy-dev mailing list. While SCE did not score as highly in the 2021 exercise, it is still our first algorithm of choice for our hydrologic and land-surface models (written in Fortran and C). We routinely optimise more than 50 parameters (e.g. Cuntz et al., Water Resour Res 2015, doi: 10.1002/2015WR016907).

So I implemented it in scipy.optimize, following the implementation of differential evolution very closely: https://github.com/scipy/scipy/pull/18436

However, the code has been developed since 2013, so it might have some “historic” code in it. But at least lots of bugs have been removed since then ;-)

I ran the global benchmark suite of scipy (results are in the PR). SCE performs reasonably well, comparable with differential evolution but needing only about one tenth of the model runs. Dual Annealing and SHGO had higher success rates, but DA also needed about 10 times more model runs. SHGO had high success rates and needed few model evaluations. I will try this on our environmental models next.

The implementation of SCE has some extras that we really like:

- It can write out restart files. When optimising expensive models on computing clusters, it happens that a job uses up its allocated time before the optimisation has finished. One can then simply launch the job again, setting `restart=True` in SCE.

- Parameters can be sampled not only from a uniform distribution but also logarithmically. Sampling a parameter uniformly gives suboptimal solutions if the parameter can vary over orders of magnitude. Imagine bounds of [1e-9, 1e-3]: all values below 1e-4 are very unlikely if sampled uniformly. It is much better to sample such a parameter in log-space (e.g. Mai, J Hydrol 2023, doi: 10.1016/j.jhydrol.2023.129414).

- Parameters can be masked. We have to write config files for our environmental models (e.g. Fortran namelists), which of course need all parameters, not only the parameters to optimise. One might already have identified non-influential (insensitive) parameters and just want to mask them from the optimisation, or one might want to fix a parameter that is correlated with another parameter.

- args and kwargs can be passed to the objective function.

Thanks in advance for your feedback,
Matthias
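To illustrate the point about log-space sampling (this is just a small numpy sketch, not code from the PR): with bounds [1e-9, 1e-3], a uniform draw lands below 1e-4 only about 10% of the time, while a log-uniform draw covers each of the six decades equally, landing below 1e-4 about 5/6 of the time.

```python
import numpy as np

rng = np.random.default_rng(42)
lo, hi = 1e-9, 1e-3

# Uniform sampling: almost all draws land in the top decade of the range.
u = rng.uniform(lo, hi, size=100_000)
frac_below = np.mean(u < 1e-4)  # roughly 0.1

# Log-uniform sampling: each decade [1e-9,1e-8], ..., [1e-4,1e-3]
# is equally likely, so 5 of the 6 decades lie below 1e-4.
lg = 10.0 ** rng.uniform(np.log10(lo), np.log10(hi), size=100_000)
frac_below_log = np.mean(lg < 1e-4)  # roughly 5/6
```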
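The masking idea can be sketched generically, independent of the PR's actual interface (the helper name `masked_objective` and the wrapper below are my own illustration, not the SCE implementation): the optimiser only sees the free parameters, while masked entries keep their fixed values in the full vector that the model receives.

```python
import numpy as np

def masked_objective(func, x_full, mask):
    """Return an objective over the free (unmasked) parameters only.

    func   : objective taking the full parameter vector
    x_full : full parameter vector; masked-out entries stay fixed
    mask   : boolean array, True where the parameter is optimised
    """
    x_full = np.asarray(x_full, dtype=float)
    mask = np.asarray(mask, dtype=bool)

    def wrapped(x_free, *args, **kwargs):
        x = x_full.copy()
        x[mask] = x_free  # insert free parameters into the full vector
        return func(x, *args, **kwargs)

    return wrapped

# Example: optimise only the first and third of four parameters.
def sphere(x):
    return float(np.sum(x**2))

x0 = np.array([1.0, 2.0, 3.0, 4.0])         # fixed values for masked entries
mask = np.array([True, False, True, False])  # optimise parameters 0 and 2
f = masked_objective(sphere, x0, mask)
# f([0.0, 0.0]) evaluates sphere([0.0, 2.0, 0.0, 4.0]) = 20.0
```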
participants (1)
Matthias Cuntz