On Wed, Aug 16, 2023, 22:23 Karen Róbertsdóttir <karen.robertsdottir@gmail.com> wrote:

Perfectly fine!  But, question:

> The strategy_func would be
> responsible for mutating (blending) members of the population together,
> doing the crossover/recombination itself, and returning a trial vector with
> shape (N,).

So it wouldn't be told what other candidate to perform recombination with - it should pick recombination targets itself?  I mean, that's workable, just being clear on this.


The strategy function would be solely responsible for creating a trial vector. It could do anything it wanted, so long as it returned a trial vector with the same shape as the problem description. Whether the strategy function is sensible would be solely at the discretion of the user. The fitness of the trial vector is determined outside the strategy function.
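To illustrate that contract, here's a sketch of a classic DE/rand/1/bin strategy written as a user function. The signature (`candidate`, `population`, `rng`) and the mutation/crossover constants are assumptions for the sake of the example, not a fixed API; the only hard requirement discussed above is that it returns a trial vector of shape (N,).

```python
import numpy as np

def rand1bin_strategy(candidate, population, rng):
    """Build one trial vector for member `candidate`.

    `population` has shape (S, N); the return value has shape (N,).
    Clipping the trial back into the bounds is deliberately left out.
    """
    S, N = population.shape
    # pick three distinct members, none of them the current candidate
    choices = rng.choice([i for i in range(S) if i != candidate],
                         size=3, replace=False)
    a, b, c = population[choices]
    mutant = a + 0.8 * (b - c)           # DE/rand/1 mutation, F = 0.8
    # binomial crossover against the current candidate, CR = 0.7
    cross = rng.uniform(size=N) < 0.7
    cross[rng.integers(N)] = True        # ensure at least one mutant gene
    return np.where(cross, mutant, population[candidate])
```

The solver would then evaluate the fitness of the returned trial vector itself, outside this function.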
 
I can't speak for others, but I'm fine with receiving it in the [0, 1] range and then scaling it myself, to avoid the need for the stock functions to take that slight overhead hit.  But whatever your preference is works for me.

On further reflection it makes sense to supply the population in the [bounds.lb, bounds.ub] range. It would be a copy of the population, so the original array isn't overwritten by the user.


(As a side note, SciPy's inability to save and resume the population during `differential_evolution` optimization used to be really annoying, given how long CFD optimization tasks take. However, I did find a cheap hack that I've been using ever since: since the random number generator is deterministic, I simply have the minimization function create a hash value for the candidate and store the results of the simulation in a hash table, which I save to disk. Then when I need to resume, I just load up the hash table, and if a candidate has been encountered before, it immediately returns the previous run's simulation results rather than re-running the simulation. It's an awkward hack, and it wouldn't work on tasks where the minimization function is really fast, but for slow tasks like CFD it works :) )
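A minimal sketch of that caching scheme (the file path, function names, and use of exact-bytes keys instead of an explicit hash are my own assumptions): cache each candidate's result in a dict keyed on the candidate's bytes, persist the dict to disk, and short-circuit any evaluation that has been seen before. Because the RNG is deterministic, a resumed run with the same seed revisits the same candidates and hits the cache instead of re-running the simulation.

```python
import pickle
from pathlib import Path

import numpy as np

CACHE_PATH = Path("objective_cache.pkl")   # hypothetical location

def load_cache():
    """Load previously saved results, if any."""
    if CACHE_PATH.exists():
        with CACHE_PATH.open("rb") as f:
            return pickle.load(f)
    return {}

cache = load_cache()

def cached_objective(x, expensive_func):
    """Return a cached result if this exact candidate was seen before."""
    key = x.tobytes()                      # exact bytes of the candidate
    if key not in cache:
        cache[key] = expensive_func(x)     # e.g. a long CFD run
        with CACHE_PATH.open("wb") as f:   # persist after each new eval
            pickle.dump(cache, f)
    return cache[key]
```

Writing the whole cache to disk after every new evaluation is wasteful for fast objectives, but negligible next to a CFD run.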

If you use the DifferentialEvolutionSolver (warning: it's private and subject to change), then you can step the solver very easily; it's an iterator. For a single run-through it should be very rare that the fitness of a given vector is evaluated twice: only trial vectors are evaluated, and they're always novel. If you want to stop/restart then I understand the need to cache. TBH, using very expensive objective functions doesn't sound great for `differential_evolution` though; there are always a lot of function evaluations.
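For example (again, this imports a private class, so the path and behaviour may change between SciPy versions without warning): each `next()` advances the solver by one generation and yields the current best solution and its energy, so you could checkpoint between steps.

```python
import numpy as np
# private API: subject to change without deprecation warnings
from scipy.optimize._differentialevolution import DifferentialEvolutionSolver

def sphere(x):
    """Cheap stand-in for an expensive objective."""
    return float(np.sum(x ** 2))

solver = DifferentialEvolutionSolver(sphere, bounds=[(-5, 5)] * 3, seed=1)

# step generation by generation; save state between steps if needed
for generation in range(5):
    best_x, best_energy = next(solver)
```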

Custom strategy_funcs can, of course, be used for things that have nothing to do with genes. For example: sometimes - as the docs note - a user may want part of their candidates' data to be interpreted as integer data. How do you mutate or crossbreed integers and have them make sense? Well, that's really going to be task-dependent. Maybe the integer means "number of iterations" - if so, then perhaps simple interpolation is best. But maybe it's a category - in that case, interpolation is incoherent, and you should either keep it the same or randomly pick a new category. And if it is a category, and that category influences some other values in the candidate, then that may affect how you want to alter those values. Maybe if round(candidate[0]) is Category == 3, then you want the floating point value at candidate[1] to be between 1.0 and 10.0, but if it's Category == 5 then maybe you want candidate[1] to be between 1.0 and 5.0. Again, it's task-dependent.
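As a sketch of that category-aware idea (the mutation probability, the number of categories, and the per-category ranges are all invented for illustration):

```python
import numpy as np

def mutate_with_category(candidate, rng):
    """candidate[0] encodes a category; candidate[1] is a float whose
    valid range depends on that category.
    """
    trial = candidate.copy()
    # categories don't interpolate: either keep the current one
    # or jump to a randomly chosen new one
    if rng.uniform() < 0.2:
        trial[0] = rng.integers(0, 6)        # categories 0..5
    category = int(round(trial[0]))
    # the category constrains the companion float's range
    upper = 10.0 if category == 3 else 5.0
    trial[1] = rng.uniform(1.0, upper)
    return trial
```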

`differential_evolution` already has an integrality keyword. 
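For the simple "this variable is an integer" case, that public keyword covers it: `integrality` is a boolean array marking which decision variables are constrained to integer values.

```python
from scipy.optimize import differential_evolution

def objective(x):
    # x[0] is treated as an integer, x[1] as a float
    return (x[0] - 3) ** 2 + (x[1] - 0.5) ** 2

result = differential_evolution(
    objective,
    bounds=[(0, 10), (-2, 2)],
    integrality=[True, False],   # only the first variable is integer-valued
    seed=1,
)
```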

w.r.t. implementation, it just needs someone to do the programming and write tests. We always welcome new contributors for PRs.

A.