Thanks for the response! Sorry for any misunderstandings I might have had
about the current internal architecture :)
> The callable function would be completely
> responsible for generating trial vectors, i.e. doing the mutation AND
> recombination.
No issues with that!
> A possible call signature would be `strategy_func(candidate, population,
> rng=None)`
Perfectly fine! But, question:
> The strategy_func would be
> responsible for mutating (blending) members of the population together,
> doing the crossover/recombination itself, and returning a trial vector with
> shape (N,).
So it wouldn't be told what other candidate to perform recombination with -
it should pick recombination targets itself? I mean, that's workable, just
being clear on this.
> Note that all entries in population are currently numbers in the range
> [0, 1], and are scaled to 'actual values' using the lower and upper bounds.
> If such a feature is added it would be reasonable for the strategy_func to
> receive population values in their actual range, i.e. scaled from [0, 1] to
> [lowerlim, upperlim], sent to strategy_func, trial returned from func is
> then scaled back to [0, 1]. The to/from scaling would add some overhead.
I can't speak for others, but I'm fine with receiving it in the [0, 1]
range and then scaling it myself, to avoid the need for the stock functions
to take that slight overhead hit. But whatever your preference is works
for me.
> It's unclear what you meant by test functions returning a string of 1s ..
> There are no strings anywhere.
Bad phrasing on my part - I meant "returning an np.array of 1.0s"; I wasn't
referring to a literal string datatype.
An example would be something like:
----------
import sys
import random
import numpy as np

def strategy_func(candidate, population, rng=None):
    # Always propose a trial vector of all 1.0s, regardless of the population
    return np.ones(len(candidate))

def minimization_function(x):
    if np.all(x == 1.0):
        print("Test passed")
        sys.exit()
    else:
        return random.random()
----------
... which would be run with a low population size, ideally just one
candidate. So if it's working right, strategy_func returns a candidate
that's all 1.0s, the minimization function sees it, and the test passes.
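For completeness, the wiring might look something like the sketch below.
This is hypothetical on two counts: it assumes the proposed callable would
be accepted through the existing `strategy` keyword (which today only takes
strings), and it uses (0, 1) bounds so that a trial of all 1.0s maps to all
1.0s however the scaling question above is settled.
----------
from scipy.optimize import differential_evolution

differential_evolution(
    minimization_function,
    bounds=[(0.0, 1.0)] * 4,   # with (0, 1) bounds, 1.0 stays 1.0 either way
    strategy=strategy_func,    # hypothetical: the proposed callable
    popsize=5,                 # keep the population small
    polish=False,              # avoid extra objective calls from polishing
)
----------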
> If such a feature was added it would be limited to generation of the trial
> vector only.
Correct. My apologies if my phrasing sounded otherwise.
> It's not quite clear to me what your strategy function looks like, it'd be
> interesting to see an example.
Let me first describe an example from nature, before describing my case.
The human eye has three different types of cones used for colour vision.
These are based on different opsin proteins, each of which has specific
genes responsible for their production. In particular, the M and L (green
and red) cone opsin genes are extremely similar, 96% identical. This
probably arose due to a tandem duplication - two copies of the gene arise,
one after the other. When this occurs, the initial impact is simply an
increase in production of the original opsin protein. But now there are two
separate genes, and they can drift independently without one affecting the
other. One drifts toward shorter wavelengths and the other toward longer
wavelengths, and the organism gains the ability to distinguish red from
green - it can now see the difference between, say, ripe and unripe fruit,
or between a snake and a vine, conferring a significant survival advantage!
This couldn't have been possible without the tandem duplication event,
without one gene being able to turn into two. This cannot happen in scipy
as it exists now. But with a custom strategy_func, it could. If the coder
is representing "genes" in their project as clusters of floating point
numbers, they know how that data is laid out, and can include gene
duplication events, gene migration events, and so forth.
Now, as for my specific example at present (note: this is not the first
time I've wanted such a feature - I use scipy.optimize.minimize a lot, I
love it! - but it's the first time I've bothered to get on the mailing
list :) ):
One type of project I've used scipy.optimize.differential_evolution for
many times is developing 3d models for CFD (Computational Fluid Dynamics)
simulations - that is, to evolve optimal shapes for given tasks. Let's say
you wanted to evolve a wing - a candidate array might be of the format:
[wingspan, chord, thickness, taper_ratio, twist_angle, incidence_angle,
dihedral_angle, sweep_angle .... ]
You'd write a function to generate a 3d wing model from that set of
parameters, and then in your minimization function you generate the model,
pass it off to e.g. OpenFOAM to run the simulation, save the forces from
the output, and return a value based on those forces that represents the
performance of your wing.
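In code, that workflow is roughly the sketch below; build_wing_mesh() and
run_openfoam() are just hypothetical placeholders for the model-generation
and simulation steps described above.
----------
def wing_objective(x):
    # x = [wingspan, chord, thickness, taper_ratio, twist_angle, ...]
    mesh = build_wing_mesh(x)        # hypothetical parameterized generator
    if mesh is None:                 # e.g. the mesh self-intersected
        return 1e30                  # heavy penalty for aphysical models
    forces = run_openfoam(mesh)      # hypothetical wrapper around the solver
    # Lower is better: e.g. minimize drag for the lift being generated.
    return forces["drag"] / max(forces["lift"], 1e-9)
----------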
All well and good. And this is what I've done every time so far. But there
are a couple of problems. The first is obviously that the mesh can only be
altered in the specific ways you design it to be altered - it can't invent
something innovative. And secondly, writing a function to generate a mesh
from parameters can be surprisingly time-consuming and challenging - not
least because if the mesh happens to self-intersect, it'll produce an
aphysical model, and then the simulation can do all sorts of crazy things.
Every time you start a new project, you have to write a new parameterized
model-generation function from scratch.
So this time around I decided to try something new: I want to make a
*generic* optimizer, with no hard-coded model-generation function for each
task. You just provide an initial guess model, and the optimizer can change
it at will; all one has to supply is the constraints and objectives against
which the simulation results will be evaluated.
So then the question comes: how do you represent such a 3d model as a
candidate for evolution?
A naive approach would be, "well, a mesh is vertices and faces, so let's
just list the vertices, then list their face indices, and call that good."
But more than a couple of minutes' thought shows that this is a disastrous
idea. Indices are integers: mutating from one index to another is a
nonsensical change, and usually one that will produce a broken mesh. And
crossbreeding/recombining meshes also makes no sense - even crossing
vertices will often yield incoherent results, let alone crossbreeding face
indices!
Instead, I settled on an incremental generative approach. The first 14
floating point parameters of the candidate describe an initial anchor face
for the model, and then each subsequent group of 11 floating point
parameters describes a new face to be incrementally added to an edge of the
model (each of these groups of 11 parameters can be thought of as a gene).
That is to say, each edge in the model has an edge anchor ID associated
with it, and all still-available edges on which the model can grow are kept
in an unused_edge_id list. So for each gene (i.e. each new face), it
searches through the unused_edge_id list, finds the entry that most closely
matches its anchor ID, and builds itself there. To build a new face, it
grows a vector a fixed distance in-plane out of the midpoint of the edge
it's being attached to, and that vector is then rotated in-plane (i.e.
about the root face's normal) and out-of-plane (i.e. about the attachment
edge) to find where to add a new vertex. If the new vertex is close
enough to an existing vertex, it snaps to that vertex (allowing the model
to create closed shapes) - otherwise it adds a new vertex in place. A face
is then formed from the preexisting edge to the new vertex (or preexisting
vertex, in the case of snapping), any formerly unused edges are marked as
used, and any newly added edges have their anchor IDs added to the
unused_edge_id list.
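To make that concrete, the decoding step looks roughly like the sketch
below. The 14/11 split comes from the description above; the exact slot
holding each gene's anchor ID and the (anchor_id, edge) pairing in the list
are assumptions for illustration.
----------
import numpy as np

HEADER_LEN = 14   # parameters describing the initial anchor face
GENE_LEN = 11     # parameters per incrementally added face ("gene")

def split_genome(candidate):
    # Split the flat candidate into the anchor-face header and its genes.
    header = candidate[:HEADER_LEN]
    genes = np.reshape(candidate[HEADER_LEN:], (-1, GENE_LEN))
    return header, genes

def pick_growth_edge(gene_anchor_id, unused_edge_ids):
    # unused_edge_ids is assumed to be a list of (anchor_id, edge) pairs;
    # return the still-available edge whose anchor ID best matches the gene's.
    ids = np.array([anchor_id for anchor_id, _ in unused_edge_ids])
    best = int(np.argmin(np.abs(ids - gene_anchor_id)))
    return unused_edge_ids[best]
----------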
One of the 11 parameters for each gene is replication_count, so that a
model can evolve to make many copies of a given added face (each picking
the most similar remaining anchor ID, as per before).
Once all the "genes" in the candidate have been processed, the mesh is then
run through bpy to use Blender to extrude it by a given thickness (based on
each face's material properties, which are also among the 11 floating point
parameters), and self-collision checks are also performed. Any defective
model causes the minimization function to return immediately with an
extremely high value (i.e. a failure). Otherwise the model then goes on to
OpenFOAM for a CFD simulation as usual.
(Current status: model generation seems nearly debugged, at least for
simple models. Haven't gotten into bpy extrusion or self-collision tests
yet. But I figured I should open this conversation re: scipy changes now
because they might take some time).
Given the above, you can see how gene duplication (e.g. copying one cluster
of 11 floating point parameters over a different one) followed by genetic
drift has the potential to do the same sort of thing that happens in
biological systems: allowing a piece of functionality to develop into new
pieces of functionality while simultaneously not destroying the original
functionality. If the replication_count on the gene for a given facet
description is 5, perhaps it overwrites some other, unimportant gene (genes
expressing really tiny or unrealistically elongated facets late in the
genome could be preferentially targeted for overwriting), leaving us with
one gene having a replication_count of 3 and the other a replication_count
of 2 - i.e., the same net result as before. But now these two distinct
genes can drift apart from each other and develop into new types of
functionality.
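A minimal sketch of what such a duplication event might look like inside a
custom strategy function - assuming, purely for illustration, that
replication_count sits in the last slot of each 11-parameter gene, that the
gene to overwrite is picked at random, and glossing over the ordinary
DE-style blending that would happen alongside it:
----------
import numpy as np

HEADER_LEN = 14
GENE_LEN = 11
REP_IDX = GENE_LEN - 1    # assumed slot of replication_count within a gene

def duplicating_strategy(candidate, population, rng=None):
    rng = np.random.default_rng(rng)
    trial = np.array(candidate, dtype=float)   # copy of the candidate

    if rng.random() < 0.05:                    # rare gene-duplication event
        genes = trial[HEADER_LEN:].reshape(-1, GENE_LEN)  # view into trial
        src = rng.integers(len(genes))
        dst = rng.integers(len(genes))         # gene that gets overwritten
        if src != dst:
            count = genes[src, REP_IDX]
            genes[dst] = genes[src]            # duplicate the gene...
            genes[src, REP_IDX] = count / 2.0  # ...and split its replication
            genes[dst, REP_IDX] = count / 2.0  # count between the two copies
    return trial
----------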
(As a side note, Scipy's inability to save and resume the population during
differential_evolution optimization used to be really annoying, given how
long CFD optimization tasks take. However, I did find a kind of cheap hack
that I've been using ever since: because the random number generator is
deterministic, I simply have the minimization function create a hash value
for the candidate, and store the results of the simulation in a hash table,
which I save to disk. Then when I need to resume, I just load up the hash
table, and if a candidate has been encountered before, it just
immediately returns the previous run's simulation results rather than
re-running the simulation. It's an awkward hack, and wouldn't work on tasks
where the minimization function is really fast, but for slow tasks like
CFD, it works :) )
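For anyone curious, the hack is essentially memoization keyed on the
candidate's raw bytes, persisted to disk. A rough sketch (the file name and
run_cfd_simulation() are placeholders, and it relies on seeding the
optimizer identically on every run):
----------
import os
import pickle

CACHE_PATH = "sim_cache.pkl"          # hypothetical location for the table

if os.path.exists(CACHE_PATH):
    with open(CACHE_PATH, "rb") as fh:
        cache = pickle.load(fh)
else:
    cache = {}

def cached_objective(x):
    key = x.tobytes()                 # exact-bytes key for this candidate
    if key in cache:
        return cache[key]             # resumed run: reuse the old result
    value = run_cfd_simulation(x)     # hypothetical slow CFD evaluation
    cache[key] = value
    with open(CACHE_PATH, "wb") as fh:
        pickle.dump(cache, fh)        # persist after every new evaluation
    return value
----------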
Custom strategy_funcs can of course be used for things that have nothing to
do with genes. For example: sometimes - as the docs note - a user may want
part of their candidates' data to be interpreted as integer data. How do
you mutate or crossbreed integers and have them make sense? Well, that's
really going to be task-dependent. Maybe the integer means "number of
iterations" - if so, then perhaps simple interpolation is best. But maybe
it's a category - in that case, interpolation is incoherent, and you should
either keep it the same or randomly pick a new category. And if it is a
category, and that category influences some other values in the candidate,
then that may affect how you want to alter those values. Maybe if
round(candidate[0]) == 3 (Category 3), you want the floating point value at
candidate[1] to be between 1.0 and 10.0, but if it's Category 5 then maybe
you want candidate[1] to be between 1.0 and 5.0. Again, it's task-dependent.
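A sketch of that kind of task-specific handling inside a strategy function -
illustration only: it assumes the strategy sees values in their actual
ranges (the scaling question above), and the category IDs and ranges are
just the ones from this example:
----------
import numpy as np

def category_aware_strategy(candidate, population, rng=None):
    rng = np.random.default_rng(rng)
    trial = np.array(candidate, dtype=float)

    # candidate[0] is a category: keep it, or occasionally jump to a new
    # category outright - never interpolate between category IDs.
    if rng.random() < 0.1:
        trial[0] = rng.integers(0, 6)

    # candidate[1] depends on the chosen category, as described above.
    ranges = {3: (1.0, 10.0), 5: (1.0, 5.0)}
    lo, hi = ranges.get(int(round(trial[0])), (1.0, 5.0))  # default assumed
    trial[1] = np.clip(trial[1] + rng.normal(0.0, 0.5), lo, hi)

    return trial
----------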
Honestly, there's no limit to what one could do with access to custom
strategy_func implementations. If one wanted, they could outright train a
neural network on candidate values and how well those candidates perform,
and let the neural network mutate candidates, so that the changes aren't
random, but are rather guided by complex statistics about "what sort of
alterations to a previously successful candidate are most likely to make an
even more successful candidate?". Think, say, a protein-folding
optimization task.
(I have no plans to do such a thing personally, but it's just an example of
how far one could take this if needed for complicated, slow tasks)
> > If there are no plans to implement it, I might (or might not, depending)
> > be able to find the time to do so. Though I know 98% of the time required
> > would not be coding / testing, but rather figuring out how to set up and
> > develop for the scipy test environment and figuring out how to contribute
> > the changes
>
> The time split would probably be 5% setup, 30% writing code, 30% writing
> test cases, 25% polishing.
I like your optimism, and I'm sure that's how it would go for someone with
your experience, but the last time I contributed to a project on GitHub,
just figuring out how to create the pull request for the finished code took
40 minutes. ;) I've learned pessimism over the years about how long it can
take to set up dev environments and learn my way around them.
But maybe my pessimism is unjustified here. :)
- kv, Karen