Any followup on this? Thanks! :) - kv, Karen

On Tue., 15 Aug. 2023 at 21:30, Karen Róbertsdóttir < karen.robertsdottir@gmail.com> wrote:
Thanks for the response! Sorry for any misunderstandings I might have had about the current internal architecture :)
The callable function would be completely responsible for generating trial vectors, i.e. doing the mutation AND recombination.
No issues with that!
A possible call signature would be `strategy_func(candidate, population, rng=None)`
Perfectly fine! But, question:
The strategy_func would be responsible for mutating (blending) members of the population together, doing the crossover/recombination itself, and returning a trial vector with shape (N,).
So it wouldn't be told what other candidate to perform recombination with - it should pick recombination targets itself? I mean, that's workable, just being clear on this.
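For concreteness, here's a minimal sketch of what a user-supplied function under that signature could look like - a reimplementation of classic DE/rand/1/bin that picks its own recombination targets. The F/CR constants, and the assumption that candidate is the current (N,) parameter vector while population is the full (M, N) array, are my own, not scipy's:

```python
import numpy as np

def rand1bin_strategy(candidate, population, rng=None):
    # Hypothetical user strategy under the proposed signature:
    # DE/rand/1 mutation followed by binomial crossover.
    rng = np.random.default_rng(rng)
    F, CR = 0.8, 0.7  # mutation factor and crossover rate (assumed values)
    m = len(population)
    # pick three distinct population members as recombination targets
    a, b, c = population[rng.choice(m, size=3, replace=False)]
    mutant = a + F * (b - c)
    # binomial crossover against the current candidate
    cross = rng.random(len(candidate)) < CR
    cross[rng.integers(len(candidate))] = True  # at least one mutant gene
    trial = np.where(cross, mutant, candidate)
    return np.clip(trial, 0.0, 1.0)  # stay in the internal [0, 1] range
```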
Note that all entries in population are currently numbers in the range [0, 1], and are scaled to 'actual values' using the lower and upper bounds. If such a feature is added, it would be reasonable for the strategy_func to receive population values in their actual range, i.e. scaled from [0, 1] to [lowerlim, upperlim] before being sent to strategy_func, with the trial returned from the func then scaled back to [0, 1]. The to/from scaling would add some overhead.
I can't speak for others, but I'm fine with receiving it in the [0, 1] range and then scaling it myself, to avoid the need for the stock functions to take that slight overhead hit. But whatever your preference is works for me.
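If the [0, 1] convention is kept, the scaling a custom strategy_func would need to do itself is trivial; a sketch (the lowerlim/upperlim names just follow the wording above):

```python
import numpy as np

def to_actual(x, lowerlim, upperlim):
    # map unit-interval population values to their real parameter range
    return lowerlim + np.asarray(x) * (upperlim - lowerlim)

def to_unit(x, lowerlim, upperlim):
    # map real-range values back to the internal [0, 1] representation
    return (np.asarray(x) - lowerlim) / (upperlim - lowerlim)
```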
It's unclear what you meant by test functions returning a string of 1s... There are no strings anywhere.
Bad phrasing. "returning a np.array of 1.0s". I wasn't referring to a literal string datatype.
An example would be something like:

----------
import sys
import random

import numpy as np

def strategy_func(candidate, population, rng=None):
    return np.ones(len(candidate))

def minimization_function(x):
    if np.all(x == 1.0):
        print("Test passed")
        sys.exit()
    else:
        return random.random()
----------
... which would be run with a low population size, ideally just one candidate. So if it's working right, strategy_func returns a candidate that's all 1.0, the minimization function sees it, and passes the test.
If such a feature was added it would be limited to generation of the trial vector only.
Correct. My apologies if my phrasing sounded otherwise.
It's not quite clear to me what your strategy function looks like, it'd be interesting to see an example.
Let me first describe an example from nature, before describing my case.
The human eye has three different types of cones used for colour vision. These are based on different opsin proteins, each of which has specific genes responsible for their production. In particular, the M and L (green and red) cone opsin genes are extremely similar, 96% identical. This probably arose due to a tandem duplication - two copies of the gene arise, one after the other. When this occurs, the initial impact is simply increasing the amount of the original opsin protein production. But now there's two separate genes, and they can drift independently without one affecting the other. So now one drifts toward shorter wavelengths and the other drifts toward longer wavelengths. Now the organism has the ability to distinguish red from green, and can now see the difference between, say, ripe and unripe fruit, or between a snake and a vine - conferring a significant survival advantage!
This couldn't have been possible without the tandem duplication event, without one gene being able to turn into two. This cannot happen in scipy as it exists now. But with a custom strategy_func, it could. If the coder is representing "genes" in whatever their project is as clusters of floating point numbers, they know how they're representing that data, and can include gene duplication events, gene migration events, and so forth.
Now, as for my specific example at present (note: this is not the first time I've wanted such a feature - I use scipy.optimize.minimize a lot, I love it! - but it's the first time I've bothered to get on the mailing list :) ):
One type of project I've used scipy.optimize.differential_evolution for many times is developing 3d models for CFD (Computational Fluid Dynamics) simulations - that is, to evolve optimal shapes for given tasks. Let's say you wanted to evolve a wing - a candidate array might be of the format:
[wingspan, chord, thickness, taper_ratio, twist_angle, incidence_angle, dihedral_angle, sweep_angle .... ]
You'd write a function to generate a 3d wing model from that set of parameters, and then in your minimization function, you generate the model, pass it off to e.g. OpenFOAM to run the simulation, save the forces from the output, and then return a value based on those forces that represents the performance of your wing.
All well and good. And this is what I've done every time so far. But there are a couple of problems. The first is obviously that the mesh can only be altered along the specific ways you design it to be altered - it can't invent something innovative. And secondly, writing a function to generate a mesh from parameters can be surprisingly time-consuming and challenging - not least because if the mesh happens to accidentally self-intersect, it'll generate an aphysical model, and then the simulation can do all sorts of crazy things. Every time you start a new project, you have to write a new parameterized model-generation function from scratch.
So this time around I decided to try something new: I want to make a *generic* optimizer. Where there's no hard-coded model-generation function at all for each task - where you can just provide an initial guess model, and it can change it at will. Where one only has to provide constraints and objectives for the simulation on which the results will be evaluated. So then the question comes: how do you represent such a 3d model as a candidate for evolution?
A naive approach would be, "well, a mesh is vertices and faces, so let's just list the vertices, then list their face indices, and call that good." But more than a couple minutes' thought shows that this is a disastrous idea. Indices are integers. Mutating from one index to another is a nonsensical change, and usually one that will make a broken mesh. And crossbreeding/recombining meshes also makes no sense - even crossing vertices will often yield incoherent results, let alone crossbreeding face indices!
Instead, I settled on an incremental generative approach. The first 14 floating point parameters of the candidate describe an initial anchor face for the model, and each subsequent group of 11 floating point parameters describes a new face to be incrementally added to an edge of the model (each of these groups of 11 parameters can be thought of as a gene).

That is to say, each edge in the model has an edge anchor ID associated with it, and all still-available edges on which the model can grow are kept in an unused_edge_id list. So for each gene (e.g. each new face), it searches through the unused_edge_id list, finds the one that most closely matches its anchor ID, and builds itself there.

To build a new face, it grows a vector a fixed distance in-plane out of the midpoint of the edge it's being attached to, and that vector is then rotated in-plane (e.g. around the root face's normal) and out-of-plane (e.g. around the attachment edge) to find where to add a new vertex. If the new vertex is close enough to an existing vertex, it snaps to that vertex (allowing the model to create closed shapes) - otherwise it adds a new vertex in place. A face is then formed from the preexisting edge to the new vertex (or preexisting vertex, in the case of snapping), any formerly unused edges are marked as used, and any newly added edges have their anchor IDs added to the unused_edge_id list.
One of the 11 parameters for each gene is replication_count, so that a model can evolve to make many copies of a given added face (each picking the most similar remaining anchor ID, as per before).
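To make the encoding concrete, here's a stripped-down sketch of the candidate decoding and edge matching described above. GENE_LEN, ANCHOR_PARAMS, and the function names are my own placeholders, and all the actual geometry is omitted:

```python
import numpy as np

GENE_LEN = 11       # per-face parameter count, from my encoding above
ANCHOR_PARAMS = 14  # parameters describing the initial anchor face

def split_genes(candidate):
    # Split the flat candidate into the anchor-face block and the
    # 11-parameter "genes", one per incrementally added face.
    candidate = np.asarray(candidate)
    anchor = candidate[:ANCHOR_PARAMS]
    rest = candidate[ANCHOR_PARAMS:]
    n_genes = len(rest) // GENE_LEN
    genes = rest[:n_genes * GENE_LEN].reshape(n_genes, GENE_LEN)
    return anchor, genes

def pick_edge(gene_anchor_id, unused_edge_ids):
    # Pick (and consume) the still-unused edge whose anchor ID most
    # closely matches this gene's anchor ID.
    idx = int(np.argmin(np.abs(np.asarray(unused_edge_ids) - gene_anchor_id)))
    return unused_edge_ids.pop(idx)
```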
Once all the "genes" in the candidate have been processed, the mesh is then run through bpy to use Blender to extrude it by a given thickness (based on each face's material properties, which are also among the 11 floating point parameters), and self-collision checks are also performed. Any defective models return from the minimization function immediately (returning an extremely high value, as a failure). Otherwise the model then goes on to OpenFOAM for a CFD simulation as usual.
(Current status: model generation seems nearly debugged, at least for simple models. Haven't gotten into bpy extrusion or self-collision tests yet. But I figured I should open this conversation re: scipy changes now because they might take some time).
Given the above, you can see how gene duplication (e.g. copying a cluster of 11 floating point parameters and overwriting a different one) followed by genetic drift has the potential for doing the same sort of thing that happens in biological systems: allowing a piece of functionality to develop into new pieces of functionality while simultaneously not destroying the original functionality. If the replication_count on the gene for a given facet description is 5, perhaps it overwrites some unimportant other gene (genes expressing really tiny or unrealistically elongated facets late in the genome could be preferentially targeted for overwriting) and leaves us with one having a replication_count of 3 and the other with a replication_count of 2 - aka, the same net result as before. But now these two distinct genes can drift apart from each other and develop into new types of functionality.
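As a sketch, a tandem-duplication operator under this encoding might look like the following. The 14/11 header and gene sizes come from my encoding above; for simplicity the overwritten gene is chosen uniformly here, rather than preferentially targeting unimportant genes:

```python
import numpy as np

HEADER, GENE_LEN = 14, 11  # sizes from my encoding above

def duplicate_gene(candidate, rng):
    # "Tandem duplication" mutation: copy one gene's parameter block over
    # another gene's block, mimicking biological gene duplication. Both
    # the source and the overwritten gene are picked uniformly at random.
    c = np.array(candidate, dtype=float)
    n_genes = (len(c) - HEADER) // GENE_LEN
    src, dst = rng.choice(n_genes, size=2, replace=False)
    s, d = HEADER + src * GENE_LEN, HEADER + dst * GENE_LEN
    c[d:d + GENE_LEN] = c[s:s + GENE_LEN]
    return c
```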
(As a side note, Scipy's inability to save and resume the population during differential_evolution optimization used to be really annoying, given how long CFD optimization tasks take. However, I did find a kind of cheap hack that I've been using ever since - since the random number generator is deterministic, I simply have the minimization function create a hash value for the candidate, and store the results of the simulation in a hash table, which I save to disk. Then when I need to resume, I just load up the hash table, and if a candidate has been encountered before, it just immediately returns the previous run's simulation results rather than re-running the simulation. It's an awkward hack, and wouldn't work on tasks where the minimization function is really fast, but for slow tasks like CFD, it works :) )
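For anyone curious, the hack boils down to memoizing the minimization function on a hash of the candidate; a simplified sketch, where the function names and the pickle-to-disk persistence are just illustrative:

```python
import hashlib
import pickle
import numpy as np

def candidate_key(x):
    # Stable hash of the candidate; rounding guards against spurious
    # last-bit float differences between runs.
    return hashlib.sha256(np.round(np.asarray(x), 12).tobytes()).hexdigest()

def cached_objective(x, cache, run_simulation, path=None):
    # Memoize the expensive simulation; optionally persist the cache so a
    # deterministic optimization run can be "resumed" by replaying it.
    key = candidate_key(x)
    if key not in cache:
        cache[key] = run_simulation(x)  # the slow CFD run
        if path is not None:
            with open(path, "wb") as f:
                pickle.dump(cache, f)
    return cache[key]
```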
Custom strategy_funcs can of course be used for things that have nothing to do with genes. For example: sometimes - as the docs note - a user may want part of their candidates' data to be interpreted as integer data. How do you mutate or crossbreed integers and have them make sense? Well, that's really going to be task-dependent. Maybe the integer means "number of iterations" - if so, then perhaps simple interpolation is best. But maybe it's a category - in that case, interpolation is incoherent, and you should either keep it the same or randomly pick a new category. And if it is a category, and that category influences some other values in the candidate, then that may affect how you want to alter those values. Maybe if round(candidate[0]) gives Category 3, then you want the floating point value at candidate[1] to be between 1.0 and 10.0, but if it gives Category 5 then maybe you want candidate[1] to be between 1.0 and 5.0. Again, it's task-dependent.
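As a tiny illustration of the category case, a strategy could handle a categorical slot like this (a hypothetical helper; the probability of switching categories is an assumed tuning knob):

```python
import numpy as np

def mutate_category(value, n_categories, rng, p_change=0.2):
    # Categorical slot: interpolation is incoherent here, so either keep
    # the current category or jump to a uniformly random new one.
    if rng.random() < p_change:
        return float(rng.integers(n_categories))
    return float(np.round(value))
```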
Honestly, there's no limits to what one could do with access to custom strategy_func implementations. If one wanted, they could outright train a neural network on candidate values and how well those candidates perform, and let the neural network mutate candidates, so that the changes aren't random, but are rather guided by complex statistics about "what sort of alterations to a previously successful candidate are most likely to make an even more successful candidate?". Think, say, a protein-folding optimization task.
(I have no plans to do such a thing personally, but it's just an example of how far one could take this if needed for complicated, slow tasks)
If there are no plans to implement it, I might (or might not, depending) be able to find the time to do so. Though I know 98% of the time required would not be coding / testing, but rather figuring out how to set up and develop for the scipy test environment and figuring out how to contribute the changes.
The time split would probably be 5% setup, 30% writing code, 30% writing test cases, 25% polishing.
I like your optimism, and I'm sure it would be for someone with your experience, but the last time I contributed to a project on Github, just figuring out how to create the pull request for the finished code took 40 minutes. ;) I've learned pessimism over the years about how long it can take to set up dev environments and learn my way around them. But maybe my pessimism is unjustified here. :)
- kv, Karen