Re: DifferentialEvolution: custom mutation and recombination functions?
Thanks for the response! Sorry for any misunderstandings I might have had about the current internal architecture :)
The callable function would be completely responsible for generating trial vectors, i.e. doing the mutation AND recombination.
No issues with that!
A possible call signature would be `strategy_func(candidate, population, rng=None)`
Perfectly fine! But, question:
The strategy_func would be responsible for mutating (blending) members of the population together, doing the crossover/recombination itself, and returning a trial vector with shape (N,).
So it wouldn't be told what other candidate to perform recombination with - it should pick recombination targets itself? I mean, that's workable, just being clear on this.
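Just to make sure I understand the proposed contract, here's a minimal sketch of what such a function might look like - the rand/1/bin-style logic and the constants are purely illustrative on my part, not a proposed API:

```python
import numpy as np

def strategy_func(candidate, population, rng=None):
    """Illustrative rand/1/bin-style strategy under the proposed signature.

    candidate  : (N,) array - the target vector for this trial
    population : (M, N) array - all current population members
    rng        : numpy Generator, or None
    """
    if rng is None:
        rng = np.random.default_rng()
    mutation, crossover = 0.7, 0.9  # illustrative constants

    # The point in question: the strategy picks its own partners.
    a, b, c = population[rng.choice(len(population), 3, replace=False)]
    mutant = a + mutation * (b - c)

    # Binomial crossover between the target and the mutant.
    cross = rng.random(len(candidate)) < crossover
    cross[rng.integers(len(candidate))] = True  # ensure at least one gene swaps
    return np.where(cross, mutant, candidate)   # trial vector, shape (N,)
```

(No bounds handling here; whether scipy clips the returned trial or leaves that to the user would depend on the final design.)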
Note that all entries in population are currently numbers in the range [0, 1], and are scaled to 'actual values' using the lower and upper bounds. If such a feature is added it would be reasonable for the strategy_func to receive population values in their actual range, i.e. scaled from [0, 1] to [lowerlim, upperlim], sent to strategy_func, trial returned from func is then scaled back to [0, 1]. The to/from scaling would add some overhead.
I can't speak for others, but I'm fine with receiving it in the [0, 1] range and then scaling it myself, to avoid the need for the stock functions to take that slight overhead hit. But whatever your preference is works for me.
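(For concreteness, the to/from scaling in question is just a linear map between [0, 1] and [lowerlim, upperlim]; function names here are my own, purely illustrative:)

```python
import numpy as np

def scale_to_bounds(unit_values, lower, upper):
    """Map values from the internal [0, 1] range to [lower, upper]."""
    return lower + np.asarray(unit_values) * (upper - lower)

def scale_to_unit(actual_values, lower, upper):
    """Map values from [lower, upper] back to the internal [0, 1] range."""
    return (np.asarray(actual_values) - lower) / (upper - lower)

# Round trip: [0, 1] -> actual -> [0, 1]
lower = np.array([0.0, -5.0])
upper = np.array([10.0, 5.0])
unit_pop = np.array([[0.5, 0.25]])
actual = scale_to_bounds(unit_pop, lower, upper)  # [[5.0, -2.5]]
```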
It's unclear what you meant by test functions returning a string of 1s... There are no strings anywhere.
Bad phrasing. "returning a np.array of 1.0s". I wasn't referring to a literal string datatype. An example would be something like:

----------
import sys
import random

import numpy as np

def strategy_func(candidate, population, rng=None):
    return np.ones(len(candidate))

def minimization_function(x):
    if np.all(x == 1.0):
        print("Test passed")
        sys.exit()
    else:
        return random.random()
----------

... which would be run with a low population size, ideally just one candidate. So if it's working right, strategy_func returns a candidate that's all 1.0, the minimization function sees it, and the test passes.
If such a feature was added it would be limited to generation of the trial vector only.
Correct. My apologies if my phrasing sounded otherwise.
It's not quite clear to me what your strategy function looks like, it'd be interesting to see an example.
Let me first describe an example from nature, before describing my case.

The human eye has three different types of cones used for colour vision. These are based on different opsin proteins, each of which has specific genes responsible for their production. In particular, the M and L (green and red) cone opsin genes are extremely similar, 96% identical. This probably arose due to a tandem duplication - two copies of the gene arise, one after the other. When this occurs, the initial impact is simply an increase in the amount of the original opsin protein produced. But now there are two separate genes, and they can drift independently without one affecting the other. So one drifts toward shorter wavelengths and the other toward longer wavelengths. Now the organism can distinguish red from green, and can see the difference between, say, ripe and unripe fruit, or between a snake and a vine - conferring a significant survival advantage!

This couldn't have been possible without the tandem duplication event, without one gene being able to turn into two. This cannot happen in scipy as it exists now. But with a custom strategy_func, it could. If coders are representing "genes" in their project as clusters of floating point numbers, they know how they're representing that data, and can include gene duplication events, gene migration events, and so forth.

Now, as for my specific example at present (note: this is not the first time I've wanted such a feature - I use scipy.optimize.minimize a lot, I love it! - but it's the first time I've bothered to get on the mailing list :) ):

One type of project I've used scipy.optimize.differential_evolution for many times is developing 3d models for CFD (Computational Fluid Dynamics) simulations - that is, to evolve optimal shapes for given tasks. Let's say you wanted to evolve a wing - a candidate array might be of the format:

[wingspan, chord, thickness, taper_ratio, twist_angle, incidence_angle, dihedral_angle, sweep_angle .... ]

You'd write a function to generate a 3d wing model from that set of parameters, and then in your minimization function, you generate the model, pass it off to e.g. OpenFOAM to run the simulation, save the forces from the output, and then return a value based on those forces that represents the performance of your wing.

All well and good. And this is what I've done every time so far. But there are a couple of problems. The first is obviously that the mesh can only be altered in the specific ways you design it to be altered - it can't invent something innovative. And secondly, writing a function to generate a mesh from parameters can be surprisingly time-consuming and challenging. If the mesh happens to accidentally self-intersect, it'll generate an aphysical model, and then the simulation can do all sorts of crazy things. And every time you start a new project, you have to write a new parameterized model-generation function from scratch.

So this time around I decided to try something new: I want to make a *generic* optimizer. Where there's no hard-coded model-generation function at all for each task - where you can just provide an initial guess model, and it can change it at will. Where one only has to provide the constraints and objectives on which the simulation results will be evaluated. So then the question comes: how do you represent such a 3d model as a candidate for evolution?

A naive approach would be, "well, a mesh is vertices and faces, so let's just list the vertices, then list their face indices, and call that good." But more than a couple of minutes' thought shows that this is a disastrous idea. Indices are integers. Mutating from one index to another is a nonsensical change, and usually one that will make a broken mesh. And crossbreeding/recombining meshes also makes no sense - even crossing vertices will often yield incoherent results, let alone crossbreeding face indices!

Instead, I settled on an incremental generative approach. The first 14 floating point parameters of the candidate describe an initial anchor face for the model, and then each subsequent group of 11 floating point parameters describes a new face to be incrementally added to an edge of the model (each of these groups of 11 parameters can be thought of as a gene). That is to say, each edge in the model has an edge anchor ID associated with it, and all still-available edges on which the model can grow are in an unused_edge_id list. So each gene (e.g. each new face) searches through the unused_edge_id list, finds the one that most closely matches its anchor ID, and builds itself there. To build a new face, it grows a vector a fixed distance in-plane out of the midpoint of the edge it's being attached to, and that vector is then rotated in-plane (e.g. around the root face's normal) and out-of-plane (e.g. around the attachment edge), to find where to add a new vertex. If the new vertex is close enough to an existing vertex, it snaps to that vertex (allowing the model to create closed shapes) - otherwise it adds a new vertex in place. A face is then formed from the preexisting edge to the new vertex (or preexisting vertex, in the case of snapping), any formerly unused edges are marked as used, and any newly added edges have their anchor IDs added to the unused_edge_id list.

One of the 11 parameters for each gene is replication_count, so that a model can evolve to make many copies of a given added face (each picking the most similar remaining anchor ID, as before).

Once all the "genes" in the candidate have been processed, the mesh is run through bpy to use Blender to extrude it by a given thickness (based on each face's material properties, which are also among the 11 floating point parameters), and self-collision checks are also performed. Any defective models return from the minimization function immediately (returning an extremely high value, as a failure). Otherwise the model then goes on to OpenFOAM for a CFD simulation as usual.

(Current status: model generation seems nearly debugged, at least for simple models. Haven't gotten into bpy extrusion or self-collision tests yet. But I figured I should open this conversation re: scipy changes now because they might take some time.)

Given the above, you can see how gene duplication (e.g. copying a cluster of 11 floating point parameters and overwriting a different one) followed by genetic drift has the potential for doing the same sort of thing that happens in biological systems: allowing a piece of functionality to develop into new pieces of functionality while simultaneously not destroying the original functionality. If the replication_count on the gene for a given facet description is 5, perhaps it overwrites some unimportant other gene (genes expressing really tiny or unrealistically elongated facets late in the genome could be preferentially targeted for overwriting) and leaves us with one copy having a replication_count of 3 and the other a replication_count of 2 - aka, the same net result as before. But now these two distinct genes can drift apart from each other and develop into new types of functionality.

(As a side note, Scipy's inability to save and resume the population during differential_evolution optimization used to be really annoying, given how long CFD optimization tasks take. However, I did find a kind of cheap hack that I've been using ever since - since the random number generator is deterministic, I simply have the minimization function create a hash value for the candidate, and store the results of the simulation in a hash table, which I save to disk. Then when I need to resume, I just load up the hash table, and if a candidate has been encountered before, it immediately returns the previous run's simulation results rather than re-running the simulation. It's an awkward hack, and wouldn't work on tasks where the minimization function is really fast, but for slow tasks like CFD, it works :) )

Custom strategy_funcs can of course be used for things that have nothing to do with genes. For example: sometimes - as the docs note - a user may want part of their candidates' data to be interpreted as integer data. How do you mutate or crossbreed integers and have them make sense? Well, that's really going to be task-dependent. Maybe the integer means "number of iterations" - if so, then perhaps simple interpolation is best. But maybe it's a category - in that case, interpolation is incoherent, and you should either keep it the same or randomly pick a new category. And if it is a category, and that category influences some other values in the candidate, then that may affect how you want to alter those values. Maybe if round(candidate[0]) == 3 (Category 3), you want the floating point value at candidate[1] to be between 1.0 and 10.0, but if it's Category 5 then maybe you want candidate[1] to be between 1.0 and 5.0. Again, it's task-dependent.

Honestly, there's no limit to what one could do with access to custom strategy_func implementations. If one wanted, one could outright train a neural network on candidate values and how well those candidates perform, and let the neural network mutate candidates, so that the changes aren't random, but are rather guided by complex statistics about "what sort of alterations to a previously successful candidate are most likely to make an even more successful candidate?" Think, say, a protein-folding optimization task.

(I have no plans to do such a thing personally, but it's just an example of how far one could take this if needed for complicated, slow tasks.)
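For anyone who wants to replicate the caching hack described a few paragraphs up: it's essentially disk-backed memoization keyed on the candidate's bytes. A simplified sketch (the file name, pickle format, and per-call save are my own choices here):

```python
import os
import pickle

import numpy as np

CACHE_FILE = "sim_cache.pkl"  # hypothetical path

def load_cache():
    """Load previously saved results, or start fresh."""
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "rb") as f:
            return pickle.load(f)
    return {}

def save_cache(cache):
    with open(CACHE_FILE, "wb") as f:
        pickle.dump(cache, f)

def cached_objective(expensive_func):
    """Wrap a slow objective so repeated candidates reuse stored results."""
    cache = load_cache()
    def wrapper(x):
        # Bytes of the candidate array make a cheap, exact hash key.
        key = np.asarray(x, dtype=np.float64).tobytes()
        if key not in cache:
            cache[key] = expensive_func(x)
            save_cache(cache)  # persist after every new evaluation
        return cache[key]
    return wrapper
```

Since the RNG is deterministic, a resumed run regenerates the same candidates and hits the cache instead of re-running the simulations.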
If there are no plans to implement it, I might (or might not, depending) be able to find the time to do so. Though I know 98% of the time required would not be coding / testing, but rather figuring out how to set up and develop in the scipy test environment and how to contribute the changes.
The time split would probably be 5% setup, 30% writing code, 30% writing test cases, 25% polishing.
I like your optimism, and I'm sure it would be that way for someone with your experience, but the last time I contributed to a project on GitHub, just figuring out how to create the pull request for the finished code took 40 minutes. ;) I've learned pessimism over the years about how long it can take to set up dev environments and learn my way around them. But maybe my pessimism is unjustified here. :) - kv, Karen
Any followup on this? Thanks! :)
- kv, Karen

On Tue., 15 Aug. 2023 at 21:30, Karen Róbertsdóttir <karen.robertsdottir@gmail.com> wrote:
On Wed, Aug 16, 2023, 22:23 Karen Róbertsdóttir <karen.robertsdottir@gmail.com> wrote:
Perfectly fine! But, question:
The strategy_func would be responsible for mutating (blending) members of the population together, doing the crossover/recombination itself, and returning a trial vector with shape (N,).
So it wouldn't be told what other candidate to perform recombination with - it should pick recombination targets itself? I mean, that's workable, just being clear on this.
The strategy function would be solely responsible for creating a trial vector. It could do anything it wanted, so long as it returned a trial vector with the same shape as the problem description. Whether the strategy function was sensible would be solely at the discretion of the user. The fitness of the trial vector is determined outside the strategy function.
I can't speak for others, but I'm fine with receiving it in the [0, 1] range and then scaling it myself, to avoid the need for the stock functions to take that slight overhead hit. But whatever your preference is works for me.
On further reflection it makes sense to supply the population in the [bounds.lb, bounds.ub] range. It would be a copy of the population, so the original array isn't overwritten by the user.

(As a side note, Scipy's inability to save and resume the population during differential_evolution optimization used to be really annoying, given how long CFD optimization tasks take. However, I did find a kind of cheap hack that I've been using ever since - since the random number generator is deterministic, I simply have the minimization function create a hash value for the candidate, and store the results of the simulation in a hash table, which I save to disk. Then when I need to resume, I just load up the hash table, and if a candidate has been encountered before, it just immediately returns the previous run's simulation results rather than re-running the simulation. It's an awkward hack, and wouldn't work on tasks where the minimization function is really fast, but for slow tasks like CFD, it works :) )
If you use the DifferentialEvolutionSolver (warning, it's private and subject to change), then you can step the solver very easily; it's an iterator. For a single run through it should be very rare that the fitness of a given vector is evaluated twice - only trial vectors are evaluated, and they're always novel. If you want to stop/restart then I understand the need to cache. TBH, using very expensive objective functions doesn't sound great for differential_evolution, though; there's always a lot of function evaluations.

Custom strategy_funcs can of course be used for things that have nothing to do with genes. For example: sometimes - as the docs note - a user may want part of their candidates' data to be interpreted as integer data. How do you mutate or crossbreed integers and have them make sense? Well, that's really going to be task-dependent. Maybe the integer means "number of iterations" - if so, then perhaps simple interpolation is best. But maybe it's a category - in that case, interpolation is incoherent, and you should either keep it the same or randomly pick a new category. And if it is a category, and that category influences some other values in the candidate, then that may affect how you want to alter those values. Maybe if round(candidate[0]) == 3 (Category 3), you want the floating point value at candidate[1] to be between 1.0 and 10.0, but if it's Category 5 then maybe you want candidate[1] to be between 1.0 and 5.0. Again, it's task-dependent.
`differential_evolution` already has an integrality keyword. W.r.t. implementation, it just needs someone to do the programming and write tests. We always welcome new contributors for PRs.

A.
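For reference, stepping the solver as an iterator looks roughly like this - a sketch only, assuming the current private import path (which, as noted above, is subject to change), with a trivial stand-in objective:

```python
import numpy as np
# Private API: import path and behaviour may change between scipy releases.
from scipy.optimize._differentialevolution import DifferentialEvolutionSolver

def objective(x):
    # Trivial stand-in for an expensive CFD evaluation.
    return np.sum(x ** 2)

solver = DifferentialEvolutionSolver(objective, bounds=[(-5, 5), (-5, 5)])
best_energy = None
for generation, (x, energy) in enumerate(solver):
    # Each step evolves one generation and yields the current best
    # solution and its energy; population state could be checkpointed
    # to disk here (solver.population, solver.population_energies).
    best_energy = energy
    if generation >= 10:  # stop whenever you like
        break
```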
Thanks for the response!
I can't speak for others, but I'm fine with receiving it in the [0, 1] range and then scaling it myself, to avoid the need for the stock functions to take that slight overhead hit. But whatever your preference is works for me.
On further reflection it makes sense to supply the population in the [bounds.lb, bounds.ub] range. It would be a copy of the population, so the original array isn't overwritten by the user.
Makes sense.
TBH using very expensive objective functions doesn't sound great for differential_evolution though, there's always a lot of function evaluations.
Do you have an alternative solution for a task that is prone to local minima, benefits greatly from crossing with other population members, and would benefit from custom mutation function capabilities? The sort of tasks I'm doing are pretty close to the constraints of literal physical evolution, e.g. the evolution of physical forms optimized to physical tasks. The fact that CFD is slow (esp. on certain tasks like combustion modeling :Þ ) can't be helped, except by throwing lots of compute resources at it and not overcomplicating the simulation. The upside is that perfection is not needed - there's no need to refine to fine tolerances.
w.r.t implementation, it just needs someone to do the programming and write tests. We always welcome new contributors for PRs.
On that note I tried setting up the dev environment, on a new branch. After two hours of work I got it built. But now I can neither rebuild it nor run tests:

----------
scipy]$ python dev.py test -v
💻 ninja -C /path/to/scipy/build -j6
ninja: Entering directory `/path/to/scipy/build'
[4/4] Generating scipy/generate-version with a custom command
Build OK
Task Error - build => PythonAction Error
Traceback (most recent call last):
  File "/path/to/.local/lib/python3.10/site-packages/doit/action.py", line 461, in execute
    returned_value = self.py_callable(*self.args, **kwargs)
  File "/path/to/scipy/dev.py", line 668, in run
    cls.install_project(dirs, args)
  File "/path/to/scipy/dev.py", line 533, in install_project
    raise RuntimeError("Can't install in non-empty directory: "
RuntimeError: Can't install in non-empty directory: '/path/to/scipy/build-install'
----------

Does it expect me to manually delete the build directory every time, or what? If so, that may raise a problem, because I had to build with "python dev.py build -C-Dblas=blas -C-Dlapack=lapack", but it refuses to take those -C options with "python dev.py test -v". Honestly, it's strange that "test" would insist on the build directory being nonexistent in the first place. Perhaps I'm misunderstanding something.

- kv, Karen

_______________________________________________
SciPy-Dev mailing list -- scipy-dev@python.org
To unsubscribe send an email to scipy-dev-leave@python.org
https://mail.python.org/mailman3/lists/scipy-dev.python.org/
Member address: karen.robertsdottir@gmail.com
On Tue, 22 Aug 2023 at 22:32, Karen Róbertsdóttir < karen.robertsdottir@gmail.com> wrote:
On that note I tried setting up the dev environment, on a new branch. After two hours of work I got it built. But now I can neither rebuild it nor run tests:
I'm sorry that the build is hard to do. The scipy webpage has some information on how to build, https://docs.scipy.org/doc/scipy-1.10.1/dev/dev_quickstart.html.

You shouldn't have to rebuild time after time: you should just be able to run `python dev.py test`, tweak, then `python dev.py test` again without having to do anything else. In this circumstance I might suggest removing the build and build-install directories and just running `python dev.py test` multiple times. Unless you want to specify specific BLAS libraries you shouldn't have to use any flags. If you're on Linux, you can just install system BLAS.
-- _____________________________________ Dr. Andrew Nelson _____________________________________
The scipy webpage has some information on how to build, https://docs.scipy.org/doc/scipy-1.10.1/dev/dev_quickstart.html.
Yeah, I've been using that, as well as build guides on the site.

You shouldn't have to rebuild time after time, you should just be able to run `python dev.py test`

That gives the error I posted before.

In this circumstance I might suggest removing the build and build-install directories and just running `python dev.py test` multiple times.
scipy]$ python dev.py test
💻 meson setup /path/to/scipy/build --prefix /path/to/scipy/build-install
The Meson build system
Version: 1.2.1
Source dir: /path/to/scipy
Build dir: /path/to/scipy/build
Build type: native build
Project name: SciPy
Project version: 1.12.0.dev0
C compiler for the host machine: cc (gcc 11.3.0 "cc (Homebrew GCC 11.3.0) 11.3.0")
C linker for the host machine: cc ld.bfd 2.37-38
C++ compiler for the host machine: c++ (gcc 11.3.0 "c++ (Homebrew GCC 11.3.0) 11.3.0")
C++ linker for the host machine: c++ ld.bfd 2.37-38
Cython compiler for the host machine: cython (cython 3.0.0)
Host machine cpu family: x86_64
Host machine cpu: x86_64
Program python3 found: YES (/usr/bin/python)
Found pkg-config: /usr/bin/pkg-config (1.8.0)
Run-time dependency python found: YES 3.10
Program cython found: YES (/usr/local/bin/cython)
Compiler for C supports arguments -Wno-unused-but-set-variable: YES
Compiler for C supports arguments -Wno-unused-function: YES
Compiler for C supports arguments -Wno-conversion: YES
Compiler for C supports arguments -Wno-misleading-indentation: YES
Library m found: YES
Fortran compiler for the host machine: gfortran (gcc 12.2.1 "GNU Fortran (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4)")
Fortran linker for the host machine: gfortran ld.bfd 2.37-38
Compiler for Fortran supports arguments -Wno-conversion: YES
Checking if "-Wl,--version-script" : links: YES
Program pythran found: YES (/path/to/.local/bin/pythran)
Found CMake: /usr/local/bin/cmake (3.24.1)
WARNING: CMake Toolchain: Failed to determine CMake compilers state
Run-time dependency xsimd found: NO (tried pkgconfig and cmake)
Run-time dependency threads found: YES
Library npymath found: YES
Library npyrandom found: YES
pybind11-config found: YES (/path/to/.local/bin/pybind11-config) 2.11.1
Run-time dependency pybind11 found: YES 2.11.1
Run-time dependency openblas found: NO (tried pkgconfig and cmake)
Run-time dependency openblas found: NO (tried pkgconfig and cmake)
scipy/meson.build:159:9: ERROR: Dependency "OpenBLAS" not found, tried pkgconfig and cmake
A full log can be found at /path/to/scipy/build/meson-logs/meson-log.txt
Meson build setup failed!

Hence I used the -C arguments, as suggested by one of the build guides.
Unless you want to specify specific BLAS libraries you shouldn't have to use any flags
As you can see, I do.
If you're on Linux, you can just install system BLAS.
scipy]$ rpm -qa | grep -i blas | sort
blas-3.10.1-1.fc36.x86_64
blas64-3.10.1-1.fc36.x86_64
blas64_-3.10.1-1.fc36.x86_64
blas-devel-3.10.1-1.fc36.x86_64
flexiblas-3.3.0-1.fc36.x86_64
flexiblas-netlib-3.3.0-1.fc36.x86_64
flexiblas-netlib64-3.3.0-1.fc36.x86_64
flexiblas-openblas-openmp-3.3.0-1.fc36.x86_64
flexiblas-openblas-openmp64-3.3.0-1.fc36.x86_64
libcublas-12-0-12.0.1.189-1.x86_64
libcublas-devel-12-0-12.0.1.189-1.x86_64
liblas-1.8.1-19.gitd76a061.fc36.x86_64
liblas-devel-1.8.1-19.gitd76a061.fc36.x86_64
openblas-0.3.19-3.fc36.x86_64
openblas-devel-0.3.19-3.fc36.x86_64
openblas-openmp-0.3.19-3.fc36.x86_64
openblas-openmp64-0.3.19-3.fc36.x86_64
openblas-openmp64_-0.3.19-3.fc36.x86_64
openblas-serial-0.3.19-3.fc36.x86_64
openblas-serial64-0.3.19-3.fc36.x86_64
openblas-serial64_-0.3.19-3.fc36.x86_64
openblas-srpm-macros-2-11.fc36.noarch
openblas-threads-0.3.19-3.fc36.x86_64
openblas-threads64-0.3.19-3.fc36.x86_64
openblas-threads64_-0.3.19-3.fc36.x86

- kv, Karen
Just following up on this :)
On Mon, Aug 28, 2023 at 1:37 PM Karen Róbertsdóttir < karen.robertsdottir@gmail.com> wrote:
Just following up on this :)
Hi Karen, I just tried this on latest `main`:

$ git clean -xdf
$ python dev.py build -C-Dblas=blas -C-Dlapack=lapack
$ python dev.py test

That works as advertised for me, and starts running the tests straight away. I don't quite see why you got the initial error you saw. If the above still doesn't work for you in a clean repo, can you please open an issue with the full logs and Cc me (@rgommers)? It's easier to help with build issues on an issue than on the mailing list.

Cheers,
Ralf
Thanks to everyone (incl. Andy!) who helped resolve my issues setting up the development environment. :) I now have a pull request open for this feature: https://github.com/scipy/scipy/pull/19196

- kv, Karen

On Tue, 22 Aug 2023 at 05:40, Andrew Nelson <andyfaff@gmail.com> wrote:
On Wed, Aug 16, 2023, 22:23 Karen Róbertsdóttir < karen.robertsdottir@gmail.com> wrote:
Perfectly fine! But, question:
The strategy_func would be responsible for mutating (blending) members of the population together, doing the crossover/recombination itself, and returning a trial vector with shape (N,).
So it wouldn't be told what other candidate to perform recombination with - it should pick recombination targets itself? I mean, that's workable, just being clear on this.
The strategy function would be solely responsible for creating a trial vector. It could do anything it wanted, so long as it returned a trial vector with the same shape as the problem description. Whether the strategy function is sensible would be solely at the discretion of the user. The fitness of the trial vector is determined outside the strategy function.
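To make that contract concrete, here is a minimal sketch of a user-supplied strategy under the proposed `strategy_func(candidate, population, rng=None)` signature. The signature is still only a proposal; following the earlier examples in this thread, `candidate` is assumed to be the target vector of shape (N,) and `population` the full (M, N) array. This is a hand-rolled DE/rand/1/bin that picks its own recombination partners:

```python
import numpy as np

def rand1bin_strategy(candidate, population, rng=None):
    """DE/rand/1/bin written against the proposed call signature.

    Assumes `candidate` is the current target vector, shape (N,),
    and `population` is the whole population, shape (M, N).
    """
    rng = np.random.default_rng(rng)
    F, CR = 0.8, 0.9  # mutation factor and crossover rate (illustrative values)
    # pick three distinct population members to blend
    # (a real implementation might exclude the target's own index)
    r0, r1, r2 = rng.choice(len(population), size=3, replace=False)
    mutant = population[r0] + F * (population[r1] - population[r2])
    # binomial crossover: take mutant genes with probability CR,
    # forcing at least one mutant gene to survive
    cross = rng.random(len(candidate)) < CR
    cross[rng.integers(len(candidate))] = True
    return np.where(cross, mutant, candidate)
```

The function alone decides which members it recombines, matching the "picks recombination targets itself" reading above.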
I can't speak for others, but I'm fine with receiving it in the [0, 1] range and then scaling it myself, to avoid the need for the stock functions to take that slight overhead hit. But whatever your preference is works for me.
On further reflection it makes sense to supply the population in the [bounds.lb, bounds.ub] range. It would be a copy of the population, so the original array wouldn't be overwritten by the user.
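For reference, the to/from scaling under discussion is just an affine map per parameter; a small sketch with hypothetical helper names:

```python
import numpy as np

def to_actual(unit_pop, lb, ub):
    # map population values from [0, 1] to [lb, ub], elementwise per parameter
    return lb + unit_pop * (ub - lb)

def to_unit(actual_pop, lb, ub):
    # inverse map, from [lb, ub] back to [0, 1]
    return (actual_pop - lb) / (ub - lb)
```

Both directions are cheap elementwise operations, which is why the round trip adds only slight overhead.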
(As a side note, SciPy's inability to save and resume the population during differential_evolution optimization used to be really annoying, given how long CFD optimization tasks take. However, I did find a kind of cheap hack that I've been using ever since: since the random number generator is deterministic, I simply have the minimization function create a hash value for the candidate and store the results of the simulation in a hash table, which I save to disk. When I need to resume, I just load up the hash table, and if a candidate has been encountered before, the function immediately returns the previous run's simulation results rather than re-running the simulation. It's an awkward hack, and it wouldn't work on tasks where the minimization function is really fast, but for slow tasks like CFD it works. :) )
If you use the DifferentialEvolutionSolver (warning: it's private and subject to change), then you can step the solver very easily; it's an iterator. For a single run-through it should be very rare that the fitness of a given vector is evaluated twice: only trial vectors are evaluated, and they're always novel. If you want to stop/restart then I understand the need to cache. TBH, using very expensive objective functions doesn't sound great for differential_evolution though; there are always a lot of function evaluations.
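For illustration, stepping the solver looks roughly like this. Note that `DifferentialEvolutionSolver` lives in a private module (`scipy.optimize._differentialevolution`), so its location and behaviour may change between releases:

```python
import numpy as np
from scipy.optimize._differentialevolution import DifferentialEvolutionSolver

# Each `next()` call evolves one generation and returns the current
# best vector and its energy, so you can stop or checkpoint anywhere.
with DifferentialEvolutionSolver(lambda x: float(np.sum(x**2)),
                                 bounds=[(-5, 5), (-5, 5)],
                                 seed=1) as solver:
    for generation in range(10):
        x, energy = next(solver)
        # a checkpoint of solver.population / solver.population_energies
        # could be saved to disk here
        if energy < 1e-6:
            break
```

Being able to persist the population between steps is exactly what the hash-table workaround above approximates from the outside.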
Custom strategy_funcs can of course be used for things that have nothing to do with genes. For example, sometimes (as the docs note) a user may want part of their candidates' data to be interpreted as integer data. How do you mutate or crossbreed integers and have them make sense? That's really going to be task-dependent. Maybe the integer means "number of iterations"; if so, then perhaps simple interpolation is best. But maybe it's a category, in which case interpolation is incoherent, and you should either keep it the same or randomly pick a new category. And if it is a category, and that category influences some other values in the candidate, that may affect how you want to alter those values. Maybe if round(candidate[0]) gives Category == 3, you want the floating point value at candidate[1] to be between 1.0 and 10.0, but if it's Category == 5 then maybe you want candidate[1] to be between 1.0 and 5.0. Again, it's task-dependent.
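A sketch of that category-dependent case (the problem layout, the `RANGES` table, and the probabilities are all hypothetical, not scipy API): `candidate[0]` holds a category index and `candidate[1]` a value whose sensible range depends on it.

```python
import numpy as np

# hypothetical: valid range of candidate[1] for each category in candidate[0]
RANGES = {3: (1.0, 10.0), 5: (1.0, 5.0)}

def categorical_strategy(candidate, population, rng=None):
    rng = np.random.default_rng(rng)
    trial = candidate.copy()
    # categories don't interpolate: keep the current one, or occasionally
    # jump to the category of a randomly chosen population member
    if rng.random() < 0.2:
        donor = population[rng.integers(len(population))]
        trial[0] = donor[0]
    # perturb the dependent value, then clip it to the range its category implies
    lo, hi = RANGES.get(int(round(trial[0])), (1.0, 10.0))
    trial[1] = np.clip(trial[1] + rng.normal(scale=0.5), lo, hi)
    return trial
```

The point is only that the strategy can encode whatever coupling between entries the task demands.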
`differential_evolution` already has an integrality keyword.
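For the plain integer case the existing keyword already covers it; a minimal usage sketch (requires SciPy >= 1.9, where `integrality` was added):

```python
import numpy as np
from scipy.optimize import differential_evolution

def func(x):
    # optimum at x0 = 2.3 (continuous) and x1 = 3 (forced to be integral)
    return (x[0] - 2.3) ** 2 + (x[1] - 3.0) ** 2

res = differential_evolution(func,
                             bounds=[(0, 10), (0, 10)],
                             integrality=[False, True],  # x1 is kept integral
                             seed=1)
print(res.x)  # x1 lands on an integer
```

Categorical parameters with coupled ranges, as in the example above, are where `integrality` alone stops being enough and a custom strategy earns its keep.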
w.r.t implementation, it just needs someone to do the programming and write tests. We always welcome new contributors for PRs.
A.
participants (3)
- Andrew Nelson
- Karen Róbertsdóttir
- Ralf Gommers