[SciPy-User] tool for running simulations
Dan Goodman
dg.gmane at thesamovar.net
Fri Jun 17 14:38:38 EDT 2011
Hi all,
I have an idea for a tool that might be useful for people writing and
running scientific simulations. It might be that something like this
already exists; if so, and anyone has good suggestions, please let me
know! Otherwise, I might have a go at writing something, in which case
any feature requests or useful advice would be great.
Basically, the situation I keep finding myself in, and I assume many
others do too, is that I have some rather complicated code to set up
and run a simulation (in my case, computational neuroscience
simulations). I typically want to run each simulation many times,
possibly with different parameters, and then do some averaging or more
complicated analysis at the end. These simulations usually take
anywhere from an hour to a week to run, depending on what I'm doing and
assuming I'm using multiple computers/CPUs. The issue is that I want to be able to
run my code on several computers at once, and have the results available
on all the computers. I've been coming up with all sorts of annoying
ways to do this, for example having each computer generate one file with
a unique name, and then merging them afterwards - but this is quite tedious.
What I imagine is a tool that does something like this:
* Run a server process on each of several computers that controls file
access (this avoids any issues with contention). One computer is the
master, and if the other ones want to read or write a file, the request
goes through the master. Some files might want to be cached/mirrored on
each computer for faster access (typically read-only files in my case).
There's a rough sketch of the master idea just after this list.
* Use a nice file format like HDF5 that allows fast access, lets you
store metadata along with your data, and has good tools for browsing
it. This is important because, as you change your simulation code, you
might want to weed out some old data based on the metadata without
having to recompute everything, etc.
* Allow you to store multiple data entries (something like tables in
HDF5, I guess) and then select specific ones out for analysis (there's
a sketch of this after the list too).
* Allow function caching. For example, I often have a computation that
takes about 10 minutes for each set of parameter values and is then
used in several simulations. I'd like these results to be cached
automatically (maybe based on a hash of the arguments to the function).
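To give an idea of what I mean by the first point, here is a rough
sketch of the master process using nothing but the standard library
(multiprocessing.managers). The host name, port and authkey are just
placeholders, and a real tool would put files or HDF5 nodes behind the
manager rather than a plain dict:

from multiprocessing.managers import BaseManager

results = {}                        # lives only on the master machine

class ResultManager(BaseManager):
    pass

def run_master(port=50000, authkey=b'simulation'):
    # Expose the dict over the network; every worker gets a proxy to
    # this one object, so there is no file contention to worry about.
    ResultManager.register('get_results', callable=lambda: results)
    manager = ResultManager(address=('', port), authkey=authkey)
    manager.get_server().serve_forever()    # run once, on the master

def connect_worker(master_host, port=50000, authkey=b'simulation'):
    # On a worker machine: connect and get a proxy to the master's dict.
    # (master_host, the port and the authkey are placeholders here.)
    ResultManager.register('get_results')
    manager = ResultManager(address=(master_host, port), authkey=authkey)
    manager.connect()
    return manager.get_results()

A worker would then do something like
store = connect_worker('master.example.org') and use store.update(...)
and store.get(...) to write and read results from any machine.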
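For the HDF5 side, the sort of thing I have in mind (using h5py here,
but PyTables would do just as well) is storing each run with its
metadata as attributes, and selecting entries by metadata later. The
file name, group layout and metadata keys are only examples:

import h5py

def store_run(filename, run_name, result, **metadata):
    # Append one simulation result plus its metadata to the HDF5 file.
    with h5py.File(filename, 'a') as f:
        runs = f.require_group('runs')
        dset = runs.create_dataset(run_name, data=result)
        for key, value in metadata.items():
            dset.attrs[key] = value     # metadata travels with the data

def select_runs(filename, **criteria):
    # Yield (name, array) for every stored run whose metadata matches.
    with h5py.File(filename, 'r') as f:
        if 'runs' not in f:
            return
        for name, dset in f['runs'].items():
            if all(dset.attrs.get(k) == v for k, v in criteria.items()):
                yield name, dset[...]

So you could call store_run('results.h5', 'run_0001', rates,
code_version='1.2', tau_m=0.02) on one machine, and later use
select_runs('results.h5', code_version='1.2') to pull out only the runs
from the current version of the code, without recomputing anything.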
As far as I can tell, there are tools to do each of the things above,
but nothing that combines them all simply. For example, there are
lots of tools for distributed filesystems, for HDF5 and for function
value caching, but is there something that, when you call a function
with some particular values, creates a hash, checks a distributed HDF5
store for that hash, and then either returns the cached value or
computes it and stores it in the HDF5 file with the relevant metadata
(maybe the values of the arguments and not just the hash)?
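To make that concrete, the kind of thing I mean is a decorator roughly
like this (the names are made up, it assumes the results are arrays and
that the arguments pickle deterministically, and the distributed part
would sit behind the file access):

import functools
import hashlib
import pickle
import h5py

def cached_to_hdf5(filename):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Hash the function name and its arguments to get a cache
            # key (assumes the arguments pickle deterministically).
            raw = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            key = hashlib.sha1(raw).hexdigest()
            with h5py.File(filename, 'a') as f:
                cache = f.require_group('cache')
                if key in cache:
                    return cache[key][...]        # cache hit: reuse it
                result = func(*args, **kwargs)    # cache miss: compute...
                dset = cache.create_dataset(key, data=result)
                # ...and record the arguments too, not just the hash
                dset.attrs['function'] = func.__name__
                dset.attrs['args'] = repr(args)
                dset.attrs['kwargs'] = repr(kwargs)
                return result
        return wrapper
    return decorator

A ten-minute parameter computation then just gets decorated with
@cached_to_hdf5('results.h5'), and any machine that can see results.h5
gets the stored value back instead of recomputing it.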
Since all the tools are basically already there, I don't think this
should take too long to write (maybe just a few days), but it could be
useful for lots of people, because at the moment it requires mastering
quite a few different tools and writing code to glue them together. The
key thing is to choose the best tools for the job and take the right
approach, so any ideas for that? Or maybe it's already been done?
Dan