This is in reply to Matt's e-mail from 3 weeks ago (I only just realised
I forgot to hit "confirm" on the yt-dev mailing list signup).
I guess one solution to the problem would be to abstract what a "grid"
is (I'm guessing a grid is a container for a geometrically consistent
chunk of the entire simulation volume?) and then allow it to answer
queries about its geometric properties itself. So, for example, you
could ask it "myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the
trick is to figure out a flexible but simple interface for this,
depending on how well you know the requirements for what the grid
should be able to do.
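
As a rough sketch of what I mean by a grid that answers geometric
queries itself (every name here is made up for illustration, none of it
is existing yt API, and the sphere/box test is just one example of a
"weird geometric construct"):

```python
from abc import ABC, abstractmethod


class Sphere:
    """Hypothetical region: a sphere given by its centre and radius."""

    def __init__(self, center, radius):
        self.center = tuple(center)
        self.radius = float(radius)

    def overlaps_box(self, left, right):
        # Squared distance from the centre to the nearest point of the box.
        d2 = 0.0
        for c, lo, hi in zip(self.center, left, right):
            nearest = min(max(c, lo), hi)
            d2 += (c - nearest) ** 2
        return d2 <= self.radius ** 2


class Grid(ABC):
    """Hypothetical abstract grid: it answers geometric queries itself."""

    @abstractmethod
    def is_in_region(self, region):
        """Return True if any part of this grid may lie inside `region`."""


class BoxGrid(Grid):
    """Axis-aligned chunk of the volume, e.g. a patch from a patch-based code."""

    def __init__(self, left_edge, right_edge):
        self.left_edge = tuple(left_edge)
        self.right_edge = tuple(right_edge)

    def is_in_region(self, region):
        # The grid delegates to the region, which knows its own shape.
        return region.overlaps_box(self.left_edge, self.right_edge)


grids = [BoxGrid((0, 0, 0), (1, 1, 1)), BoxGrid((4, 4, 4), (5, 5, 5))]
region = Sphere(center=(0.5, 0.5, 0.5), radius=1.0)
selected = [g for g in grids if g.is_in_region(region)]
print(len(selected))
```

A code with a different layout (octree, particles) would supply its own
Grid subclass, and the calling code would not need to change.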
In general, I think this is the ideal situation because, as Matt says,
hammering every code into the same structure in memory creates
slowdowns. One possibility is to create a few template memory
structures, etc., to allow people to bolt together new implementations
for each code.
In terms of choosing algorithms for different types of fluid blob (e.g.
one for particles, one for grids), this can be done using functionoids
for the algorithms (or at least functionoid wrappers) and then a
functionoid factory for spawning the correct functionoid to use with the
container. You'd have to wrap all this up in a simple interface again,
otherwise it'd be impossible to use.
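
To make the functionoid idea a bit more concrete, here is a minimal
sketch. The containers are plain dicts with a made-up "kind" key, and
the total-mass calculation is only a stand-in algorithm; none of these
names come from yt.

```python
class GridTotalMass:
    """Functionoid for grid-like blobs: mass = density * cell volume."""

    def __call__(self, container):
        return sum(
            rho * vol
            for rho, vol in zip(container["density"], container["cell_volume"])
        )


class ParticleTotalMass:
    """Functionoid for particle-like blobs: each particle carries a mass."""

    def __call__(self, container):
        return sum(container["mass"])


# Hypothetical registry mapping a container kind to its functionoid class.
_TOTAL_MASS = {"grid": GridTotalMass, "particles": ParticleTotalMass}


def make_total_mass(container):
    """Factory: spawn the functionoid that matches the container."""
    try:
        return _TOTAL_MASS[container["kind"]]()
    except KeyError:
        raise ValueError("no total-mass algorithm for this container") from None


def total_mass(container):
    """The simple interface: callers never see the factory or functionoids."""
    return make_total_mass(container)(container)


grid = {"kind": "grid", "density": [1.0, 2.0], "cell_volume": [0.5, 0.5]}
particles = {"kind": "particles", "mass": [1.0, 2.0, 3.0]}
print(total_mass(grid), total_mass(particles))
```

Supporting a new type of fluid blob would then mean writing one more
functionoid and adding it to the registry.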
I also suggested to Matt that we create a "fluid blob" iterator that
works for all types of fluid blob (SPH particle, octree grid cell,
Voronoi tessellation cell), but this might be very slow in Python. That
said, iterating over "grid"s as chunks of the AMR grid instead is a
possibility. Having some kind of iterator option might be good, though,
as doing things like tracking particles through different snapshots is
something I've been doing extensively in my (pre-YT) work.
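
A minimal sketch of such an iterator, assuming (purely for
illustration) that containers are dicts with a "kind" key and that a
blob only needs a position and a mass:

```python
from collections import namedtuple

# Hypothetical common view of one fluid element, whatever its origin.
Blob = namedtuple("Blob", ["position", "mass"])


def iter_blobs(container):
    """Yield one Blob per fluid element in the container."""
    kind = container["kind"]
    if kind == "particles":
        for pos, mass in zip(container["position"], container["mass"]):
            yield Blob(pos, mass)
    elif kind == "cells":
        # Cells store density, so convert to mass using the cell volume.
        for center, rho, vol in zip(
            container["center"], container["density"], container["volume"]
        ):
            yield Blob(center, rho * vol)
    else:
        raise ValueError("unknown container kind")


particles = {
    "kind": "particles",
    "position": [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    "mass": [1.0, 2.0],
}
cells = {
    "kind": "cells",
    "center": [(0.5, 0.5, 0.5)],
    "density": [4.0],
    "volume": [0.5],
}

total = sum(b.mass for c in (particles, cells) for b in iter_blobs(c))
print(total)
```

The per-element Python loop is where the slowness I mentioned would
come from; yielding whole chunks instead of single blobs keeps the same
interface with far fewer Python-level iterations.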
I don't know how much of this is already known; my domain is Ramses,
which is still very slow to use with my dataset (although Matthew has
been very helpful in working on the Ramses side of things). I thus
haven't looked too much at YT yet as it's still prohibitively slow to
load my dataset and play with it.

On Tue, Jun 7, 2011 at 16:15 AM, Matthew Turk <matthewturk(a)gmail.com> wrote:
This is a portion of a conversation Sam Geen and I had off-list about
where to make changes and how to insert abstractions to allow for
generalized geometric reading of data; this would be useful for octree
codes, particle codes, and non-rectilinear geometry. We decided to
"replay" the conversation on the mailing list to allow people to
contribute their ideas and thoughts. I spent a bit of time last night
looking at the geometry usage in yt.
Right now I see a few places this will need to be fixed:
* Data sources operate on the idea that grids act as a pre-selection
for cells. If we get the creation of grids -- without including any
cell data inside them -- to be fast enough, this will not necessarily
need to be changed. (i.e., apply a 'regridding' step of empty grids.)
However, failing that, this will need to be abstracted into geometric
selection. For cylindrical coordinates this will need to be
abstracted anyway. The idea is that once you know which grids you
want, you read them from disk, and then mask out the points that are
* The IO is currently set up -- in parallel -- to read in chunks.
Usually in parallel patch-based simulations, multiple grid patches are
stored in a single file on disk. So, these get chunked in IO to avoid
too many fopen/seek/fclose operations (and the analogues in HDF5).
This will need to be rethought. Obviously, there are still some
analogues; however, it's not clear how -- without the actual
re-gridding operation -- to keep the geometry selection and the IO
separate. I would prefer to try to do this as much as possible. I
think it's do-able, but I don't yet have a good strategy for it.
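
The two points above (grids as a pre-selection for cells, and IO
chunked by file) can be sketched together as follows. This is only an
illustration under stated assumptions: the grid dicts, the selector
callables and the fake in-memory "disk" are all hypothetical, and this
is not how yt's IO layer is actually written.

```python
from collections import defaultdict


def group_by_file(grids):
    """Map filename -> grids stored in it, so each file is opened once."""
    chunks = defaultdict(list)
    for g in grids:
        chunks[g["filename"]].append(g)
    return dict(chunks)


def read_selected(grids, grid_selector, point_selector, read_file):
    """Select grids first, read them file by file, then mask the points."""
    wanted = [g for g in grids if grid_selector(g)]
    points = []
    for filename, chunk in group_by_file(wanted).items():
        data = read_file(filename, [g["id"] for g in chunk])  # one open per file
        for g in chunk:
            points.extend(p for p in data[g["id"]] if point_selector(p))
    return points


# Fake disk: two grids live in the first file, one in the second.
FAKE_DISK = {
    "cpu0000.h5": {0: [(0.1, 0.1), (0.4, 0.4)], 1: [(0.6, 0.1), (0.9, 0.4)]},
    "cpu0001.h5": {2: [(0.1, 0.6), (0.4, 0.9)]},
}
opened = []


def fake_read_file(filename, grid_ids):
    opened.append(filename)  # record each "open" so we can count them
    return {gid: FAKE_DISK[filename][gid] for gid in grid_ids}


grids = [
    {"id": 0, "filename": "cpu0000.h5", "left": (0.0, 0.0)},
    {"id": 1, "filename": "cpu0000.h5", "left": (0.5, 0.0)},
    {"id": 2, "filename": "cpu0001.h5", "left": (0.0, 0.5)},
]

# Region of interest: x < 0.7 and y < 0.5.
points = read_selected(
    grids,
    grid_selector=lambda g: g["left"][0] < 0.7 and g["left"][1] < 0.5,
    point_selector=lambda p: p[0] < 0.7 and p[1] < 0.5,
    read_file=fake_read_file,
)
print(points, opened)
```

Here the geometry lives entirely in the two selector callables, and the
IO only ever sees filenames and grid ids, which is the separation I
would like to keep.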
My current feeling is that the re-gridding may be a slightly
necessary evil *at the moment*, but only for guiding the point
selection. It has now been re-written to be based on Hilbert
curve locating, so each grid has a unique index in L-8 or something
I believe that geometry and chunking of IO are the only issues at this
time. One possibility would actually be to move away from the idea of
grids and instead think in terms of 'Hilbert chunks'. These would then
be the items that would be selected, read from disk, and mapped. This
might fit better with the Ramses method.
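
For illustration only, here is a small 2D sketch of ordering cells along
a Hilbert curve and cutting that ordering into chunks. It uses the
standard textbook mapping from (x, y) to curve position; the function
names, the 2D restriction and the fixed chunk size are assumptions for
the example and say nothing about the actual implementation.

```python
def hilbert_index_2d(n, x, y):
    """Position of cell (x, y) along a 2D Hilbert curve.

    `n` is the grid size per side and must be a power of two.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_chunks(cells, n, chunk_size):
    """Sort cells along the curve and cut the ordering into chunks."""
    ordered = sorted(cells, key=lambda c: hilbert_index_2d(n, c[0], c[1]))
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


n = 4
cells = [(x, y) for x in range(n) for y in range(n)]
chunks = hilbert_chunks(cells, n, chunk_size=4)
for chunk in chunks:
    print(chunk)
```

Because neighbouring positions on the curve are neighbouring cells in
space, each chunk is a spatially compact set of cells, which is what
makes it a plausible unit for selection and IO.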
What do you think?