Re: [Yt-dev] Geometry, RAMSES and non-patch datasets
Hi,
This is in reply to Matt's e-mail from 3 weeks ago (I only just realised
I forgot to hit "confirm" on the yt-dev mailing list signup).
I guess one solution to the problem would be to abstract what a "grid"
is (I'm guessing a grid is a container for a geometrically consistent
chunk of the entire simulation volume?) Then allow it to answer queries
about its geometric properties itself. So for example, ask it
"myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the trick is to
figure out a flexible but simple interface for this, depending on how
well you know the requirements for what the grid should be able to do.
In general, I think this is the ideal situation, because as Matt says
hammering every code into the same structure in memory creates
slowdowns. One possibility is to create a few template memory
structures, etc, to allow people to bolt together new implementations
for each code.
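To make the interface idea concrete, here's a minimal sketch of what I mean -- every class and method name here is invented for illustration, not anything that exists in yt:

```python
# Hypothetical sketch of a grid that answers geometric queries itself.
# None of these names come from yt; they are illustrative only.

class Region:
    """A geometric construct, e.g. a sphere."""
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

class Grid:
    """A container for a contiguous chunk of the simulation volume."""
    def __init__(self, left_edge, right_edge):
        self.left_edge = left_edge    # e.g. (x0, y0, z0)
        self.right_edge = right_edge  # e.g. (x1, y1, z1)

    def is_in_region(self, region):
        # Conservative test: does the sphere's bounding box overlap ours?
        return all(l <= c + region.radius and r >= c - region.radius
                   for l, r, c in zip(self.left_edge, self.right_edge,
                                      region.center))

grid = Grid((0.0, 0.0, 0.0), (0.5, 0.5, 0.5))
sphere = Region(center=(0.25, 0.25, 0.25), radius=0.1)
print(grid.is_in_region(sphere))  # True
```

The point is just that the grid owns the geometric test, so different codes can supply different implementations behind the same call.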
In terms of choosing algorithms for different types of fluid blob (e.g.
one for particles, one for grids), this can be done using functionoids
for the algorithms (or at least functionoid wrappers) and then a
functionoid factory for spawning the correct functionoid to use with the
container. You'd have to wrap all this up in a simple interface again,
otherwise it'd be impossible to use.
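As a rough sketch of the functionoid-plus-factory idea (again, every name here is invented for illustration):

```python
# Sketch of functionoid-style dispatch: one callable object per data type,
# plus a factory that spawns the right one for a given container kind.
# All names are invented for illustration.

class AverageDensityOnGrids:
    def __call__(self, grids):
        cells = [c for g in grids for c in g]  # flatten cell values
        return sum(cells) / len(cells)

class AverageDensityOnParticles:
    def __call__(self, particles):
        return sum(particles) / len(particles)

ALGORITHMS = {
    ("average_density", "grid"): AverageDensityOnGrids,
    ("average_density", "particle"): AverageDensityOnParticles,
}

def make_functionoid(name, container_kind):
    """Factory: spawn the correct functionoid for this container type."""
    return ALGORITHMS[(name, container_kind)]()

avg = make_functionoid("average_density", "grid")
print(avg([[1.0, 2.0], [3.0]]))  # 2.0
```

The simple-interface wrapper would then hide the factory call entirely from the user.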
I also suggested to Matt to create a "fluid blob" iterator that works
for all types of fluid blob (SPH particle, octree grid cell, voronoi
tessellation cell) but this might be very slow in Python. That said,
iterating over "grid"s as chunks of the amr grid instead is a
possibility. Having some kind of iterator option might be good, though,
as doing things like tracking particles through different snapshots is
something I've been doing extensively in my (pre-YT) work.
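A minimal sketch of what such an iterator could look like, assuming each chunk of the dataset is tagged with its blob type (everything here is invented for illustration):

```python
# Sketch of a "fluid blob" iterator that presents SPH particles and AMR
# grid cells through one uniform interface. Names are illustrative only.

def iter_blobs(dataset):
    """Yield (position, mass) for every fluid element, whatever its type."""
    for kind, data in dataset:
        if kind == "particles":
            for pos, mass in data:
                yield pos, mass
        elif kind == "grid":
            cell_volume, cells = data
            for pos, density in cells:
                yield pos, density * cell_volume

dataset = [
    ("particles", [((0.1, 0.2, 0.3), 1.0)]),
    ("grid", (0.5, [((0.5, 0.5, 0.5), 4.0)])),
]
total_mass = sum(m for _, m in iter_blobs(dataset))
print(total_mass)  # 3.0
```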
I don't know how much of this is already known; my domain is Ramses,
which is still very slow to use with my dataset (although Matthew has
been very helpful in working on the Ramses side of things). I thus
haven't looked too much at YT yet as it's still prohibitively slow to
load my dataset and play with it.
Cheers,
Sam
On Tue, Jun 7, 2011 at 16:15, Matthew Turk wrote:
Hi Sam,
On Tue, Jun 21, 2011 at 8:07 AM, Sam Geen wrote:
Hi,
This is in reply to Matt's e-mail from 3 weeks ago (I only just realised I forgot to hit "confirm" on the yt-dev mailing list signup).
No worries! :)
I guess one solution to the problem would be to abstract what a "grid" is (I'm guessing a grid is a container for a geometrically consistent chunk of the entire simulation volume?) Then allow it to answer queries about its geometric properties itself. So for example, ask it "myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the trick is to figure out a flexible but simple interface for this, depending on how well you know the requirements for what the grid should be able to do. In general, I think this is the ideal situation, because as Matt says hammering every code into the same structure in memory creates slowdowns. One possibility is to create a few template memory structures, etc, to allow people to bolt together new implementations for each code.
A grid is indeed a container for a chunk of the simulation -- typically in patch-based AMR codes, these will be some (hopefully large but not too large) contiguous region. This enables numpy to take over, as it helps batch mathematical operations -- for instance, for an operation like:

field1*field2

the startup cost of parsing, identifying the object as a buffer of contiguous data, identifying the types, dispatching the correct function, and then allocating and returning a new buffer is weighed against the actual operation of multiplication. Batching operations over grids reduces the ratio of startup cost to operation cost.

Right now, the mechanism for geometric constructs is inverted from how you describe -- when describing a sphere, for instance, the operation is:

* Query the Hierarchy object (which I would support renaming to 'mesh' or 'geometry' in a future iteration of the code, likely 3.0) to identify grids that intersect the geometric region. This is accomplished through a "geometry mixin" that supplies various routines to do this.
* Query each intersecting grid's x, y, z values (for each cell) to identify which cells intersect the region.
* Return these values to the routine/user that requested them.

I think that this is compatible with what you have outlined, in general. One issue I had hoped to address was reducing the interaction between IO and geometry as much as possible, simply because IO routines are usually compiled code, whereas ideally I would like the geometry to be performed in Python. (As it stands, it's usually done with operations on bounding values of grids to find intersections.)
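The sphere-selection flow Matt describes could be sketched roughly like this with numpy masking -- the classes here are stand-ins for illustration, not yt's actual ones:

```python
import numpy as np

# Rough sketch of the selection flow: 1) ask which grids intersect the
# sphere via a cheap bounding-box test, 2) mask cells per grid using the
# grid's x, y, z arrays, 3) the caller reads out the masked values.

class Grid:
    def __init__(self, left_edge, right_edge, n):
        self.left_edge = np.array(left_edge)
        self.right_edge = np.array(right_edge)
        # cell-center coordinates on an n^3 lattice
        axes = [np.linspace(l, r, n, endpoint=False) + (r - l) / (2 * n)
                for l, r in zip(left_edge, right_edge)]
        self.x, self.y, self.z = np.meshgrid(*axes, indexing="ij")

def grids_intersecting_sphere(grids, center, radius):
    # Step 1: bounding-box test against each grid's edges.
    c = np.array(center)
    return [g for g in grids
            if np.all(g.left_edge <= c + radius)
            and np.all(g.right_edge >= c - radius)]

def select_cells(grid, center, radius):
    # Step 2: per-cell test using the grid's x, y, z values.
    r2 = ((grid.x - center[0]) ** 2 + (grid.y - center[1]) ** 2
          + (grid.z - center[2]) ** 2)
    return r2 <= radius ** 2  # boolean mask over the grid's cells

grids = [Grid((0, 0, 0), (0.5, 0.5, 0.5), 8),
         Grid((0.5, 0, 0), (1.0, 0.5, 0.5), 8)]
hits = grids_intersecting_sphere(grids, (0.25, 0.25, 0.25), 0.1)
mask = select_cells(hits[0], (0.25, 0.25, 0.25), 0.1)
```

Note the bounding-box step touches no cell data at all, which is what keeps it cheap enough to do in Python.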
In terms of choosing algorithms for different types of fluid blob (e.g. one for particles, one for grids), this can be done using functionoids for the algorithms (or at least functionoid wrappers) and then a functionoid factory for spawning the correct functionoid to use with the container. You'd have to wrap all this up in a simple interface again, otherwise it'd be impossible to use.
I also suggested to Matt to create a "fluid blob" iterator that works for all types of fluid blob (SPH particle, octree grid cell, voronoi tessellation cell) but this might be very slow in Python. That said, iterating over "grid"s as chunks of the amr grid instead is a possibility. Having some kind of iterator option might be good, though, as doing things like tracking particles through different snapshots is something I've been doing extensively in my (pre-YT) work.
A generalized fluid blob iterator would be interesting; I think grids could be placed into it. By extending the geometry mixin to work with different methods, this could be feasible. I wonder if perhaps rethinking the idea of static geometries (determined at instantiation) would assist with addressing SPH data. I am inclined to think this would be a way forward. In looking over the code, it's not clear to me that there are many places that grids are assumed except in the projections and the first pass of data selection. (Projections as we do them now might port nicely to SPH, but it's not yet clear to me.)
I don't know how much of this is already known; my domain is Ramses, which is still very slow to use with my dataset (although Matthew has been very helpful in working on the Ramses side of things). I thus haven't looked too much at YT yet as it's still prohibitively slow to load my dataset and play with it.
I did manage to squeeze out what for me was an order-of-magnitude improvement in RAMSES data instantiation, but I confess it is still slow, and there are other issues with it. Right now Casey and I are refactoring fields, and I have set up a testing infrastructure, so I am feeling a bit more inclined to try more invasive changes after branching into 2.3 and 3.0 branches sometime later this summer.

Perhaps moving to a generalized geometry, into which the standard patch/block AMR "hierarchy" paradigm would fit, would meet the necessary needs to do generalized fluid operations...

-Matt
Cheers,
Sam
On Tue, Jun 7, 2011 at 16:15, Matthew Turk wrote:
Hi all,
This is a portion of a conversation Sam Geen and I had off-list about where to make changes and how to insert abstractions to allow for generalized geometric reading of data; this would be useful for octree codes, particles codes, and non-rectilinear geometry. We decided to "replay" the conversation on the mailing list to allow people to contribute their ideas and thoughts. I spent a bit of time last night looking at the geometry usage in yt.
Right now I see a few places this will need to be fixed:
* Data sources operate on the idea that grids act as a pre-selection for cells. If we get the creation of grids -- without including any cell data inside them -- to be fast enough, this will not necessarily need to be changed (i.e., apply a 'regridding' step of empty grids). However, failing that, this will need to be abstracted into geometric selection. For cylindrical coordinates this will need to be abstracted anyway. The idea is that once you know which grids you want, you read them from disk, and then mask out the points that are not necessary.
* The IO is currently set up -- in parallel -- to read in chunks. Usually in parallel patch-based simulations, multiple grid patches are stored in a single file on disk. So, these get chunked in IO to avoid too many fopen/seek/fclose operations (and the analogues in hdf5). This will need to be rethought. Obviously, there are still some analogues; however, it's not clear how -- without the actual re-gridding operation -- to keep the geometry selection and the IO separate. I would prefer to try to do this as much as possible. I think it's do-able, but I don't yet have a good strategy for it.
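The chunked-read idea above amounts to grouping requested grids by the file that holds them, so each file is opened once. A minimal sketch, with a toy backend standing in for real disk IO (all names invented):

```python
# Sketch of IO chunking: group requested grids by file, open each file
# once, and read the grids within it in offset order. Illustrative only.

from collections import defaultdict

def read_grids_chunked(grids, open_file):
    """grids: iterable of (filename, offset) pairs identifying each grid.
    open_file: callable returning an object with a read_at(offset) method."""
    by_file = defaultdict(list)
    for filename, offset in grids:
        by_file[filename].append(offset)

    data = {}
    for filename, offsets in by_file.items():
        f = open_file(filename)       # one open per file, not per grid
        for off in sorted(offsets):   # sorted to keep seeks sequential
            data[(filename, off)] = f.read_at(off)
    return data

# Toy backend standing in for real disk IO:
class FakeFile:
    def __init__(self, name):
        self.name = name
    def read_at(self, off):
        return f"{self.name}@{off}"

opens = []
def open_file(name):
    opens.append(name)
    return FakeFile(name)

data = read_grids_chunked([("a.dat", 10), ("b.dat", 0), ("a.dat", 5)], open_file)
print(len(opens))  # 2 -- two files opened, despite three grid reads
```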
My current feeling now is that the re-gridding may be a slightly necessary evil *at the moment*, but only for guiding the point selection. It's currently been re-written to be based on Hilbert curve locating, so each grid has a unique index in L-8 or something space.

I believe that geometry and chunking of IO are the only issues at this time. One possibility would actually be to move away from the idea of grids and instead use 'Hilbert chunks'. These would be the items that would be selected, read from disk, and mapped. This might fit more nicely with the RAMSES method.
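A full Hilbert key is too long to write out here; as a simpler stand-in, a Morton (Z-order) key illustrates the same idea of mapping each grid's position to a single sortable index that the chunking could key on (a true Hilbert key has better locality, but the shape of the scheme is the same):

```python
# Stand-in sketch for space-filling-curve chunking, using a Morton
# (Z-order) key instead of a true Hilbert key. Illustrative only.

def morton_key(ix, iy, iz, bits=8):
    """Interleave the bits of integer grid coordinates into one index."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key

# Grids sorted by curve index land near their spatial neighbours,
# so contiguous index ranges make natural IO chunks.
grids = [(3, 1, 0), (0, 0, 0), (1, 1, 1), (2, 0, 0)]
chunk_order = sorted(grids, key=lambda g: morton_key(*g))
print(chunk_order[0])  # (0, 0, 0) -- the origin has the smallest key
```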
What do you think?
Best,
Matt
_______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
OK, cool. So is there anything that I can do to help with the Ramses instantiation? I could just play about with implementing Pymses with the current I/O interface, but if you're already doing this then I'll wait to see what you come up with. Also, if the I/O interface is changing soon then I might hold off.

Actually, if there's any design documentation not in the code paper or not obviously on the website floating around, then it might be useful to have a look at this. Also, another vaguely related query - how much can Python talk to compiled code (C++, etc)? To what extent can you pass Python objects to C++ routines, assuming you know what the public methods are?

Sam

Matthew Turk wrote:
Hi Sam,
On Wed, Jun 22, 2011 at 6:11 AM, Sam Geen wrote:
OK, cool. So is there anything that I can do to help with the Ramses instantiation? I could just play about with implementing Pymses with the current I/O interface, but if you're already doing this then I'll wait to see what you come up with. Also, if the I/O interface is changing soon then I might hold off.
There's definitely stuff you can do to help out; if you want to give a shot at profiling, we can figure out some of the places it might still be struggling. For instance, if you have a file that just includes:

import cProfile
pf = load("output_00007/info_00007.txt")
cProfile.run("pf.h.print_stats()", "output.cprof")

it should create an output.cprof that you can visualize either using the pstats module or pyprof2html ("pip install pyprof2html", although you may have to also pip install jinja) to see where the bottlenecks are. My guess is that calculating the Hilbert indices is still a big portion, as that's what it was for me.
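For reading the profile back, the stdlib pstats module alone is enough; for instance (with a stand-in workload here so the example is self-contained, in place of the pf.h.print_stats() call above):

```python
# Inspect a .cprof file with the stdlib pstats module: sort by cumulative
# time and show the ten most expensive calls.
import cProfile
import pstats

# Stand-in workload so this sketch runs on its own; in practice the
# .cprof file would come from profiling pf.h.print_stats() as above.
cProfile.run("sum(i * i for i in range(100000))", "output.cprof")

stats = pstats.Stats("output.cprof")
stats.sort_stats("cumulative").print_stats(10)
```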
Actually, if there's any design documentation not in the code paper or not obviously on the website floating around then it might be useful to have a look at this. Also, another vaguely related query - how much can Python talk to compiled code (C++, etc)? To what extent can you pass Python objects to C++ routines, assuming you know what the public methods are?
Right now unfortunately the only design documentation is either in the paper, or ad hoc in the docs. I'm happy to work through some of that here, though, and it might be a good idea to consolidate this information on the wiki, too.

As for C++, it's actually very straightforward; we are using Cython (not Boost::Python, which I am told is quite nice but unfortunately carries with it the Boost overhead) and you can see how this is done primarily in yt/frontends/ramses/_ramses_reader.pyx, which uses headers found in yt/frontends/ramses/_ramses_headers, courtesy of Oliver Hahn.

The more I think about it, the more viable I think it is to move all the grid selection routines into the geometry class (although I am not committing myself to moving fundamental routines *outside* of the objects they relate to, just yet) and then using them as the fluid blob selection routines. I have to think about how to do this without relying on bounding box arguments, but it should be viable. I would like to work on this together -- perhaps after I am back from vacation we can try to start refactoring the necessary classes? The testing infrastructure is coming into place rapidly, so it would be feasible to try to do this.

Thanks for your thoughts on this.

-Matt
So I guess a pattern for stepping through the iterators could go:

- Step through the iterators in the normal Python way.
- The next function checks to see whether we require extra resolution (somehow...)
- If we do, it jumps to its children; if not, it jumps to its neighbours.
- Whether or not it needs to open a new file/grid is determined under-the-hood.

If it's too slow to do this on a per-iterator basis, the overarching container class (whether that's the snapshot or the AMR grid or whatever) could jump through files/grids, which then give iterators to their own grid cells which follow the above pattern. I guess if this is slow then it could always be folded into C++, but I'd need to look more into Cython to see how easy it is to do, since the analysis algorithms will probably need to talk to the iterators.

Like you say, numpy is quite fast at doing array operations, and I don't know if the iterator pattern fits into that, since iterators tend to operate individually. Maybe the C++ I/O could do this iterator pattern and then offer up arrays in memory to be operated on. Alternatively, the analysis algorithms could be chunked up into C++ routines, but I guess this makes it harder to write analysis routines. Then again, there's no reason yt can't have both, with analysis routines being refactored into C++ too. In general, iterators have been pretty successful as a way of coupling diverse container classes to diverse algorithms in C++, although I don't know how their performance scales in Python.

I'm happy to work in tandem on this. My timetable is that I'm moving into a new job around the end of July, so I might be a bit up-in-the-air during August, but can probably make time to look into this around then. I'm happy to do some C++ work, but my guess is that it's better to start with everything in Python and then figure out where the bottlenecks are with the profiling tools.

Sam

On 22/06/2011 15:44, Matthew Turk wrote:
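The jump-to-children-or-neighbours step could be sketched as a recursive generator over a toy refinement tree -- the class and the refinement criterion here are invented purely for illustration:

```python
# Sketch of the iterator pattern above: yield a cell unless extra
# resolution is required, in which case descend into its children.
# Class and criterion are invented for illustration.

class Cell:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def iter_cells(cell, needs_refinement):
    if cell.children and needs_refinement(cell):
        for child in cell.children:      # jump to children
            yield from iter_cells(child, needs_refinement)
    else:
        yield cell                       # coarse enough (or a leaf): yield it

# Refine wherever the stored value is "steep":
root = Cell(10.0, [Cell(1.0), Cell(9.0, [Cell(4.0), Cell(5.0)]), Cell(2.0)])
values = [c.value for c in iter_cells(root, lambda c: c.value > 8.0)]
print(values)  # [1.0, 4.0, 5.0, 2.0]
```

The file/grid-opening bookkeeping would hide behind the same generator, which is what keeps it "under-the-hood" from the analysis code's point of view.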
Hi Sam,
On Wed, Jun 22, 2011 at 6:11 AM, Sam Geen
wrote: OK, cool. So is there anything that I can do to help with the Ramses instantiation? I could just play about with implemeting Pymses with the current I/O interface, but if you're already doing this then I'll wait to see what you come up with. Also, if the I/O interface is changing soon then I might hold off. There's definitely stuff you can do to help out; if you want to give a shot at profiling we can figure out some of the places it might still be struggling. For instance, if you have a file that just includes:
import cProfile pf = load("output_00007/info_00007.txt") cProfile.run("pf.h.print_stats()", "output.cprof")
it should create an output.cprof that you can visualize either using the pstats module or the pyprof2html ("pip install pyprof2html", although you may have to also pip install jinja) to see where the bottlenecks are. My guess is that calculating the hilbert indices is still a big portion, as that's what it was for me.
Actually, if there's any design documentation not in the code paper or not obviously on the website floating around then it might be useful to have a look at this. Also, another vaguely related query - how much can Python talk to compiled code (C++, etc)? To what extent can you pass Python objects to C++ routines, assuming you know what the public methods are? Right now unfortunately the only design documentation is either in the paper, or ad hoc in the docs. I'm happy to work through some of that here, though, and it might be a good idea to consolidate this information on the wiki, too.
As for C++, it's actually very straightforward; we are using Cython (not Boost::Python, which I am told is quite nice but unfortunately carries with it the Boost overhead) and you can see how this is done primarily in yt/frontends/ramses/_ramses_reader.pyx, which uses headers found in yt/frontends/ramses/_ramses_headers, courtesy of Oliver Hahn.
The more I think about it, the more viable I think it is to move all the grid selection routines into the geometry class (although I am not committing myself to moving fundamental routines *outside* of the objects they relate to, just yet) and then using them as the fluid blob selection routines. I have to think about how to do this without relying on bounding box arguments, but it should be viable. I would like to work on this together -- perhaps after I am back from vacation we can try to start refactoring the necessary classes? The testing infrastructure is coming into place rapidly, so it would be feasible to try to do this.
Thanks for your thoughts on this.
-Matt
Sam
Matthew Turk wrote:
Hi Sam,
On Tue, Jun 21, 2011 at 8:07 AM, Sam Geen
wrote: Hi,
This is in reply to Matt's e-mail from 3 weeks ago (I only just realised I forgot to hit "confirm" on the yt-dev mailing list signup).
No worries! :)
I guess one solution to the problem would be to abstract what a "grid" is (I'm guessing a grid is a container for a geometrically consistent chunk of the entire simulation volume?) Then allow it to answer queries about its geometric properties itself. So for example, ask it "myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the trick is to figure out a flexible but simple interface for this, depending on how well you know the requirements for what the grid should be able to do. In general, I think this is the ideal situation, because as Matt says hammering every code into the same structure in memory creates slowdowns. One possibility is to create a few template memory structures, etc, to allow people to bolt together new implementations for each code.
A grid is indeed a container for a chunk of the simulation -- typically in patch-based AMR codes, these will be some (hopefully large but not too large) contiguous region. This enables numpy to take over, as it helps batch mathematical operations -- for instance, for an operation like:
field1*field2
the startup cost (parsing, identifying the object as a buffer of contiguous data, identifying the types, dispatching the correct function, and then allocating and returning a new buffer) is weighed against the actual cost of the multiplication itself. Batching operations over whole grids nicely reduces the ratio of startup cost to operation cost.
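That ratio is easy to see with a toy timing comparison (the array size and split count are arbitrary): one batched numpy multiply versus the same work split across many small arrays, which pays the per-operation startup cost thousands of times:

```python
import time
import numpy as np

n = 1_000_000
rng = np.random.default_rng(0)
field1 = rng.random(n)
field2 = rng.random(n)

# One parse/dispatch/allocate cycle for the whole contiguous buffer.
t0 = time.perf_counter()
batched = field1 * field2
t_batched = time.perf_counter() - t0

# The same multiplication split into 10,000 tiny pieces pays the
# startup cost 10,000 times.
chunks1 = np.array_split(field1, 10_000)
chunks2 = np.array_split(field2, 10_000)
t0 = time.perf_counter()
pieces = [a * b for a, b in zip(chunks1, chunks2)]
t_split = time.perf_counter() - t0

print(f"batched: {t_batched:.5f}s  split: {t_split:.5f}s")
```

The results are numerically identical; only the overhead differs.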
Right now, the mechanism for geometric constructs is inverted from how you describe -- when describing a sphere, for instance, the operation is:
* Query the Hierarchy object (which I would support renaming to 'mesh' or 'geometry' in a future iteration of the code, likely 3.0) to identify grids that intersect the geometric region. This is accomplished through a "geometry mixin" that supplies various routines to do this.
* Query each intersecting grid's x, y, z values (for each cell) to identify which cells intersect the region.
* Return these values to the routine/user that requested them.
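The selection steps above can be sketched with plain numpy; the grid edges below are made up for illustration (yt's grid objects carry LeftEdge/RightEdge attributes that play this role):

```python
import numpy as np

center = np.array([0.5, 0.5, 0.5])
radius = 0.15

# Hypothetical grid patch bounding boxes.
left = np.array([[0.0, 0.0, 0.0],
                 [0.5, 0.4, 0.4],
                 [0.9, 0.9, 0.9]])
right = np.array([[0.2, 0.2, 0.2],
                  [0.7, 0.6, 0.6],
                  [1.0, 1.0, 1.0]])

# Stage 1: a box intersects the sphere iff the point of the box nearest
# to the sphere's center lies within the radius.
nearest = np.clip(center, left, right)
box_hits = np.linalg.norm(nearest - center, axis=1) <= radius
print("grids intersecting sphere:", np.where(box_hits)[0])   # [1]

# Stage 2: inside a selected grid, mask individual cell centers.
nx = 8
g_left, g_right = left[1], right[1]
dx = (g_right - g_left) / nx
ijk = np.stack(np.mgrid[0:nx, 0:nx, 0:nx], axis=-1)
centers = g_left + (ijk + 0.5) * dx
cell_mask = np.linalg.norm(centers - center, axis=-1) <= radius
print("cells selected:", int(cell_mask.sum()), "of", nx**3)
```

The coarse box test discards most grids cheaply; only the survivors pay for the per-cell mask.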
I think that this is compatible with what you have outlined, in general. What I had hoped to do was reduce the interaction between IO and geometry as much as possible, simply because IO routines are usually compiled code, whereas ideally I would like the geometry to be performed in Python. (As it stands, it's usually done with operations on bounding values of grids to find intersections.)
In terms of choosing algorithms for different types of fluid blob (e.g. one for particles, one for grids), this can be done using functionoids for the algorithms (or at least functionoid wrappers) and then a functionoid factory for spawning the correct functionoid to use with the container. You'd have to wrap all this up in a simple interface again, otherwise it'd be impossible to use.
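The functionoid-plus-factory idea could be sketched like this; every name and both toy algorithms are invented purely for illustration:

```python
# Toy "functionoids": callables wrapping an algorithm, plus a factory
# that spawns the right one for a given container type.

class ParticleDensity:
    def __call__(self, masses):
        # Placeholder particle-based estimate: mean particle mass.
        return sum(masses) / len(masses)

class GridDensity:
    def __call__(self, cells):
        # Placeholder grid-based estimate: volume-weighted mean of
        # (value, volume) pairs.
        total_volume = sum(vol for _, vol in cells)
        return sum(val * vol for val, vol in cells) / total_volume

_REGISTRY = {"particles": ParticleDensity, "grid": GridDensity}

def density_functionoid(kind):
    """Factory: return the functionoid matching the container kind."""
    return _REGISTRY[kind]()

print(density_functionoid("particles")([1.0, 2.0, 3.0]))      # 2.0
print(density_functionoid("grid")([(1.0, 2.0), (4.0, 2.0)]))  # 2.5
```

Callers only ever see the factory and the shared call interface, which is the "simple wrapper" the paragraph above asks for.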
I also suggested to Matt to create a "fluid blob" iterator that works for all types of fluid blob (SPH particle, octree grid cell, voronoi tessellation cell) but this might be very slow in Python. That said, iterating over "grid"s as chunks of the amr grid instead is a possibility. Having some kind of iterator option might be good, though, as doing things like tracking particles through different snapshots is something I've been doing extensively in my (pre-YT) work.
A generalized fluid blob iterator would be interesting; I think the grids could be placed into it. By extending the geometry mixin to work with different methods, this could be feasible. I wonder if rethinking the idea of static geometries (determined at instantiation) would help with addressing SPH data. I am inclined to think this would be a way forward. In looking over the code, it's not clear to me that grids are assumed in many places except in the projections and the first pass of data selection.
(Projections as we do them now might port nicely to SPH, but it's not yet clear to me.)
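One way such a generalized blob iterator could look, with octree refinement handled by descending into children; the class names are invented for illustration, not yt API:

```python
from dataclasses import dataclass, field
from typing import Iterator, List

@dataclass
class OctCell:
    value: float
    children: List["OctCell"] = field(default_factory=list)

    def blobs(self) -> Iterator["OctCell"]:
        # Descend into children where the tree is refined, otherwise
        # yield the cell itself -- the "jump to children if we need
        # more resolution" step of the iterator pattern.
        if self.children:
            for child in self.children:
                yield from child.blobs()
        else:
            yield self

@dataclass
class Particle:
    value: float

@dataclass
class ParticleSet:
    particles: List[Particle]

    def blobs(self) -> Iterator[Particle]:
        # Particles are already leaf-level blobs.
        yield from self.particles

def total(source) -> float:
    # An analysis routine that relies only on the shared iteration
    # protocol, not on the concrete container type.
    return sum(b.value for b in source.blobs())

# A root cell refined into 8 children, one of which is refined again.
fine = [OctCell(0.125) for _ in range(8)]
root = OctCell(0.0, [OctCell(1.0) for _ in range(7)] + [OctCell(0.0, fine)])
print(total(root))                                # 7*1.0 + 8*0.125 = 8.0
print(total(ParticleSet([Particle(0.5)] * 4)))    # 2.0
```

The same total() works for a Voronoi-cell source too, as long as it exposes blobs().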
I don't know how much of this is already known; my domain is Ramses, which is still very slow to use with my dataset (although Matthew has been very helpful in working on the Ramses side of things). I thus haven't looked too much at YT yet as it's still prohibitively slow to load my dataset and play with it.
I did manage to squeeze out what for me was an order-of-magnitude improvement in RAMSES data instantiation, but I confess it is still slow, and there are other issues with it. Right now Casey and I are refactoring fields, and I have set up a testing infrastructure, so I am feeling a bit more inclined to try more invasive changes after branching into 2.3 and 3.0 branches sometime later this summer.
Perhaps moving to a generalized geometry, into which the standard patch/block AMR "hierarchy" paradigm would fit, would meet the needs of generalized fluid operations...
-Matt
Cheers,
Sam
On Tue, Jun 7, 2011 at 16:15, Matthew Turk
wrote: Hi all,
This is a portion of a conversation Sam Geen and I had off-list about where to make changes and how to insert abstractions to allow for generalized geometric reading of data; this would be useful for octree codes, particle codes, and non-rectilinear geometry. We decided to "replay" the conversation on the mailing list to allow people to contribute their ideas and thoughts. I spent a bit of time last night looking at the geometry usage in yt.
Right now I see a few places this will need to be fixed:
* Data sources operate on the idea that grids act as a pre-selection for cells. If we get the creation of grids -- without including any cell data inside them -- to be fast enough, this will not necessarily need to be changed (i.e., apply a 'regridding' step of empty grids). However, failing that, this will need to be abstracted into geometric selection. For cylindrical coordinates this will need to be abstracted anyway. The idea is that once you know which grids you want, you read them from disk and then mask out the points that are not necessary.
* The IO is currently set up -- in parallel -- to read in chunks. Usually in parallel patch-based simulations, multiple grid patches are stored in a single file on disk, so these get chunked in IO to avoid too many fopen/seek/fclose operations (and the analogues in hdf5). This will need to be rethought. Obviously there are still some analogues; however, it's not clear how -- without the actual re-gridding operation -- to keep the geometry selection and the IO separate. I would prefer to try to do this as much as possible. I think it's do-able, but I don't yet have a good strategy for it.
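The IO chunking in the second point can be sketched as grouping grid reads by their backing file, so each file is opened once per pass rather than once per grid (grid records and filenames here are hypothetical):

```python
from collections import defaultdict

# Hypothetical hierarchy records: which file stores each grid patch.
grids = [
    {"id": 0, "filename": "data_0000.h5"},
    {"id": 1, "filename": "data_0000.h5"},
    {"id": 2, "filename": "data_0001.h5"},
    {"id": 3, "filename": "data_0000.h5"},
]

# Group the requested grids by file.
chunks = defaultdict(list)
for g in grids:
    chunks[g["filename"]].append(g["id"])

for filename, grid_ids in sorted(chunks.items()):
    # One fopen/seek/fclose cycle (or one h5py File open) per file,
    # reading every requested grid inside it.
    print(filename, "->", grid_ids)
```

The geometric selection decides *which* grids go into the request; the grouping above is purely an IO concern, which is the separation argued for.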
My current feeling is that the re-gridding may be a necessary evil *at the moment*, but only for guiding the point selection. It has now been re-written to be based on Hilbert curve locating, so each grid has a unique index in L-8 or something space.
I believe that geometry and chunking of IO are the only issues at this time. One possibility would actually be to move away from the idea of grids and toward 'hilbert chunks'. These would be the items that would be selected, read from disk, and mapped. This might fit more nicely with the Ramses method.
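As a toy illustration of space-filling-curve chunking, here is a Morton (Z-order) key; it is a simpler relative of the Hilbert index the reader actually uses, but it shows the same idea of giving each region of space a sortable 1D index:

```python
def morton_key(ix: int, iy: int, iz: int, bits: int = 8) -> int:
    # Interleave the bits of the three integer coordinates. A Hilbert
    # key has better spatial locality, but the principle -- one
    # sortable 1D index per region -- is identical.
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key

# Sorting grid coordinates by their key groups spatial neighbours into
# contiguous runs, i.e. candidate chunks to select and read together.
coords = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1), (2, 0, 0)]
for c in sorted(coords, key=lambda c: morton_key(*c)):
    print(c, morton_key(*c))
```

Contiguous key ranges can then serve as the selectable, readable, mappable units described above.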
What do you think?
Best,
Matt
_______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
Hi Sam,
I'm glad you're interested in helping to develop yt's Ramses
facilities. yt is a community code, and we're always excited to
embiggen that community. However, I think you might want to take a
step back and look at what yt already is and does.
yt is already highly optimized for patch-based AMR. We have thought
through these concepts in significant detail, and over the past nearly
5 years we have refactored and optimized the loading of data many
times. I don't know if this is your intention, but you seem to be
suggesting that we have not thought about iterators and where the
bottlenecks lie in our data-to-analysis pipeline. We did not find it
necessary to jump to C often (we do not use C++ because of portability
issues on some of the supercomputing centers we deploy on). Your
emails seem to suggest you believe a rewrite of many core routines in
C/C++ would be necessary. I think that the Ramses reader could be
brought to speeds comparable to the enzo/orion readers by optimizing
the approach already pursued by those yt frontends. Python has
extremely powerful object orientation; the need for C++ templates is
obviated by those features, and we have already demonstrated that a
small amount of C can go a very long way for speed.
Again, I want to reiterate that we are excited that you are interested
in contributing; I myself am working with a group using Ramses, and it
would be great to have an optimized frontend. I just want to impress
upon you that there are a lot of people in the yt community, and we
have indeed thought many of these issues through. It might be helpful
to take a look at how the code works (there is a sample enzo dataset
in the distribution), and feel free to ask questions on the dev list,
on the user list, and on #yt, if you're into the whole IRC thing.
sincerely,
Jeff Oishi
Hi Jeff,

Thanks for your e-mail. One of the things I've been wanting to do with these e-mails is to determine what I can add to the project, rather than jumping in and adding something that either doesn't fit in with the structure of the rest of the code or that replicates what someone else has already tried or implemented. Part of what I've said arose from discussions with Matthew about how best to optimise the Ramses frontend to remove a lot of the existing overheads, and a lot of what I've suggested is simply my way of figuring out my way around the problem.

I definitely agree that the best approach is to use the existing interface that yt has with its I/O frontends. The reason I was holding off from doing this is that Matthew suggested he might be changing the way yt interfaces with its I/O routines, but I don't know how far away this would be.

I am conscious that you guys have far more experience with these sorts of issues than I do, which is why I'm airing them here rather than trying them and running into problems that someone else has already discovered. The main issue I'm facing is how to deal with an octree-based code (Ramses) rather than a patch-based code (Enzo, at least as I understand it), and I don't know to what extent these issues have been discussed or which ideas have been found to work or not work. If you have any suggestions for threads to follow from the yt-dev archive, or similar design documents, that would be very much appreciated. Something like Flash is also octree-based, I guess, although it does have patches on some level (an analogy might be that Ramses has 2x2x2 patches rather than NxNxN).

Your suggestion that I play with the sample Enzo dataset is a good one, and sounds like a good way of becoming familiar with how yt operates. I'll try that before I jump in and start fiddling with the Ramses implementation. Like I said, I'm new to yt, so any and all suggestions are welcome.

Equally, I'm keen to know what work already exists on the Ramses frontend, so that I don't tread on any toes or reinvent too many wheels. My main goal is to be able to analyse very large Ramses datasets efficiently, so I'm keen to work with the yt community to achieve this.

Cheers,

Sam

On 22/06/2011 18:06, j s oishi wrote:
Hi Sam,
I'm glad you're interested in helping to develop yt's Ramses facilities. yt is a community code, and we're always excited to embiggen that community. However, I think you might want to take a step back and look at what yt already is and does.
yt is highly optimized already for patch based AMR. We have thought through these concepts in significant detail, and over the past nearly 5 years we have refactored and optimized the loading of data many times. I don't know if this is your intention, but you seem to be suggesting that we have not thought about iterators and where the bottlenecks lie in our analysis to data pipeline. We did not find it necessary to jump to C often (we do not use C++ because of portability issues on some of the supercomputing centers we deploy on). Your emails seem to suggest you believe a rewrite of many core routines in C/C++ would be necessary. I think that the Ramses reader could be brought to speeds comparable to the enzo/orion readers by optimizing the approach already pursued by those yt frontends. Python has extremely powerful object orientation; the need for C++ templates is obviated by those features, and we have already demonstrated that a small amount of C can go a very long way for speed.
Again, I want to reiterate that we are excited that you are interested in contributing; I myself am working with a group using Ramses, and it would be great to have an optimized frontend. I just want to impress upon you that there is a lot of people in the yt community, and we have indeed thought many of these issues through. It might be helpful to take a look at how the code works (there is a sample enzo dataset in the distribution), and feel free to ask questions on the dev list, on the user list, and on #yt, if you're into the whole IRC thing.
sincerely,
Jeff Oishi
On Wed, Jun 22, 2011 at 8:02 AM, Sam Geen
wrote: So I guess a pattern for stepping through the iterators could go: - Step through the iterators in the normal Python way. - The next function checks to see whether we require extra resolution (somehow...) - If we do, it jumps to its children, if not it jumps to its neighbours. - Whether or not it needs to open a new file/grid is determined under-the-hood. If it's too slow to do this on a per-iterator basis, the overarching container class (whether that's the snapshot or the AMR grid or whatever) could jump through files/grids, which then give iterators to their own grid cells which follow the above pattern. I guess if this is slow then it could always be folded into C++, but I'd need to look more into Cython to see how easy it is to do, since the analysis algorithms will probably need to talk to the iterators. Like you say, numpy is quite fast at doing array operations, and I don't know if the iterator pattern fits into that, since iterators tend to operate individually. Maybe the C++ I/O could do this iterator pattern and then offer up arrays in memory to be operated on. Alternatively, the analysis algorithms could be chunked up into C++ routines, but I guess this makes it harder to write analysis routines. Then again, no reason YT can't have both, with analysis routines being refactored into C++ too. In general, iterators have been pretty successful as a way of coupling diverse container classes to diverse algorithms in C++, although I don't know how their performance scales in Python.
I'm happy to work in tandem on this. My timetable is that I'm moving into a new job around the end of July, so I might be a bit up-in-the-air during August, but can probably make time to look into this around then. I'm happy to do some C++ work, but my guess is that it's better to start with everything in Python and then figure out where the bottlenecks are with the profiling tools.
Sam
On 22/06/2011 15:44, Matthew Turk wrote:
Hi Sam,
On Wed, Jun 22, 2011 at 6:11 AM, Sam Geen
wrote: OK, cool. So is there anything that I can do to help with the Ramses instantiation? I could just play about with implemeting Pymses with the current I/O interface, but if you're already doing this then I'll wait to see what you come up with. Also, if the I/O interface is changing soon then I might hold off. There's definitely stuff you can do to help out; if you want to give a shot at profiling we can figure out some of the places it might still be struggling. For instance, if you have a file that just includes:
import cProfile pf = load("output_00007/info_00007.txt") cProfile.run("pf.h.print_stats()", "output.cprof")
it should create an output.cprof that you can visualize either using the pstats module or the pyprof2html ("pip install pyprof2html", although you may have to also pip install jinja) to see where the bottlenecks are. My guess is that calculating the hilbert indices is still a big portion, as that's what it was for me.
Actually, if there's any design documentation not in the code paper or not obviously on the website floating around then it might be useful to have a look at this. Also, another vaguely related query - how much can Python talk to compiled code (C++, etc)? To what extent can you pass Python objects to C++ routines, assuming you know what the public methods are? Right now unfortunately the only design documentation is either in the paper, or ad hoc in the docs. I'm happy to work through some of that here, though, and it might be a good idea to consolidate this information on the wiki, too.
As for C++, it's actually very straightforward; we are using Cython (not Boost::Python, which I am told is quite nice but unfortunately carries with it the Boost overhead) and you can see how this is done primarily in yt/frontends/ramses/_ramses_reader.pyx, which uses headers found in yt/frontends/ramses/_ramses_headers , courtesy of Oliver Hahn.
The more I think about it, the more viable I think it is to move all the grid selection routines into the geometry class (although I am not committing myself to moving fundamental routines *outside* of the objects they relate to, just yet) and then using them as the fluid blob selection routines. I have to think about how to do this without relying on bounding box arguments, but it should be viable. I would like to work on this together -- perhaps after I am back from vacation we can try to start refactoring the necessary classes? The testing infrastructure is coming into place rapidly, so it would be feasible to try to do this.
Thanks for your thoughts on this.
-Matt
Sam
Matthew Turk wrote:
Hi Sam,
On Tue, Jun 21, 2011 at 8:07 AM, Sam Geen
wrote:

Hi,
This is in reply to Matt's e-mail from 3 weeks ago (I only just realised I forgot to hit "confirm" on the yt-dev mailing list signup).
No worries! :)
I guess one solution to the problem would be to abstract what a "grid" is (I'm guessing a grid is a container for a geometrically consistent chunk of the entire simulation volume?) Then allow it to answer queries about its geometric properties itself. So for example, ask it "myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the trick is to figure out a flexible but simple interface for this, depending on how well you know the requirements for what the grid should be able to do. In general, I think this is the ideal situation, because as Matt says hammering every code into the same structure in memory creates slowdowns. One possibility is to create a few template memory structures, etc, to allow people to bolt together new implementations for each code.
A grid is indeed a container for a chunk of the simulation -- in patch-based AMR codes these are typically contiguous regions (hopefully large, but not too large). This lets numpy take over, since it batches mathematical operations -- for instance, for an operation like:
field1*field2
the startup cost -- parsing, identifying the object as a buffer of contiguous data, identifying the types, dispatching the correct function, and then allocating and returning a new buffer -- is weighed against the actual multiplication. Batching operations over grids reduces the ratio of startup cost to operation cost.
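To make that concrete, here's a small numpy illustration of why batching matters (the array sizes are arbitrary):

```python
import numpy as np

# Two fields stored as contiguous buffers, as a patch grid would hold them.
field1 = np.random.random(512 * 512)
field2 = np.random.random(512 * 512)

# One batched multiply: the startup cost (type checks, dispatch,
# allocating the result buffer) is paid once, and the loop itself
# runs in compiled code.
product = field1 * field2

# The per-element alternative pays comparable per-operation overhead
# on every cell -- this is why many tiny grids are slower to process
# than a few large ones.
product_slow = np.array([a * b for a, b in zip(field1, field2)])

assert np.allclose(product, product_slow)
```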
Right now, the mechanism for geometric constructs is inverted from how you describe -- when describing a sphere, for instance, the operation is:
* Query the Hierarchy object (which I would support renaming to 'mesh' or 'geometry' in a future iteration of the code, likely 3.0) to identify grids that intersect the geometric region. This is accomplished through a "geometry mixin" that supplies various routines to do this.
* Query each intersecting grid's x,y,z values (for each cell) to identify which cells intersect the region.
* Return these values to the routine/user that requested them.
I think that this is compatible with what you have outlined, in general. The issue I had hoped to avoid was to reduce the interaction between IO and geometry as much as possible, simply because IO routines are usually compiled code, whereas ideally I would like the geometry to be performed in Python. (As it stands it's usually done with operations on bounding values of grids to find intersections.)
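As an illustration of that two-stage selection -- a coarse cull on grid bounding values, then a per-cell mask -- here is a sketch; all names are hypothetical, not yt's actual interface:

```python
import numpy as np

def sphere_intersects_box(center, radius, left, right):
    # Distance from the sphere's center to the nearest point of the box.
    nearest = np.clip(center, left, right)
    return np.sum((nearest - center) ** 2) <= radius ** 2

def select_cells(grids, center, radius):
    """Two-stage selection: coarse grid cull, then per-cell mask."""
    center = np.asarray(center, dtype=float)
    for g in grids:
        if not sphere_intersects_box(center, radius, g["left"], g["right"]):
            continue  # coarse pre-selection: skip non-intersecting grids
        # Fine stage: mask individual cell centers inside the sphere.
        r2 = np.sum((g["cell_centers"] - center) ** 2, axis=1)
        yield g, r2 <= radius ** 2

# Hypothetical grid: a unit cube holding two cell centers.
grid = {"left": np.zeros(3), "right": np.ones(3),
        "cell_centers": np.array([[0.1, 0.1, 0.1], [0.9, 0.9, 0.9]])}
for g, mask in select_cells([grid], center=(0.0, 0.0, 0.0), radius=0.3):
    print(mask)  # only the cell near the origin is selected
```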
In terms of choosing algorithms for different types of fluid blob (e.g. one for particles, one for grids), this can be done using functionoids for the algorithms (or at least functionoid wrappers) and then a functionoid factory for spawning the correct functionoid to use with the container. You'd have to wrap all this up in a simple interface again, otherwise it'd be impossible to use.
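For what it's worth, the functionoid-factory idea maps naturally onto Python callables; a toy sketch (all names are made up, nothing here is yt API):

```python
# Algorithms are plain callables registered per fluid-blob type; the
# factory hands back the right one for a given container.
_selectors = {}

def register_selector(blob_type):
    def wrap(func):
        _selectors[blob_type] = func
        return func
    return wrap

def make_selector(container):
    # Spawn the functionoid matching the container's type.
    return _selectors[type(container).__name__]

class PatchGrid: pass
class ParticleSet: pass

@register_selector("PatchGrid")
def select_grid_cells(container, region):
    return "grid-cell selection for %r" % region

@register_selector("ParticleSet")
def select_particles(container, region):
    return "particle selection for %r" % region

selector = make_selector(PatchGrid())
print(selector(PatchGrid(), "sphere"))  # dispatches the grid algorithm
```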
I also suggested to Matt to create a "fluid blob" iterator that works for all types of fluid blob (SPH particle, octree grid cell, voronoi tessellation cell) but this might be very slow in Python. That said, iterating over "grid"s as chunks of the amr grid instead is a possibility. Having some kind of iterator option might be good, though, as doing things like tracking particles through different snapshots is something I've been doing extensively in my (pre-YT) work.
A generalized fluid blob iterator would be interesting; I think grids could be placed into it. By extending the geometry mixin to work with different methods, this could be feasible. I wonder if rethinking the idea of static geometries (determined at instantiation) would help with addressing SPH data. I am inclined to think this would be a way forward. Looking over the code, it's not clear to me that grids are assumed in many places except in the projections and the first pass of data selection.
(Projections as we do them now might port nicely to SPH, but it's not yet clear to me.)
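A chunked version of the fluid-blob iterator might look something like this -- yielding one contiguous array per chunk rather than one cell at a time, so numpy still gets batched work. The structure is hypothetical:

```python
import numpy as np

# Each backend yields chunks (a grid's worth of cells, a block of
# particles) rather than individual elements.
def iter_grid_chunks(grids):
    for g in grids:
        yield g["density"]          # one contiguous array per patch

def iter_particle_chunks(masses, chunk_size=2):
    for i in range(0, len(masses), chunk_size):
        yield masses[i:i + chunk_size]

def total_mass(chunks, cell_volume=1.0):
    # The same reduction works for any chunk source.
    return sum(float(np.sum(c)) * cell_volume for c in chunks)

grids = [{"density": np.ones(4)}, {"density": 2 * np.ones(4)}]
print(total_mass(iter_grid_chunks(grids)))       # 12.0
print(total_mass(iter_particle_chunks(np.ones(5))))  # 5.0
```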
I don't know how much of this is already known; my domain is Ramses, which is still very slow to use with my dataset (although Matthew has been very helpful in working on the Ramses side of things). I thus haven't looked too much at YT yet as it's still prohibitively slow to load my dataset and play with it.
I did manage to squeeze out what for me was an order-of-magnitude improvement in RAMSES data instantiation, but I confess it is still slow. And there are other issues with it. Right now Casey and I are refactoring fields, and I have set up a testing infrastructure, so I am feeling a bit more inclined to try more invasive changes after branching into 2.3 and 3.0 branches sometime later this summer.
Perhaps moving to a generalized geometry, into which the standard patch/block AMR "hierarchy" paradigm would fit, would meet the necessary needs to do generalized fluid operations...
-Matt
Cheers,
Sam
On Tue, Jun 7, 2011 at 16:15, Matthew Turk
wrote:

Hi all,
This is a portion of a conversation Sam Geen and I had off-list about where to make changes and how to insert abstractions to allow for generalized geometric reading of data; this would be useful for octree codes, particles codes, and non-rectilinear geometry. We decided to "replay" the conversation on the mailing list to allow people to contribute their ideas and thoughts. I spent a bit of time last night looking at the geometry usage in yt.
Right now I see a few places this will need to be fixed:
* Data sources operate on the idea that grids act as a pre-selection for cells. If we get the creation of grids -- without including any cell data inside them -- to be fast enough, this will not necessarily need to be changed. (i.e., apply a 'regridding' step of empty grids.) However, failing that, this will need to be abstracted into geometric selection. For cylindrical coordinates this will need to be abstracted anyway. The idea is that once you know which grids you want, you read them from disk, and then mask out the points that are not necessary.
* The IO is currently set up -- in parallel -- to read in chunks. Usually in parallel patch-based simulations, multiple grid patches are stored in a single file on disk. So, these get chunked in IO to avoid too many fopen/seek/fclose operations (and the analogues in hdf5.) This will need to be rethought. Obviously, there are still some analogues; however, it's not clear how -- without the actual re-gridding operation -- to keep the geometry selection and the IO separate. I would prefer to try to do this as much as possible. I think it's do-able, but I don't yet have a good strategy for it.
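The IO-chunking point above can be sketched as grouping the selected grids by their on-disk file, so that each file is opened once rather than once per grid. The filenames, offsets, and read function here are made up for illustration:

```python
from collections import defaultdict

def group_by_file(grids):
    # Many grid patches live in one file; bucket them by filename.
    by_file = defaultdict(list)
    for g in grids:
        by_file[g["filename"]].append(g)
    return by_file

def read_selected(grids, read_file):
    data = {}
    for fn, gs in group_by_file(grids).items():
        # One open/seek pass per file instead of one per grid.
        contents = read_file(fn)
        for g in gs:
            data[g["id"]] = contents[g["offset"]]
    return data

# Fake on-disk layout for illustration.
files = {"data_0001": {0: "g0-data", 64: "g1-data"}}
grids = [{"id": 0, "filename": "data_0001", "offset": 0},
         {"id": 1, "filename": "data_0001", "offset": 64}]
print(read_selected(grids, files.__getitem__))
```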
My current feeling is that the re-gridding may be a somewhat necessary evil *at the moment*, but only for guiding the point selection. It has been re-written to be based on hilbert curve locating, so each grid has a unique index in L-8 or something space.
I believe that geometry and chunking of IO are the only issues at this time. One possibility would actually be to move away from the idea of grids and instead use 'hilbert chunks'. These would be the items that would be selected, read from disk, and mapped. This might fit more nicely with the Ramses method.
What do you think?
Best,
Matt
_______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
Also, if anyone is curious, these are some profiler test results for one of our Ramses datasets (size 7.7GB), using the stable release YT code as of 6th June. I can send some more info on request (e.g. number of cells at each refinement level, etc.).

Load (i.e. RAMSESStaticOutput): http://www-astro.physics.ox.ac.uk/~GeenS/testload/loadprof/
Density slice: http://www-astro.physics.ox.ac.uk/~GeenS/testload/densslice/

So my main issue is that the regridding (I think) is taking 30mins+, which needs an order of magnitude or two more speed-up before I can start using YT as an analysis framework full-time. If people think that this can be achieved with the current interface and without substantial changes beyond the Ramses frontend, then this would be great.

Sam

On 22/06/11 18:41, Sam Geen wrote:
Hi Jeff,
Thanks for your e-mail. One of the things I've been wanting to do with these e-mails is to determine what I can add to the project, rather than jumping in and adding something that either doesn't fit in with the structure of the rest of the code or that is replicating what someone else has already tried or implemented. Part of what I've said arose from discussions with Matthew about how best to optimise the Ramses frontend to remove a lot of the existing overheads, and a lot of what I've suggested is simply my way of figuring out my way around the problem.
I definitely agree that the best approach is to use the existing interface that YT has with its I/O frontends. The reason I was holding off from doing this is because Matthew suggested that he might be changing the way YT interfaces with its I/O routines, but I don't know how far away this would be. I am conscious that you guys have far more experience with these sorts of issues than I do, which is why I'm airing them here rather than trying them and finding out problems with them that someone else has already discovered.
The main issue I'm facing is how to deal with an octree-based code (Ramses) rather than a patch-based code (Enzo - at least this is my understanding of it), and I don't know to what extent these issues have been discussed or which ideas have been found to work/not work. If you have any suggestions for threads to follow from the yt-dev archive or similar design documents then that would be very much appreciated. Something like Flash is also octree-based, I guess, although it does have patches on some level (I guess an analogy is that Ramses has 2x2x2 patches rather than NxNxN).
Your suggestion that I play with the sample Enzo dataset is a good one, and sounds like a good way of becoming familiar with how YT operates. I'll try that before I jump in and start fiddling with the Ramses implementation. Like I said, I'm new to YT, so any and all suggestions are welcome. Equally, I'm keen to know what work already exists with the Ramses frontend, so that I don't tread on any toes or reinvent too many wheels. My main goal is to be able to analyse very large Ramses datasets efficiently, so I'm keen to work with the YT community to achieve this.
Cheers,
Sam
On 22/06/2011 18:06, j s oishi wrote:
Hi Sam,
I'm glad you're interested in helping to develop yt's Ramses facilities. yt is a community code, and we're always excited to embiggen that community. However, I think you might want to take a step back and look at what yt already is and does.
yt is highly optimized already for patch-based AMR. We have thought through these concepts in significant detail, and over the past nearly 5 years we have refactored and optimized the loading of data many times. I don't know if this is your intention, but you seem to be suggesting that we have not thought about iterators and where the bottlenecks lie in our data-analysis pipeline. We did not find it necessary to jump to C often (we do not use C++ because of portability issues on some of the supercomputing centers we deploy on). Your emails seem to suggest you believe a rewrite of many core routines in C/C++ would be necessary. I think that the Ramses reader could be brought to speeds comparable to the enzo/orion readers by optimizing the approach already pursued by those yt frontends. Python has extremely powerful object orientation; the need for C++ templates is obviated by those features, and we have already demonstrated that a small amount of C can go a very long way for speed.
Again, I want to reiterate that we are excited that you are interested in contributing; I myself am working with a group using Ramses, and it would be great to have an optimized frontend. I just want to impress upon you that there are a lot of people in the yt community, and we have indeed thought many of these issues through. It might be helpful to take a look at how the code works (there is a sample enzo dataset in the distribution), and feel free to ask questions on the dev list, on the user list, and on #yt, if you're into the whole IRC thing.
sincerely,
Jeff Oishi
On Wed, Jun 22, 2011 at 8:02 AM, Sam Geen
wrote:

So I guess a pattern for stepping through the iterators could go:

- Step through the iterators in the normal Python way.
- The next function checks whether we require extra resolution (somehow...).
- If we do, it jumps to its children; if not, it jumps to its neighbours.
- Whether or not it needs to open a new file/grid is determined under the hood.

If it's too slow to do this on a per-iterator basis, the overarching container class (whether that's the snapshot or the AMR grid or whatever) could jump through files/grids, which then give iterators to their own grid cells that follow the above pattern. I guess if this is slow then it could always be folded into C++, but I'd need to look more into Cython to see how easy that is, since the analysis algorithms will probably need to talk to the iterators.

Like you say, numpy is quite fast at doing array operations, and I don't know if the iterator pattern fits into that, since iterators tend to operate individually. Maybe the C++ I/O could do this iterator pattern and then offer up arrays in memory to be operated on. Alternatively, the analysis algorithms could be chunked up into C++ routines, but I guess this makes it harder to write analysis routines. Then again, there's no reason YT can't have both, with analysis routines being refactored into C++ too. In general, iterators have been pretty successful as a way of coupling diverse container classes to diverse algorithms in C++, although I don't know how their performance scales in Python.
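That traversal pattern -- descend into children when more resolution is needed, otherwise carry on at the current level -- can be sketched as a Python generator. The Node class and the refinement test here are made up:

```python
class Node:
    """Hypothetical octree node: a refinement level and child nodes."""
    def __init__(self, level, children=()):
        self.level = level
        self.children = list(children)

def iter_cells(node, needs_refinement):
    if node.children and needs_refinement(node):
        # Jump to children when extra resolution is required.
        for child in node.children:
            yield from iter_cells(child, needs_refinement)
    else:
        # Otherwise yield this cell; the caller's loop over siblings
        # plays the role of "jumping to the neighbours".
        yield node

# A root with one leaf child and one refined child.
root = Node(0, [Node(1), Node(1, [Node(2), Node(2)])])
leaves = list(iter_cells(root, lambda n: n.level < 1))
print([n.level for n in leaves])  # stops at level 1: [1, 1]
```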
I'm happy to work in tandem on this. My timetable is that I'm moving into a new job around the end of July, so I might be a bit up-in-the-air during August, but can probably make time to look into this around then. I'm happy to do some C++ work, but my guess is that it's better to start with everything in Python and then figure out where the bottlenecks are with the profiling tools.
Sam
On 22/06/2011 15:44, Matthew Turk wrote:
Hi Sam,
On Wed, Jun 22, 2011 at 6:11 AM, Sam Geen
wrote:

OK, cool. So is there anything that I can do to help with the Ramses instantiation? I could just play about with implementing Pymses with the current I/O interface, but if you're already doing this then I'll wait to see what you come up with. Also, if the I/O interface is changing soon then I might hold off.

There's definitely stuff you can do to help out; if you want to give profiling a shot, we can figure out some of the places it might still be struggling. For instance, if you have a file that just includes:
import cProfile
pf = load("output_00007/info_00007.txt")
cProfile.run("pf.h.print_stats()", "output.cprof")
it should create an output.cprof file that you can visualize either with the pstats module or with pyprof2html ("pip install pyprof2html", though you may also have to pip install jinja) to see where the bottlenecks are. My guess is that calculating the hilbert indices is still a big portion, as that's what it was for me.
Hi Sam,
We definitely appreciate your enthusiasm and interest. I think that
we have reached the point in growing the community, and the user base,
that it is time to revisit some design decisions. Part of that is the
idea of IO, of geometric selection, and of selecting fluid blobs --
grids, sections of the hilbert curve, particles and voronoi mesh items
-- to operate on. I'm personally very excited about this, even if a
bit hesitant (as I think is natural!) to be too invasive with the
changes. Again, though, this is important, particularly as we grow as
a community. The perspective you bring is definitely important, and I
am interested in making sure that we can find a way to move forward
together.
The results you show are very similar to what I was seeing. I have
just run a test on an output I found, output_0045, the provenance of
which I cannot recall. It's about 9.3 gigs. I ran profiling of dev
versus stable tip:
http://yt.enzotools.org/ramses_prof/
The all-told time for dev was 47 seconds or so to regrid and construct
the final grids in memory; the all-told time for stable was ~286
seconds. This isn't quite an order-of-magnitude improvement, but it is something.
Unfortunately it looks like the dev tip won't be put out as a
"release" for a month and a half at least. That being said, I'm still
not terribly happy with 47 seconds just to parse and regrid a mesh.
Removing the hilbert indices call (which I think we could, by using
what's in the files, but I did not know how to access that
information) and rethinking the way get_array_indices_lists is used
(which stores the IO information for individual cells) could improve
the situation dramatically. For instance, if we were to remove the
storage of cell information within a grid object, and instead
recalculate on the fly individual cells inside the regridded objects,
we could probably speed this up by a factor of two or more.
Anyway, I'm going to try to work on this a bit next week at least
writing up how the code currently operates and adding comments as
necessary. I will ping the list as necessary. But I do very much
appreciate your ideas, and your experience, and I share your concerns
about how it currently operates. Despite the improvements, I think it
has a ways to go. I'll see what I can do about describing where I see
its operation and its shortcomings before I go on vacation.
Best,
Matt
On Wed, Jun 22, 2011 at 3:56 PM, Sam Geen
Also, if anyone is curious, these are some profiler test results for one of our Ramses datasets (size 7.7GB), using the stable release YT code as of 6th June. If anyone is curious I can send you some more info (e.g. number of cells at each refinement level, etc).
Load (i.e. RAMSESStaticOutput): http://www-astro.physics.ox.ac.uk/~GeenS/testload/loadprof/ Density slice: http://www-astro.physics.ox.ac.uk/~GeenS/testload/densslice/
So my main issue is that the regridding (I think) is taking 30mins+, which needs an order of magnitude or two more speed-up before I can start using YT as a analysis framework full-time. If people think that this can be achieved with the current interface and without substantial changes beyond the Ramses frontend, then this would be great.
Sam
On 22/06/11 18:41, Sam Geen wrote:
Hi Jeff,
Thanks for your e-mail. One of the things I've been wanting to do with these e-mails is to determine what I can add to the project, rather than jumping in and adding something that either doesn't fit in with the structure of the rest of the code or that is replicating what someone else has already tried or implemented. Part of what I've said arose from discussions with Matthew about how best to optimise the Ramses frontend to remove a lot of the existing overheads, and a lot of what I've suggested is simply my way of figuring out my way around the problem.
I definitely agree that the best approach is to use the existing interface that YT has with its I/O frontends. The reason I was holding off from doing this is because Matthew suggested that he might be changing the way YT interfaces with its I/O routines, but I don't know how far away this would be. I am conscious that you guys have far more experience with these sorts of issues than I do, which is why I'm airing them here rather than trying them and finding out problems with them that someone else has already discovered.
The main issue I'm facing is how to deal with an octree-based code (Ramses) rather than a patch-based code (Enzo - at least this is my understanding of it), and I don't know to what extent these issues have been discussed or which ideas have been found to work/not work. If you have any suggestions for threads to follow from the yt-dev archive or similar design documents then that would be very much appreciated. Something like Flash is also octree-based, I guess, although it does have patches on some level (I guess an analogy is that Ramses has 2x2x2 patches rather than NxNxN).
Your suggestion that I play with the sample Enzo dataset is a good one, and sounds like a good way of becoming familiar with how YT operates. I'll try that before I jump in and start fiddling with the Ramses implementation. Like I said, I'm new to YT, so any and all suggestions are welcome. Equally, I'm keen to know what work already exists with the Ramses frontend, so that I don't tread on any toes or reinvent too many wheels. My main goal is to be able to analyse very large Ramses datasets efficiently, so I'm keen to work with the YT community to achieve this.
Cheers,
Sam
On 22/06/2011 18:06, j s oishi wrote:
Hi Sam,
I'm glad you're interested in helping to develop yt's Ramses facilities. yt is a community code, and we're always excited to embiggen that community. However, I think you might want to take a step back and look at what yt already is and does.
yt is highly optimized already for patch-based AMR. We have thought through these concepts in significant detail, and over the past nearly 5 years we have refactored and optimized the loading of data many times. I don't know if this is your intention, but you seem to be suggesting that we have not thought about iterators and where the bottlenecks lie in our data-to-analysis pipeline. We did not find it necessary to jump to C often (we do not use C++ because of portability issues on some of the supercomputing centers we deploy on). Your emails seem to suggest you believe a rewrite of many core routines in C/C++ would be necessary. I think that the Ramses reader could be brought to speeds comparable to the enzo/orion readers by optimizing the approach already pursued by those yt frontends. Python has extremely powerful object orientation; the need for C++ templates is obviated by those features, and we have already demonstrated that a small amount of C can go a very long way for speed.
Again, I want to reiterate that we are excited that you are interested in contributing; I myself am working with a group using Ramses, and it would be great to have an optimized frontend. I just want to impress upon you that there are a lot of people in the yt community, and we have indeed thought many of these issues through. It might be helpful to take a look at how the code works (there is a sample enzo dataset in the distribution), and feel free to ask questions on the dev list, on the user list, and on #yt, if you're into the whole IRC thing.
sincerely,
Jeff Oishi
On Wed, Jun 22, 2011 at 8:02 AM, Sam Geen
wrote:

So I guess a pattern for stepping through the iterators could go:

- Step through the iterators in the normal Python way.
- The next function checks to see whether we require extra resolution (somehow...)
- If we do, it jumps to its children; if not, it jumps to its neighbours.
- Whether or not it needs to open a new file/grid is determined under-the-hood.

If it's too slow to do this on a per-iterator basis, the overarching container class (whether that's the snapshot or the AMR grid or whatever) could jump through files/grids, which then give iterators to their own grid cells which follow the above pattern. I guess if this is slow then it could always be folded into C++, but I'd need to look more into Cython to see how easy it is to do, since the analysis algorithms will probably need to talk to the iterators. Like you say, numpy is quite fast at doing array operations, and I don't know if the iterator pattern fits into that, since iterators tend to operate individually. Maybe the C++ I/O could do this iterator pattern and then offer up arrays in memory to be operated on. Alternatively, the analysis algorithms could be chunked up into C++ routines, but I guess this makes it harder to write analysis routines. Then again, no reason YT can't have both, with analysis routines being refactored into C++ too. In general, iterators have been pretty successful as a way of coupling diverse container classes to diverse algorithms in C++, although I don't know how their performance scales in Python.
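A toy version of that stepping pattern (purely illustrative, not yt's API) might look like this: walk sibling cells in order, and descend into children only when a predicate asks for extra resolution.

```python
class Cell:
    """Hypothetical AMR cell: a level and optional 2x2x2 children."""
    def __init__(self, level, children=None):
        self.level = level
        self.children = children or []

def blob_iter(cells, needs_refinement):
    """Yield leaf cells in traversal order; jump to children when the
    predicate requires more resolution, otherwise move to the neighbour."""
    for cell in cells:
        if cell.children and needs_refinement(cell):
            # descend: yield the refined cells instead of the coarse one
            yield from blob_iter(cell.children, needs_refinement)
        else:
            yield cell
```

The file/grid opening would hide behind the same interface in a real implementation; the per-cell Python-level dispatch is exactly where the speed worry discussed above comes in.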
I'm happy to work in tandem on this. My timetable is that I'm moving into a new job around the end of July, so I might be a bit up-in-the-air during August, but can probably make time to look into this around then. I'm happy to do some C++ work, but my guess is that it's better to start with everything in Python and then figure out where the bottlenecks are with the profiling tools.
Sam
On 22/06/2011 15:44, Matthew Turk wrote:
Hi Sam,
On Wed, Jun 22, 2011 at 6:11 AM, Sam Geen
wrote: OK, cool. So is there anything that I can do to help with the Ramses instantiation? I could just play about with implementing Pymses with the current I/O interface, but if you're already doing this then I'll wait to see what you come up with. Also, if the I/O interface is changing soon then I might hold off.
There's definitely stuff you can do to help out; if you want to give a shot at profiling we can figure out some of the places it might still be struggling. For instance, if you have a file that just includes:
import cProfile
pf = load("output_00007/info_00007.txt")
cProfile.run("pf.h.print_stats()", "output.cprof")
it should create an output.cprof that you can visualize using either the pstats module or pyprof2html ("pip install pyprof2html", although you may have to also pip install jinja) to see where the bottlenecks are. My guess is that calculating the hilbert indices is still a big portion, as that's what it was for me.
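In case it helps, the profile can also be inspected straight from the stdlib pstats module; here is a minimal sketch, with a dummy workload standing in for the yt load:

```python
import cProfile
import pstats

# profile any statement and dump the stats to disk, as in the snippet above;
# the summation here is just a stand-in for pf.h.print_stats()
cProfile.run("sum(i * i for i in range(100000))", "output.cprof")

# load the dump and print the ten most expensive calls by cumulative time
p = pstats.Stats("output.cprof")
p.sort_stats("cumulative").print_stats(10)
```

sort_stats also accepts "time" (own time, excluding subcalls), which is usually the more useful view when hunting for a hot inner loop.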
Actually, if there's any design documentation not in the code paper or not obviously on the website floating around then it might be useful to have a look at this. Also, another vaguely related query - how much can Python talk to compiled code (C++, etc)? To what extent can you pass Python objects to C++ routines, assuming you know what the public methods are?
Right now unfortunately the only design documentation is either in the paper, or ad hoc in the docs. I'm happy to work through some of that here, though, and it might be a good idea to consolidate this information on the wiki, too.
As for C++, it's actually very straightforward; we are using Cython (not Boost::Python, which I am told is quite nice but unfortunately carries with it the Boost overhead) and you can see how this is done primarily in yt/frontends/ramses/_ramses_reader.pyx, which uses headers found in yt/frontends/ramses/_ramses_headers , courtesy of Oliver Hahn.
The more I think about it, the more viable I think it is to move all the grid selection routines into the geometry class (although I am not committing myself to moving fundamental routines *outside* of the objects they relate to, just yet) and then using them as the fluid blob selection routines. I have to think about how to do this without relying on bounding box arguments, but it should be viable. I would like to work on this together -- perhaps after I am back from vacation we can try to start refactoring the necessary classes? The testing infrastructure is coming into place rapidly, so it would be feasible to try to do this.
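A rough sketch of the kind of bounding-box grid selection being discussed, in two stages (cheap box tests against grids, then per-cell masks); all names here are hypothetical, not yt's actual API:

```python
import numpy as np

def grids_intersecting_sphere(left_edges, right_edges, center, radius):
    """Stage 1: bounding-box test. left_edges/right_edges are (Ngrids, 3)
    arrays of grid extents; returns indices of grids touching the sphere."""
    # nearest point of each box to the sphere centre (broadcasts over grids)
    nearest = np.clip(center, left_edges, right_edges)
    dist2 = ((nearest - center) ** 2).sum(axis=1)
    return np.where(dist2 <= radius ** 2)[0]

def cell_mask(x, y, z, center, radius):
    """Stage 2: per-cell mask inside one selected grid."""
    d2 = (x - center[0])**2 + (y - center[1])**2 + (z - center[2])**2
    return d2 <= radius**2
```

Nothing here touches I/O: stage 1 decides which grids to read from disk, and stage 2 masks out the unwanted cells afterwards, which is the separation between geometry and I/O being aimed for.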
Thanks for your thoughts on this.
-Matt
Sam
Matthew Turk wrote:
> Hi Sam,
>
> On Tue, Jun 21, 2011 at 8:07 AM, Sam Geen wrote:
>
>> Hi,
>>
>> This is in reply to Matt's e-mail from 3 weeks ago (I only just realised I forgot to hit "confirm" on the yt-dev mailing list signup).
>
> No worries! :)
>
>> I guess one solution to the problem would be to abstract what a "grid" is (I'm guessing a grid is a container for a geometrically consistent chunk of the entire simulation volume?) Then allow it to answer queries about its geometric properties itself. So for example, ask it "myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the trick is to figure out a flexible but simple interface for this, depending on how well you know the requirements for what the grid should be able to do. In general, I think this is the ideal situation, because as Matt says hammering every code into the same structure in memory creates slowdowns. One possibility is to create a few template memory structures, etc, to allow people to bolt together new implementations for each code.
>
> A grid is indeed a container for a chunk of the simulation -- typically in patch-based AMR codes, these will be some (hopefully large but not too large) contiguous region. This enables numpy to take over, as it helps batch mathematical operations -- for instance, for an operation like:
>
>     field1*field2
>
> the startup cost of parsing, identifying the object as a buffer of contiguous data, identifying the types, dispatching the correct function, and then allocating and returning a new buffer is the startup cost against which the actual operation of multiplication is weighed. The batching of operations with grids nicely coincides with reducing the ratio of startup to operation cost.
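That startup-versus-operation trade-off can be demonstrated with a throwaway comparison (illustrative only) between one batched numpy multiply and many per-element operations:

```python
import numpy as np
import timeit

field1 = np.random.random(1_000_000)
field2 = np.random.random(1_000_000)

# one batched operation: the fixed dispatch/allocation cost is paid once
batched = timeit.timeit(lambda: field1 * field2, number=10)

# many tiny operations: the same fixed cost is paid on every element,
# even though this touches 100x fewer elements than the batched version
chunked = timeit.timeit(
    lambda: [field1[i] * field2[i] for i in range(10_000)], number=10)
```

On typical hardware the batched multiply of a million elements costs less than the per-element loop over ten thousand, which is why large contiguous grids suit numpy so well.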
> Right now, the mechanism for geometric constructs is inverted from how you describe -- when describing a sphere, for instance, the operation is:
>
> * Query the Hierarchy object (which I would support renaming to 'mesh' or 'geometry' in a future iteration of the code, likely 3.0) to identify grids that intersect the geometric region. This is accomplished through a "geometry mixin" that supplies various routines to do this.
> * Query each intersecting grid's x,y,z values (for each cell) to identify which cells intersect the region.
> * Return these values to the routine/user that requested them.
>
> I think that this is compatible with what you have outlined, in general. The issue I had hoped to avoid was to reduce the interaction between IO and geometry as much as possible, simply because IO routines are usually compiled code, whereas ideally I would like the geometry to be performed in Python. (As it stands it's usually done with operations on bounding values of grids to find intersections.)
>
>> In terms of choosing algorithms for different types of fluid blob (e.g. one for particles, one for grids), this can be done using functionoids for the algorithms (or at least functionoid wrappers) and then a functionoid factory for spawning the correct functionoid to use with the container. You'd have to wrap all this up in a simple interface again, otherwise it'd be impossible to use.
>>
>> I also suggested to Matt to create a "fluid blob" iterator that works for all types of fluid blob (SPH particle, octree grid cell, voronoi tessellation cell) but this might be very slow in Python. That said, iterating over "grid"s as chunks of the amr grid instead is a possibility. Having some kind of iterator option might be good, though, as doing things like tracking particles through different snapshots is something I've been doing extensively in my (pre-YT) work.
>
> A generalized fluid blob iterator would be interesting; I think into this the grids could be placed. By extending the geometry mixin to work with different methods, this could be feasible. I wonder if perhaps rethinking the idea of static geometries (determined at instantiation) would assist with addressing SPH data. I am inclined to think this would be a way forward. In looking over the code, it's not clear to me that there are many places that grids are assumed except in the projections and the first-pass of data selection.
>
> (Projections as we do them now might port nicely to SPH, but it's not yet clear to me.)
>
>> I don't know how much of this is already known; my domain is Ramses, which is still very slow to use with my dataset (although Matthew has been very helpful in working on the Ramses side of things). I thus haven't looked too much at YT yet as it's still prohibitively slow to load my dataset and play with it.
>
> I did manage to squeeze out what for me was an OOM improvement in RAMSES data instantiation, but I confess it is still slow. And there are other issues with it. Right now Casey and I are refactoring fields, and I have set up a testing infrastructure, so I am feeling a bit more inclined to try more invasive changes after branching into 2.3 and 3.0 branches sometime later this summer.
>
> Perhaps moving to a generalized geometry, into which the standard patch/block AMR "hierarchy" paradigm would fit, would meet the necessary needs to do generalized fluid operations...
>
> -Matt
>
>> Cheers,
>>
>> Sam
>>
>> On Tue, Jun 7, 2011 at 16:15, Matthew Turk wrote:
>>
>> Hi all,
>>
>> This is a portion of a conversation Sam Geen and I had off-list about where to make changes and how to insert abstractions to allow for generalized geometric reading of data; this would be useful for octree codes, particle codes, and non-rectilinear geometry. We decided to "replay" the conversation on the mailing list to allow people to contribute their ideas and thoughts. I spent a bit of time last night looking at the geometry usage in yt.
>>
>> Right now I see a few places this will need to be fixed:
>>
>> * Data sources operate on the idea that grids act as a pre-selection for cells. If we get the creation of grids -- without including any cell data inside them -- to be fast enough, this will not necessarily need to be changed. (i.e., apply a 'regridding' step of empty grids.) However, failing that, this will need to be abstracted into geometric selection. For cylindrical coordinates this will need to be abstracted anyway. The idea is that once you know which grids you want, you read them from disk, and then mask out the points that are not necessary.
>> * The IO is currently set up -- in parallel -- to read in chunks. Usually in parallel patch-based simulations, multiple grid patches are stored in a single file on disk. So, these get chunked in IO to avoid too many fopen/seek/fclose operations (and the analogues in hdf5.) This will need to be rethought. Obviously, there are still some analogues; however, it's not clear how -- without the actual re-gridding operation -- to keep the geometry selection and the IO separate. I would prefer to try to do this as much as possible. I think it's do-able, but I don't yet have a good strategy for it.
>>
>> My current feeling now is that the re-gridding may be a slightly necessary evil *at the moment*, but only for guiding the point selection. It's currently been re-written to be based on hilbert curve locating, so each grid has a unique index in L-8 or something space.
>>
>> I believe that geometry and chunking of IO are the only issues at this time. One possibility would actually be to move away from the idea of grids and instead to 'hilbert chunks'. So these would be the items that would be selected, read from disk, and mapped. This might fit nicer with the Ramses method.
>>
>> What do you think?
>>
>> Best,
>>
>> Matt
>>
>> _______________________________________________
>> Yt-dev mailing list
>> Yt-dev@lists.spacepope.org
>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
The times you're getting on the 9.3 gig set sound encouraging; it sounds like we're getting to the point where optimisations that stay within the Ramses frontend, without drastic changes elsewhere, are paying off. I'll download the latest release once you're happy with the documentation, etc, and have a play. It's not drastically urgent to have a lightning-fast Ramses implementation, but I'd be interested in working with people to create one in the future.
Also: I'm guessing that Ramses still doesn't have particle support? This is something that's quite important in my research so I could look at that if it would be useful.
Sam
________________________________________
From: yt-dev-bounces@lists.spacepope.org [yt-dev-bounces@lists.spacepope.org] on behalf of Matthew Turk [matthewturk@gmail.com]
Sent: 23 June 2011 00:35
To: yt-dev@lists.spacepope.org
Subject: Re: [Yt-dev] Geometry, RAMSES and non-patch datasets
Hi Sam,
We definitely appreciate your enthusiasm and interest. I think that
we have reached the point in growing the community, and the user base,
that it is time to revisit some design decisions. Part of that is the
idea of IO, of geometric selection, and of selecting fluid blobs --
grids, sections of the hilbert curve, particles and voronoi mesh items
-- to operate on. I'm personally very excited about this, even if a
bit hesitant (as I think is natural!) to be too invasive with the
changes. Again, though, this is important, particularly as we grow as
a community. The perspective you bring is definitely important, and I
am interested in making sure that we can find a way to move forward
together.
The results you show are very similar to what I was seeing. I have
just run a test on an output I found, output_0045, the provenance of
which I cannot recall. It's about 9.3 gigs. I ran profiling of dev
versus stable tip:
http://yt.enzotools.org/ramses_prof/
The all-told time for dev was 47 seconds or so to regrid and construct
the final grids in memory; the all-told time for stable was ~286
seconds. This isn't quite an OOM improvement, but it is something.
Unfortunately it looks like the dev tip won't be put out as a
"release" for a month and a half at least. That being said, I'm still
not terribly happy with 47 seconds just to parse and regrid a mesh.
Removing the hilbert indices call (which I think we could, by using
what's in the files, but I did not know how to access that
information) and rethinking the way get_array_indices_lists is used
(which stores the IO information for individual cells) could improve
the situation dramatically. For instance, if we were to remove the
storage of cell information within a grid object, and instead
recalculate on the fly individual cells inside the regridded objects,
we could probably speed this up by a factor of two or more.
Anyway, I'm going to try to work on this a bit next week at least
writing up how the code currently operates and adding comments as
necessary. I will ping the list as necessary. But I do very much
appreciate your ideas, and your experience, and I share your concerns
about how it currently operates. Despite the improvements, I think it
has a ways to go. I'll see what I can do about describing where I see
its operation and its shortcomings before I go on vacation.
Best,
Matt
On Wed, Jun 22, 2011 at 3:56 PM, Sam Geen
Also, in case it's useful, these are some profiler test results for one of our Ramses datasets (size 7.7GB), using the stable release YT code as of 6th June. If anyone is curious I can send you some more info (e.g. number of cells at each refinement level, etc).
Load (i.e. RAMSESStaticOutput): http://www-astro.physics.ox.ac.uk/~GeenS/testload/loadprof/
Density slice: http://www-astro.physics.ox.ac.uk/~GeenS/testload/densslice/
So my main issue is that the regridding (I think) is taking 30mins+, which needs an order of magnitude or two more speed-up before I can start using YT as an analysis framework full-time. If people think that this can be achieved with the current interface and without substantial changes beyond the Ramses frontend, then this would be great.
Sam
On 22/06/11 18:41, Sam Geen wrote:
Hi Jeff,
Thanks for your e-mail. One of the things I've been wanting to do with these e-mails is to determine what I can add to the project, rather than jumping in and adding something that either doesn't fit in with the structure of the rest of the code or that is replicating what someone else has already tried or implemented. Part of what I've said arose from discussions with Matthew about how best to optimise the Ramses frontend to remove a lot of the existing overheads, and a lot of what I've suggested is simply my way of figuring out my way around the problem.
I definitely agree that the best approach is to use the existing interface that YT has with its I/O frontends. The reason I was holding off from doing this is because Matthew suggested that he might be changing the way YT interfaces with its I/O routines, but I don't know how far away this would be. I am conscious that you guys have far more experience with these sorts of issues than I do, which is why I'm airing them here rather than trying them and finding out problems with them that someone else has already discovered.
The main issue I'm facing is how to deal with an octree-based code (Ramses) rather than a patch-based code (Enzo - at least this is my understanding of it), and I don't know to what extent these issues have been discussed or which ideas have been found to work/not work. If you have any suggestions for threads to follow from the yt-dev archive or similar design documents then that would be very much appreciated. Something like Flash is also octree-based, I guess, although it does have patches on some level (I guess an analogy is that Ramses has 2x2x2 patches rather than NxNxN).
Your suggestion that I play with the sample Enzo dataset is a good one, and sounds like a good way of becoming familiar with how YT operates. I'll try that before I jump in and start fiddling with the Ramses implementation. Like I said, I'm new to YT, so any and all suggestions are welcome. Equally, I'm keen to know what work already exists with the Ramses frontend, so that I don't tread on any toes or reinvent too many wheels. My main goal is to be able to analyse very large Ramses datasets efficiently, so I'm keen to work with the YT community to achieve this.
Cheers,
Sam
On 22/06/2011 18:06, j s oishi wrote:
Hi Sam,
I'm glad you're interested in helping to develop yt's Ramses facilities. yt is a community code, and we're always excited to embiggen that community. However, I think you might want to take a step back and look at what yt already is and does.
yt is highly optimized already for patch based AMR. We have thought through these concepts in significant detail, and over the past nearly 5 years we have refactored and optimized the loading of data many times. I don't know if this is your intention, but you seem to be suggesting that we have not thought about iterators and where the bottlenecks lie in our analysis to data pipeline. We did not find it necessary to jump to C often (we do not use C++ because of portability issues on some of the supercomputing centers we deploy on). Your emails seem to suggest you believe a rewrite of many core routines in C/C++ would be necessary. I think that the Ramses reader could be brought to speeds comparable to the enzo/orion readers by optimizing the approach already pursued by those yt frontends. Python has extremely powerful object orientation; the need for C++ templates is obviated by those features, and we have already demonstrated that a small amount of C can go a very long way for speed.
Again, I want to reiterate that we are excited that you are interested in contributing; I myself am working with a group using Ramses, and it would be great to have an optimized frontend. I just want to impress upon you that there is a lot of people in the yt community, and we have indeed thought many of these issues through. It might be helpful to take a look at how the code works (there is a sample enzo dataset in the distribution), and feel free to ask questions on the dev list, on the user list, and on #yt, if you're into the whole IRC thing.
sincerely,
Jeff Oishi
On Wed, Jun 22, 2011 at 8:02 AM, Sam Geen
wrote: So I guess a pattern for stepping through the iterators could go: - Step through the iterators in the normal Python way. - The next function checks to see whether we require extra resolution (somehow...) - If we do, it jumps to its children, if not it jumps to its neighbours. - Whether or not it needs to open a new file/grid is determined under-the-hood. If it's too slow to do this on a per-iterator basis, the overarching container class (whether that's the snapshot or the AMR grid or whatever) could jump through files/grids, which then give iterators to their own grid cells which follow the above pattern. I guess if this is slow then it could always be folded into C++, but I'd need to look more into Cython to see how easy it is to do, since the analysis algorithms will probably need to talk to the iterators. Like you say, numpy is quite fast at doing array operations, and I don't know if the iterator pattern fits into that, since iterators tend to operate individually. Maybe the C++ I/O could do this iterator pattern and then offer up arrays in memory to be operated on. Alternatively, the analysis algorithms could be chunked up into C++ routines, but I guess this makes it harder to write analysis routines. Then again, no reason YT can't have both, with analysis routines being refactored into C++ too. In general, iterators have been pretty successful as a way of coupling diverse container classes to diverse algorithms in C++, although I don't know how their performance scales in Python.
I'm happy to work in tandem on this. My timetable is that I'm moving into a new job around the end of July, so I might be a bit up-in-the-air during August, but can probably make time to look into this around then. I'm happy to do some C++ work, but my guess is that it's better to start with everything in Python and then figure out where the bottlenecks are with the profiling tools.
Sam
On 22/06/2011 15:44, Matthew Turk wrote:
Hi Sam,
On Wed, Jun 22, 2011 at 6:11 AM, Sam Geen
wrote: OK, cool. So is there anything that I can do to help with the Ramses instantiation? I could just play about with implemeting Pymses with the current I/O interface, but if you're already doing this then I'll wait to see what you come up with. Also, if the I/O interface is changing soon then I might hold off.
There's definitely stuff you can do to help out; if you want to give a shot at profiling we can figure out some of the places it might still be struggling. For instance, if you have a file that just includes:
import cProfile pf = load("output_00007/info_00007.txt") cProfile.run("pf.h.print_stats()", "output.cprof")
it should create an output.cprof that you can visualize either using the pstats module or the pyprof2html ("pip install pyprof2html", although you may have to also pip install jinja) to see where the bottlenecks are. My guess is that calculating the hilbert indices is still a big portion, as that's what it was for me.
Actually, if there's any design documentation not in the code paper or not obviously on the website floating around then it might be useful to have a look at this. Also, another vaguely related query - how much can Python talk to compiled code (C++, etc)? To what extent can you pass Python objects to C++ routines, assuming you know what the public methods are?
Right now unfortunately the only design documentation is either in the paper, or ad hoc in the docs. I'm happy to work through some of that here, though, and it might be a good idea to consolidate this information on the wiki, too.
As for C++, it's actually very straightforward; we are using Cython (not Boost::Python, which I am told is quite nice but unfortunately carries with it the Boost overhead) and you can see how this is done primarily in yt/frontends/ramses/_ramses_reader.pyx, which uses headers found in yt/frontends/ramses/_ramses_headers , courtesy of Oliver Hahn.
The more I think about it, the more viable I think it is to move all the grid selection routines into the geometry class (although I am not committing myself to moving fundamental routines *outside* of the objects they relate to, just yet) and then using them as the fluid blob selection routines. I have to think about how to do this without relying on bounding box arguments, but it should be viable. I would like to work on this together -- perhaps after I am back from vacation we can try to start refactoring the necessary classes? The testing infrastructure is coming into place rapidly, so it would be feasible to try to do this.
Thanks for your thoughts on this.
-Matt
Sam
Matthew Turk wrote: > > Hi Sam, > > On Tue, Jun 21, 2011 at 8:07 AM, Sam Geen
> wrote: > >> Hi, >> >> This is in reply to Matt's e-mail from 3 weeks ago (I only just >> realised >> I >> forgot to hit "confirm" on the yt-dev mailing list signup). >> > No worries! :) > > >> I guess one solution to the problem would be to abstract what a >> "grid" >> is >> (I'm guessing a grid is a container for a geometrically consistent >> chunk >> of >> the entire simulation volume?) Then allow it to answer queries about >> its >> geometric properties itself. So for example, ask it >> "myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the trick is >> to >> figure out a flexible but simple interface for this, depending on >> how >> well >> you know the requirements for what the grid should be able to do. In >> general, I think this is the ideal situation, because as Matt says >> hammering >> every code into the same structure in memory creates slowdowns. One >> possibility is to create a few template memory structures, etc, to >> allow >> people to bolt together new implementations for each code. >> > A grid is indeed a container for a chunk of the simulation -- > typically in patch-based AMR codes, these will be some (hopefully > large but not too large) contiguous region. This enables numpy to > take over, as it helps batch mathematical operations -- for instance, > for an operation like: > > field1*field2 > > the startup cost of parsing, identifying the object as a buffer of > contiguous data, identifying the types, dispatching the correct > function, and then allocating and returning a new buffer is the > startup cost against which the actual operation of multiplication is > weighed. The batching of operations with grids nicely coincides with > reducing the ratio between startup to operation cost. 
> > Right now, the mechanism for geometric constructs is inverted from > how > you describe -- when describing a sphere, for instance, the operation > is: > > * Query the Hierarchy object (which I would support renaming to > 'mesh' or 'geometry' in a future iteration of the code, likely 3.0) > to > identify grids that intersect the geometric region. This is > accomplished through a "geometry mixin" that supplies various > routines > to do this. > * Query each intersecting grid's x,y,z values (for each cell) to > identify which cells intersect the region. > * Return these values to the routine/user that requested them. > > I think that this is compatible with what you have outlined, in > general. The issue I had hoped to avoid was to reduce the > interaction > between IO and geometry as much as possible, simply because IO > routines are usually compiled code, whereas ideally I would like the > geometry to be performed in Python. (As it stands it's usually done > with operations on bounding values of grids to find intersections.) > > >> In terms of choosing algorithms for different types of fluid blob >> (e.g. >> one >> for particles, one for grids), this can be done using functionoids >> for >> the >> algorithms (or at least functionoid wrappers) and then a functionoid >> factory >> for spawning the correct functionoid to use with the container. >> You'd >> have >> to wrap all this up in a simple interface again, otherwise it'd be >> impossible to use. >> >> I also suggested to Matt to create a "fluid blob" iterator that >> works >> for >> all types of fluid blob (SPH particle, octree grid cell, voronoi >> tessellation cell) but this might be very slow in Python. That said, >> iterating over "grid"s as chunks of the amr grid instead is a >> possibility. >> Having some kind of iterator option might be good, though, as doing >> things >> like tracking particles through different snapshots is something >> I've >> been >> doing extensively in my (pre-YT) work. 
>>
> A generalized fluid blob iterator would be interesting; I think the
> grids could be placed into this. By extending the geometry mixin to
> work with different methods, this could be feasible. I wonder if
> perhaps rethinking the idea of static geometries (determined at
> instantiation) would assist with addressing SPH data. I am inclined
> to think this would be a way forward. In looking over the code, it's
> not clear to me that there are many places that grids are assumed
> except in the projections and the first-pass of data selection.
>
> (Projections as we do them now might port nicely to SPH, but it's not
> yet clear to me.)
>
>> I don't know how much of this is already known; my domain is Ramses,
>> which is still very slow to use with my dataset (although Matthew has
>> been very helpful in working on the Ramses side of things). I thus
>> haven't looked too much at YT yet as it's still prohibitively slow to
>> load my dataset and play with it.
>>
> I did manage to squeeze out what for me was an OOM improvement in
> RAMSES data instantiation, but I confess it is still slow. And there
> are other issues with it. Right now Casey and I are refactoring
> fields, and I have set up a testing infrastructure, so I am feeling a
> bit more inclined to try more invasive changes after branching into
> 2.3 and 3.0 branches sometime later this summer.
>
> Perhaps moving to a generalized geometry, into which the standard
> patch/block AMR "hierarchy" paradigm would fit, would meet the
> necessary needs to do generalized fluid operations...
>
> -Matt
>
>> Cheers,
>>
>> Sam
>>
>> On Tue, Jun 7, 2011 at 16:15, Matthew Turk wrote:
>>
>> Hi all,
>>
>> This is a portion of a conversation Sam Geen and I had off-list about
>> where to make changes and how to insert abstractions to allow for
>> generalized geometric reading of data; this would be useful for
>> octree codes, particle codes, and non-rectilinear geometry.
>> We decided to
>> "replay" the conversation on the mailing list to allow people to
>> contribute their ideas and thoughts. I spent a bit of time last night
>> looking at the geometry usage in yt.
>>
>> Right now I see a few places this will need to be fixed:
>>
>> * Data sources operate on the idea that grids act as a pre-selection
>> for cells. If we get the creation of grids -- without including any
>> cell data inside them -- to be fast enough, this will not necessarily
>> need to be changed. (i.e., apply a 'regridding' step of empty grids.)
>> However, failing that, this will need to be abstracted into geometric
>> selection. For cylindrical coordinates this will need to be
>> abstracted anyway. The idea is that once you know which grids you
>> want, you read them from disk, and then mask out the points that are
>> not necessary.
>> * The IO is currently set up -- in parallel -- to read in chunks.
>> Usually in parallel patch-based simulations, multiple grid patches
>> are stored in a single file on disk. So, these get chunked in IO to
>> avoid too many fopen/seek/fclose operations (and the analogues in
>> hdf5.) This will need to be rethought. Obviously, there are still
>> some analogues; however, it's not clear how -- without the actual
>> re-gridding operation -- to keep the geometry selection and the IO
>> separate. I would prefer to try to do this as much as possible. I
>> think it's do-able, but I don't yet have a good strategy for it.
>>
>> My current feeling now is that the re-gridding may be a slightly
>> necessary evil *at the moment*, but only for guiding the point
>> selection. It's currently been re-written to be based on hilbert
>> curve locating, so each grid has a unique index in L-8 or something
>> space.
>>
>> I believe that geometry and chunking of IO are the only issues at
>> this time. One possibility would actually be to move away from the
>> idea of grids and instead use 'hilbert chunks'.
>> So these would be the items
>> that would be selected, read from disk, and mapped. This might fit
>> nicer with the Ramses method.
>>
>> What do you think?
>>
>> Best,
>>
>> Matt
>>
>> _______________________________________________
>> Yt-dev mailing list
>> Yt-dev@lists.spacepope.org
>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
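The fopen/seek batching described above could be sketched as a grouping of grid records by file. The (filename, offset, grid_id) layout here is a hypothetical stand-in for however a given frontend stores its IO metadata, not yt's actual structures:

```python
from itertools import groupby

def chunk_grids_by_file(grids):
    # grids: iterable of (filename, byte_offset, grid_id) tuples, where
    # several patches typically share one file on disk. Sorting by file
    # and offset lets the reader do a single open/sequential-seek pass
    # per file instead of one fopen/seek/fclose per grid.
    ordered = sorted(grids, key=lambda g: (g[0], g[1]))
    for fname, group in groupby(ordered, key=lambda g: g[0]):
        yield fname, [g[2] for g in group]
```

The same grouping idea carries over to HDF5, where the analogue is minimizing file opens and dataset handle churn.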
Hi Sam,
Yup, it doesn't yet have particle support. That is definitely
something that would be great to have. For particles we actually have
a finer-grained IO system set up, so that bounding boxes that do not
coincide with grid edges can and should be fed into the particle
reader. What I think might be really useful would be both particle
counting and particle reading routines, so that we could do load
balancing operations as well as IO.
-Matt
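A minimal sketch of the counting/reading pair described above, assuming particle positions are already available as an (N, 3) array. The function names are illustrative, not yt's actual particle-reader API:

```python
import numpy as np

def count_particles(positions, left_edge, right_edge):
    # Hypothetical counting routine: how many particles fall inside a
    # bounding box that need not coincide with grid edges. Useful for
    # load balancing before any field data is read.
    mask = np.all((positions >= left_edge) & (positions < right_edge), axis=1)
    return int(mask.sum())

def read_particles(positions, fields, left_edge, right_edge):
    # Hypothetical reading routine: return only the particles inside
    # the bounding box, for each requested field.
    mask = np.all((positions >= left_edge) & (positions < right_edge), axis=1)
    return {name: data[mask] for name, data in fields.items()}
```

Counting first and reading second is what makes the load-balancing use case cheap: ranks can be assigned regions by particle count without paying for the field IO up front.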
On Wed, Jun 22, 2011 at 4:49 PM, Sam Geen
The times you're getting on the 9.3-gig set sound encouraging, and it sounds like incremental improvements to the Ramses code, without any drastic restructuring, are getting somewhere. I'll download the latest release once you're happy with the documentation, etc., and have a play. It's not drastically urgent to have a lightning-fast Ramses implementation, but I'd be interested in working with people to create one in the future.
Also: I'm guessing that Ramses still doesn't have particle support? This is something that's quite important in my research so I could look at that if it would be useful.
Sam

________________________________________
From: yt-dev-bounces@lists.spacepope.org [yt-dev-bounces@lists.spacepope.org] on behalf of Matthew Turk [matthewturk@gmail.com]
Sent: 23 June 2011 00:35
To: yt-dev@lists.spacepope.org
Subject: Re: [Yt-dev] Geometry, RAMSES and non-patch datasets
Hi Sam,
We definitely appreciate your enthusiasm and interest. I think that we have reached the point in growing the community, and the user base, that it is time to revisit some design decisions. Part of that is the idea of IO, of geometric selection, and of selecting fluid blobs -- grids, sections of the hilbert curve, particles and voronoi mesh items -- to operate on. I'm personally very excited about this, even if a bit hesitant (as I think is natural!) to be too invasive with the changes. Again, though, this is important, particularly as we grow as a community. The perspective you bring is definitely important, and I am interested in making sure that we can find a way to move forward together.
The results you show are very similar to what I was seeing. I have just run a test on an output I found, output_0045, the provenance of which I cannot recall. It's about 9.3 gigs. I ran profiling of dev versus stable tip:
http://yt.enzotools.org/ramses_prof/
The all-told time for dev was 47 seconds or so to regrid and construct the final grids in memory; the all-told time for stable was ~286 seconds. This isn't quite an OOM improvement, but it is something. Unfortunately it looks like the dev tip won't be put out as a "release" for a month and a half at least. That being said, I'm still not terribly happy with 47 seconds just to parse and regrid a mesh. Removing the hilbert indices call (which I think we could, by using what's in the files, but I did not know how to access that information) and rethinking the way get_array_indices_lists is used (which stores the IO information for individual cells) could improve the situation dramatically. For instance, if we were to remove the storage of cell information within a grid object, and instead recalculate on the fly individual cells inside the regridded objects, we could probably speed this up by a factor of two or more.
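Recalculating cell information on the fly, rather than storing it per grid, could look something like the following sketch. The `dds` name follows yt's convention for the per-axis cell width; the function itself is illustrative, not the code's actual routine:

```python
import numpy as np

def cell_centers(left_edge, dds, dims):
    # Recompute per-cell center coordinates from the grid's left edge
    # and cell width on demand, instead of carrying stored arrays of
    # per-cell IO information with each regridded object.
    axes = [left_edge[i] + dds[i] * (np.arange(dims[i]) + 0.5)
            for i in range(3)]
    return np.meshgrid(*axes, indexing="ij")
```

Since this is a handful of numpy operations per grid, the memory saved by dropping stored per-cell data comes at a recomputation cost that is small relative to IO.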
Anyway, I'm going to try to work on this a bit next week at least writing up how the code currently operates and adding comments as necessary. I will ping the list as necessary. But I do very much appreciate your ideas, and your experience, and I share your concerns about how it currently operates. Despite the improvements, I think it has a ways to go. I'll see what I can do about describing where I see its operation and its shortcomings before I go on vacation.
Best,
Matt
On Wed, Jun 22, 2011 at 3:56 PM, Sam Geen
wrote:

Also, if anyone is curious, these are some profiler test results for one of our Ramses datasets (size 7.7GB), using the stable release YT code as of 6th June. I can send some more info (e.g. number of cells at each refinement level, etc).
Load (i.e. RAMSESStaticOutput): http://www-astro.physics.ox.ac.uk/~GeenS/testload/loadprof/
Density slice: http://www-astro.physics.ox.ac.uk/~GeenS/testload/densslice/
So my main issue is that the regridding (I think) is taking 30mins+, which needs an order of magnitude or two more speed-up before I can start using YT as an analysis framework full-time. If people think this can be achieved with the current interface, and without substantial changes beyond the Ramses frontend, then that would be great.
Sam
On 22/06/11 18:41, Sam Geen wrote:
Hi Jeff,
Thanks for your e-mail. One of the things I've been wanting to do with these e-mails is to determine what I can add to the project, rather than jumping in and adding something that either doesn't fit in with the structure of the rest of the code or that is replicating what someone else has already tried or implemented. Part of what I've said arose from discussions with Matthew about how best to optimise the Ramses frontend to remove a lot of the existing overheads, and a lot of what I've suggested is simply my way of figuring out my way around the problem.
I definitely agree that the best approach is to use the existing interface that YT has with its I/O frontends. The reason I was holding off from doing this is because Matthew suggested that he might be changing the way YT interfaces with its I/O routines, but I don't know how far away this would be. I am conscious that you guys have far more experience with these sorts of issues than I do, which is why I'm airing them here rather than trying them and finding out problems with them that someone else has already discovered.
The main issue I'm facing is how to deal with an octree-based code (Ramses) rather than a patch-based code (Enzo - at least this is my understanding of it), and I don't know to what extent these issues have been discussed or which ideas have been found to work/not work. If you have any suggestions for threads to follow from the yt-dev archive or similar design documents then that would be very much appreciated. Something like Flash is also octree-based, I guess, although it does have patches on some level (I guess an analogy is that Ramses has 2x2x2 patches rather than NxNxN).
Your suggestion that I play with the sample Enzo dataset is a good one, and sounds like a good way of becoming familiar with how YT operates. I'll try that before I jump in and start fiddling with the Ramses implementation. Like I said, I'm new to YT, so any and all suggestions are welcome. Equally, I'm keen to know what work already exists with the Ramses frontend, so that I don't tread on any toes or reinvent too many wheels. My main goal is to be able to analyse very large Ramses datasets efficiently, so I'm keen to work with the YT community to achieve this.
Cheers,
Sam
On 22/06/2011 18:06, j s oishi wrote:
Hi Sam,
I'm glad you're interested in helping to develop yt's Ramses facilities. yt is a community code, and we're always excited to embiggen that community. However, I think you might want to take a step back and look at what yt already is and does.
yt is highly optimized already for patch based AMR. We have thought through these concepts in significant detail, and over the past nearly 5 years we have refactored and optimized the loading of data many times. I don't know if this is your intention, but you seem to be suggesting that we have not thought about iterators and where the bottlenecks lie in our data-to-analysis pipeline. We did not find it necessary to jump to C often (we do not use C++ because of portability issues on some of the supercomputing centers we deploy on). Your emails seem to suggest you believe a rewrite of many core routines in C/C++ would be necessary. I think that the Ramses reader could be brought to speeds comparable to the enzo/orion readers by optimizing the approach already pursued by those yt frontends. Python has extremely powerful object orientation; the need for C++ templates is obviated by those features, and we have already demonstrated that a small amount of C can go a very long way for speed.
Again, I want to reiterate that we are excited that you are interested in contributing; I myself am working with a group using Ramses, and it would be great to have an optimized frontend. I just want to impress upon you that there are a lot of people in the yt community, and we have indeed thought many of these issues through. It might be helpful to take a look at how the code works (there is a sample enzo dataset in the distribution), and feel free to ask questions on the dev list, on the user list, and on #yt, if you're into the whole IRC thing.
sincerely,
Jeff Oishi
On Wed, Jun 22, 2011 at 8:02 AM, Sam Geen
wrote:

So I guess a pattern for stepping through the iterators could go:

- Step through the iterators in the normal Python way.
- The next function checks to see whether we require extra resolution (somehow...)
- If we do, it jumps to its children; if not, it jumps to its neighbours.
- Whether or not it needs to open a new file/grid is determined under-the-hood.

If it's too slow to do this on a per-iterator basis, the overarching container class (whether that's the snapshot or the AMR grid or whatever) could jump through files/grids, which then give iterators to their own grid cells that follow the above pattern. I guess if this is slow then it could always be folded into C++, but I'd need to look more into Cython to see how easy it is to do, since the analysis algorithms will probably need to talk to the iterators. Like you say, numpy is quite fast at doing array operations, and I don't know if the iterator pattern fits into that, since iterators tend to operate individually. Maybe the C++ I/O could do this iterator pattern and then offer up arrays in memory to be operated on. Alternatively, the analysis algorithms could be chunked up into C++ routines, but I guess this makes it harder to write analysis routines. Then again, no reason YT can't have both, with analysis routines being refactored into C++ too. In general, iterators have been pretty successful as a way of coupling diverse container classes to diverse algorithms in C++, although I don't know how their performance scales in Python.
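That stepping pattern could be sketched in pure Python as a generator; the `Cell` class here is a hypothetical stand-in for an octree node, not anything from yt or Ramses:

```python
class Cell:
    # Minimal hypothetical octree node: a refinement level, a payload,
    # and zero or more child cells.
    def __init__(self, level, data=None, children=()):
        self.level = level
        self.data = data
        self.children = list(children)

def iter_cells(cell, needs_refinement):
    # Depth-first walk: descend into children wherever the predicate
    # asks for more resolution; otherwise yield this cell, after which
    # the enclosing loop naturally moves on to its siblings/neighbours.
    if cell.children and needs_refinement(cell):
        for child in cell.children:
            yield from iter_cells(child, needs_refinement)
    else:
        yield cell
```

Generator overhead per yielded cell is the performance concern raised above; batching the yielded cells into numpy arrays before operating on them is one way to amortize it.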
I'm happy to work in tandem on this. My timetable is that I'm moving into a new job around the end of July, so I might be a bit up-in-the-air during August, but can probably make time to look into this around then. I'm happy to do some C++ work, but my guess is that it's better to start with everything in Python and then figure out where the bottlenecks are with the profiling tools.
Sam
On 22/06/2011 15:44, Matthew Turk wrote:
Hi Sam,
On Wed, Jun 22, 2011 at 6:11 AM, Sam Geen
wrote:

> OK, cool. So is there anything that I can do to help with the Ramses
> instantiation? I could just play about with implementing Pymses with
> the current I/O interface, but if you're already doing this then I'll
> wait to see what you come up with. Also, if the I/O interface is
> changing soon then I might hold off.

There's definitely stuff you can do to help out; if you want to give a shot at profiling we can figure out some of the places it might still be struggling. For instance, if you have a file that just includes:
import cProfile
from yt.mods import *  # yt 2.x convenience namespace that provides load()

pf = load("output_00007/info_00007.txt")
cProfile.run("pf.h.print_stats()", "output.cprof")
it should create an output.cprof that you can visualize either using the pstats module or the pyprof2html ("pip install pyprof2html", although you may have to also pip install jinja) to see where the bottlenecks are. My guess is that calculating the hilbert indices is still a big portion, as that's what it was for me.
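If pyprof2html is unavailable, the stdlib pstats module can inspect the same output directly; for example (profiling a trivial expression here just to have a file to look at):

```python
import cProfile
import pstats

# Profile something cheap, writing the stats to a file, then load it
# with pstats and print the five most expensive calls by cumulative
# time -- the same workflow applies to the output.cprof above.
cProfile.run("sum(range(100000))", "output.cprof")
stats = pstats.Stats("output.cprof")
stats.sort_stats("cumulative").print_stats(5)
```

Sorting by "cumulative" surfaces the high-level calls (like hilbert index computation) under which the time is actually spent.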
> Actually, if there's any design documentation not in the code paper or
> not obviously on the website floating around then it might be useful
> to have a look at this. Also, another vaguely related query - how much
> can Python talk to compiled code (C++, etc)? To what extent can you
> pass Python objects to C++ routines, assuming you know what the public
> methods are?
Right now unfortunately the only design documentation is either in the paper, or ad hoc in the docs. I'm happy to work through some of that here, though, and it might be a good idea to consolidate this information on the wiki, too.
As for C++, it's actually very straightforward; we are using Cython (not Boost::Python, which I am told is quite nice but unfortunately carries with it the Boost overhead) and you can see how this is done primarily in yt/frontends/ramses/_ramses_reader.pyx, which uses headers found in yt/frontends/ramses/_ramses_headers , courtesy of Oliver Hahn.
The more I think about it, the more viable I think it is to move all the grid selection routines into the geometry class (although I am not committing myself to moving fundamental routines *outside* of the objects they relate to, just yet) and then using them as the fluid blob selection routines. I have to think about how to do this without relying on bounding box arguments, but it should be viable. I would like to work on this together -- perhaps after I am back from vacation we can try to start refactoring the necessary classes? The testing infrastructure is coming into place rapidly, so it would be feasible to try to do this.
Thanks for your thoughts on this.
-Matt
participants (3)
- j s oishi
- Matthew Turk
- Sam Geen