Mailman 3 June 2011 - yt-dev

not_in_all
by david collins 27 Jun '11

Hi--

I have a (probably stupid) problem. I have a field that I'm writing out to some grids. The field is called 'AvgElec0', and only exists on level>0 grids (non-root-grids). I've defined this field:

    def _AEx(field, data):
        return data['AvgElec0'][:,:-1,:-1]
    add_field('AEx', function=_AEx, validators=[ValidateSpatial(0)],
              take_log=False, not_in_all=True)

(the slice is for the centering of the field).

When I do something like

    pf.h.grids[1]['AEx']

I get a key error, "AvgElec0," even though, on double checking, the field is in fact in that grid. If I change the code so it's written on all levels, the same pf.h.grids[1]['AEx'] works fine, as one would expect.

Has the not_in_all behavior changed? Might I be doing something stupid?

Thanks,
d.

--
Sent from my computer.
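As an illustration of the situation described in this post, here is a minimal pure-Python sketch of a derived field whose base field exists only on some grids. It is not yt's implementation: the `Grid` class, the `derived_fields` registry and the `try_field` helper are hypothetical names made up for this example, and the data is 1D for brevity. It shows one way a `KeyError` naming the base field can surface when the derived field is evaluated on a grid that lacks it; whether that is the cause of the error reported above is not established in the thread.

```python
# Hypothetical stand-ins (not yt classes): a grid is a level plus a dict of
# fields that were actually written to disk for that grid.
derived_fields = {}


class Grid:
    def __init__(self, level, fields):
        self.level = level
        self.fields = fields  # field name -> list of values

    def __getitem__(self, name):
        if name in self.fields:
            return self.fields[name]
        if name in derived_fields:
            # Derived fields are computed on demand from other fields.
            return derived_fields[name](self)
        raise KeyError(name)


def _aex(grid):
    # Drop the last element, loosely mimicking the centering slice (1D here).
    return grid["AvgElec0"][:-1]


derived_fields["AEx"] = _aex

# The base field exists only on the level-1 grid, as in the post.
root = Grid(0, {"Density": [1.0, 2.0, 3.0]})
fine = Grid(1, {"Density": [1.0, 2.0], "AvgElec0": [0.1, 0.2, 0.3]})


def try_field(grid, name):
    """Return the field, or a string describing the KeyError that was raised."""
    try:
        return grid[name]
    except KeyError as err:
        return "KeyError: %s" % err
```

With these definitions, `try_field(fine, "AEx")` succeeds, while `try_field(root, "AEx")` reports a key error for the base field rather than for `AEx` itself.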

Re: [Yt-dev] Geometry, RAMSES and non-patch datasets
by Sam Geen 22 Jun '11

Hi,

This is in reply to Matt's e-mail from 3 weeks ago (I only just realised I forgot to hit "confirm" on the yt-dev mailing list signup).

I guess one solution to the problem would be to abstract what a "grid" is (I'm guessing a grid is a container for a geometrically consistent chunk of the entire simulation volume?) and then allow it to answer queries about its geometric properties itself. So, for example, you could ask it "myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the trick is to figure out a flexible but simple interface for this, depending on how well you know the requirements for what the grid should be able to do.

In general, I think this is the ideal situation, because as Matt says hammering every code into the same structure in memory creates slowdowns. One possibility is to create a few template memory structures, etc, to allow people to bolt together new implementations for each code.

In terms of choosing algorithms for different types of fluid blob (e.g. one for particles, one for grids), this can be done using functionoids for the algorithms (or at least functionoid wrappers) and then a functionoid factory for spawning the correct functionoid to use with the container. You'd have to wrap all this up in a simple interface again, otherwise it'd be impossible to use.

I also suggested to Matt to create a "fluid blob" iterator that works for all types of fluid blob (SPH particle, octree grid cell, Voronoi tessellation cell), but this might be very slow in Python. That said, iterating over "grid"s as chunks of the AMR grid instead is a possibility. Having some kind of iterator option might be good, though, as doing things like tracking particles through different snapshots is something I've been doing extensively in my (pre-YT) work.

I don't know how much of this is already known; my domain is Ramses, which is still very slow to use with my dataset (although Matthew has been very helpful in working on the Ramses side of things). I thus haven't looked too much at YT yet as it's still prohibitively slow to load my dataset and play with it.

Cheers,
Sam
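The interface idea in Sam's reply (a container that answers geometric queries about itself, plus a factory that picks the right "functionoid" for each container type) could be sketched roughly as follows. This is only an illustration under stated assumptions: `Region`, `Chunk`, `PatchChunk`, `ParticleChunk`, the two counter classes and `make_counter` are all hypothetical names invented here, none of them exist in yt, and the only region type handled is an axis-aligned box.

```python
from abc import ABC, abstractmethod


class Region:
    """Axis-aligned box: the simplest possible 'geometric construct'."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


class Chunk(ABC):
    """Hypothetical container for a geometrically consistent piece of the volume."""

    @abstractmethod
    def is_in_region(self, region):
        """Return True if any part of this chunk lies inside the region."""


class PatchChunk(Chunk):
    def __init__(self, left, right, dims):
        self.left, self.right, self.dims = left, right, dims

    def is_in_region(self, region):
        # Two boxes overlap only if their extents overlap along every axis.
        return all(
            l < rr and r > rl
            for l, r, rl, rr in zip(self.left, self.right, region.left, region.right)
        )


class ParticleChunk(Chunk):
    def __init__(self, positions):
        self.positions = positions

    def is_in_region(self, region):
        # The chunk is selected if at least one particle falls inside the box.
        return any(
            all(lo <= x < hi for x, lo, hi in zip(p, region.left, region.right))
            for p in self.positions
        )


class CountPatchCells:
    """Functionoid (callable object) that counts the cells in a patch."""

    def __call__(self, chunk):
        n = 1
        for d in chunk.dims:
            n *= d
        return n


class CountParticles:
    """Functionoid that counts the particles in a particle chunk."""

    def __call__(self, chunk):
        return len(chunk.positions)


# Factory: map each container type to the functionoid that handles it.
_COUNTERS = {PatchChunk: CountPatchCells, ParticleChunk: CountParticles}


def make_counter(chunk):
    return _COUNTERS[type(chunk)]()
```

Calling code then never needs to know which kind of chunk it holds: `make_counter(chunk)(chunk)` and `chunk.is_in_region(region)` work for both container types.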

2.2 release, new website, vacation
by Matthew Turk 22 Jun '11

Hi all,

I'm writing with three points.

1) We had hoped for a release by the end of June. There are three outstanding tickets [ http://hg.enzotools.org/yt/issues?status=new&status=open&milestone=2.2 ]:

 * Document reason
 * Fix units for Nyx
 * Fix field of view for reason slice/proj widget

I'm not going to be able to look at these before the end of June. The middle one might get pushed off, as it's blocked on some other developments. So anybody that wants to step up and handle either the field of view or the reason documentation, please do so. Anybody have any thoughts on this?

2) A few of us have created a prototype of a new website:

http://yt.enzotools.org/prototype/

to replace the old one. The style was purchased from themeforest.net. I am not yet convinced that moving to a fancier layout is a good idea, because I worry that if it is too fancy it looks unserious. I can't tell if this site crosses that line. I do like how it lays out the community, development, how to get it, and examples. (The examples, I think, are particularly helpful.) +1/-1?

The idea was to dump this at the end of the month when 2.2 came out, but if 2.2 is delayed, then this will not happen. Also, I'd be happy to grant write access to the page repo for this; it's private right now because I don't know how the copyright from themeforest works with publicly-viewable repositories, although in theory it's not any additional information that couldn't be gotten from the packages delivered when viewing the page. The images I selected are ... biased towards ones I already had. :) So that might be a place to improve it, among other things ...

3) I am going on vacation starting sometime between June 29 and July 3, and I will be gone until at least July 23rd. During that time I will have very, very limited access to email, and I do not anticipate being able to reply to any yt-users emails or make any software changes. I will be able to reply, in a very limited fashion and with high latency, to direct email inquiries.

Thanks,
Matt

A Mission Statement for yt
by Matthew Turk 19 Jun '11

Hi everyone,

I hope you'll take the opportunity to read and respond to this email, even if you're not a heavy-developer, or even a heavy-user, of yt. Your feedback and contributions would be greatly, greatly appreciated, particularly as this will help guide where yt development, community-building and (optimistically) use will go. I know that sometimes the signal-to-noise on the yt lists can be a bit low, but I think this is a particularly useful discussion to have.

A few of us have been brainstorming, in person, in IRC, etc. about the direction yt has been going. There are a number of reasons for doing this -- to provide focus, to provide an idea of the off-in-the-distance goal, and to have a public statement of what we're about, which shows ambition, concern for the values that go into a scientific code, and an interest in providing access to that code.

This boils down to coming up with a mission statement, which will help both focus our goals on what we want to provide, as well as describe those areas we do not want to provide. Much of this is based on the contents of “The Art of Community” by Jono Bacon, specifically around page 71 in the PDF available on www.artofcommunityonline.org/get/ .

“Mission statements are intended to be consistent and should rarely change, even if the tasks that achieve that mission change regularly. When building your mission statement, always have its longevity in mind. Remember, your mission statement is your slam-dunking, audacious goal. For many communities these missions can take decades or even longer to achieve. Their purpose is to not only describe the finish line, but to help the community stay on track.”

To develop a mission statement, which will act as a precursor to a strategic plan, we need to construct answers to three questions. These will provide the initial basis for a broader mission statement. For reference, here are some “principles” we came up with several years ago:

http://yt.enzotools.org/principles.html

As I mentioned above, a few of us have been spitballing answers to these questions, and it has reached the point where we really need to bring this forward, to conduct these discussions in public, to bring some clarity and engagement to the process. Ultimately, once we have sketched out a couple broad goals and bullet points, this can then be distilled into a short, pithy block of text that serves as a "Mission Statement." Below are some potential bullet points, but I feel strongly that it's important that these get refined and discussed.

= What is the mission? =

 * To create a fun, community-led, open source tool for asking and answering astrophysical questions through simulations, analysis and visualization
 * To create reproducible, cross-code questions and answers from astrophysical data
 * To construct a consistent language for asking questions of simulation data from many sources
 * To encourage researchers to participate in constructing a community code

= What are the opportunities and areas of collaboration? =

 * Development of new tools, new techniques, and adding support for new codes.
 * Adding components to the GUI
 * Providing outreach-capable frontends
 * Improving visualization qualities
 * Adding new methods of accessing data
 * Performance analysis & optimization
 * Deployment to new platforms
 * Designing new web pages
 * Writing documentation and recipes
 * Spreading the word
 * Support for Cartesian non-astrophysical simulations (weather, earthquakes)
 * Extension to non-Cartesian coordinate systems
 * Mentoring new developers

= What are the skills required? =

 * Thoughtful process
 * Careful quality control
 * Ability to communicate
 * An investment in “the answer”
 * Eagerness to participate in an open fashion

What other bullets, ideas, inclinations do people have? If we can start a discussion, maybe we can draft some text. This would certainly help with focusing our strategies for presenting yt to others, directing our development in conjunction with our scientific goals, and collaborating as a community.

Thanks very much for any thoughts,
Matt

Re: [Yt-dev] A Mission Statement for yt
by chris.m.malone＠gmail.com 13 Jun '11

Hi Matt,

Maybe this goes a bit beyond yt, but one of the main feelings I get from yt is a strong urge to promote open science in general. I think this might fall under the "areas of collaboration", but if researchers can share their simulation/analysis/visualization scripts and/or data then there is a feeling of open and reproducible research. I think yt has started down this path with public storage and git repos that can be openly branched and shared, etc.

One area where this sort of thing might be out of the scope of yt is the microphysics used in simulations - there are very few, to my knowledge, public repos of conductivity, EOS, opacity, etc. with which to promote open science.

Just my $0.02

Chris

Geometry, RAMSES and non-patch datasets
by Matthew Turk 07 Jun '11

Hi all,

This is a portion of a conversation Sam Geen and I had off-list about where to make changes and how to insert abstractions to allow for generalized geometric reading of data; this would be useful for octree codes, particle codes, and non-rectilinear geometry. We decided to "replay" the conversation on the mailing list to allow people to contribute their ideas and thoughts.

I spent a bit of time last night looking at the geometry usage in yt. Right now I see a few places this will need to be fixed:

 * Data sources operate on the idea that grids act as a pre-selection for cells. If we get the creation of grids -- without including any cell data inside them -- to be fast enough, this will not necessarily need to be changed. (i.e., apply a 'regridding' step of empty grids.) However, failing that, this will need to be abstracted into geometric selection. For cylindrical coordinates this will need to be abstracted anyway. The idea is that once you know which grids you want, you read them from disk, and then mask out the points that are not necessary.
 * The IO is currently set up -- in parallel -- to read in chunks. Usually in parallel patch-based simulations, multiple grid patches are stored in a single file on disk. So, these get chunked in IO to avoid too many fopen/seek/fclose operations (and the analogues in hdf5.) This will need to be rethought. Obviously, there are still some analogues; however, it's not clear how -- without the actual re-gridding operation -- to keep the geometry selection and the IO separate. I would prefer to try to do this as much as possible. I think it's do-able, but I don't yet have a good strategy for it.

My current feeling now is that the re-gridding may be a slightly necessary evil *at the moment*, but only for guiding the point selection. It's currently been re-written to be based on hilbert curve locating, so each grid has a unique index in L-8 or something space.

I believe that geometry and chunking of IO are the only issues at this time. One possibility would actually be to move away from the idea of grids and instead move to 'hilbert chunks'. So these would be the items that would be selected, read from disk, and mapped. This might fit nicer with the Ramses method.

What do you think?

Best,
Matt
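The select, read and mask flow described in this message can be sketched in a few lines of plain Python. This is an illustration only, not yt code: the grid records, the file names and the helper functions `select_grids`, `group_by_file` and `cell_mask` are hypothetical, and the geometry is 1D to keep the example short. Grids are first pre-selected by their bounding extents, the selected grids are grouped by file so each file would only need to be opened once, and finally a per-cell mask discards cells whose centers fall outside the region.

```python
from collections import defaultdict

# Hypothetical grid metadata: no cell data, just extents and where the data lives.
grids = [
    {"id": 0, "file": "data.cpu0000", "left": 0.0, "right": 0.5, "n": 4},
    {"id": 1, "file": "data.cpu0000", "left": 0.5, "right": 1.0, "n": 4},
    {"id": 2, "file": "data.cpu0001", "left": 0.25, "right": 0.5, "n": 8},
]


def select_grids(grids, lo, hi):
    """Pre-selection: keep grids whose extent overlaps the region [lo, hi)."""
    return [g for g in grids if g["left"] < hi and g["right"] > lo]


def group_by_file(selected):
    """IO chunking: collect grid ids per file so each file is visited once."""
    by_file = defaultdict(list)
    for g in selected:
        by_file[g["file"]].append(g["id"])
    return dict(by_file)


def cell_mask(grid, lo, hi):
    """Masking: True for cells whose center lies inside the region."""
    dx = (grid["right"] - grid["left"]) / grid["n"]
    centers = [grid["left"] + (i + 0.5) * dx for i in range(grid["n"])]
    return [lo <= c < hi for c in centers]


selected = select_grids(grids, 0.3, 0.45)
chunks = group_by_file(selected)
masks = {g["id"]: cell_mask(g, 0.3, 0.45) for g in selected}
```

Only the first step depends on the grids being rectangular patches; swapping `select_grids` for a different geometric test is the kind of abstraction the message is asking about.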

Ramses cell counting and hierarchy
by Matthew Turk 07 Jun '11

Hi Oliver and others,

I've spent a bit of time working on the Ramses reader. In the test cases I have, it's now going ~2.6x faster than it was this morning. But it still takes a long time. Right now the top routines are, all in the ramses frontend directory:

    _count_grids
    recursive_patch_splitting
    fill_hierarchy_arrays
    count_zones

The last one, I had some questions about. It counts up the number of unique zones on each level, which seems to take a lot longer than I think it ought to. I do this by iterating over all the RAMSES_tree objects, then over each level in the tree, then over each cell and incrementing a counter if cell.get_domain() == idomain. The code looks something like this, but keep in mind this is Cython:

    cdef np.ndarray[np.int64_t, ndim=1] cell_count
    cell_count = np.zeros(self.rsnap.m_header.levelmax + 1, 'int64')
    cdef int local_count = 0
    for idomain in range(1, self.rsnap.m_header.ncpu + 1):
        local_tree = new RAMSES_tree(deref(self.rsnap), idomain,
                                     self.rsnap.m_header.levelmax, 0)
        local_tree.read()
        for ilevel in range(local_tree.m_maxlevel + 1):
            local_count = 0
            local_level = &local_tree.m_AMR_levels[ilevel]
            grid_it = local_tree.begin(ilevel)
            grid_end = local_tree.end(ilevel)
            while grid_it != grid_end:
                local_count += (grid_it.get_domain() == idomain)
                grid_it.next()
            cell_count[ilevel] += local_count
        del local_tree

Ultimately, this takes up a LOT of time. I was wondering if there was a simpler way of figuring out, simply, how many cells are local to a given domain and a given level. All I need back is an array that includes the *total* unique cells on every level, summed across domains. Any ideas?

(Also, I've gotten a 2.6x speedup; hoping for more later today. Then after the regridding is fast, I will attack IO.)

Thanks,
Matt
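The counting logic in the Cython fragment above can be restated as a small pure-Python sketch. The data layout here is a hypothetical stand-in for the RAMSES tree objects: each domain's tree is represented as a list of levels, and each level as a list holding the owning domain id of every cell that tree contains. The function name `count_cells_per_level` is likewise made up for this example. Filtering on `owner == idomain` mirrors the `get_domain() == idomain` test in the message, so a cell that appears in more than one tree is only counted in the tree of the domain that owns it.

```python
def count_cells_per_level(trees, levelmax):
    """Total number of unique cells on each level, summed across domains.

    trees: dict mapping domain id -> list of levels, where each level is a
    list of owner-domain ids, one per cell present in that domain's tree.
    """
    cell_count = [0] * (levelmax + 1)
    for idomain, levels in trees.items():
        for ilevel, owners in enumerate(levels):
            # Count only the cells this domain owns; skip copies of cells
            # that belong to other domains.
            cell_count[ilevel] += sum(1 for owner in owners if owner == idomain)
    return cell_count


# Two domains, two levels; some cells in each tree belong to the other domain.
trees = {
    1: [[1, 1, 2], [1, 2, 2, 1]],
    2: [[2, 1], [2, 2, 1]],
}
counts = count_cells_per_level(trees, levelmax=1)
```

The sketch makes the cost visible: every cell of every tree is visited once, so the running time is proportional to the total number of cells read, including the non-local ones.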

QuadTree projection now works in parallel
by Matthew Turk 06 Jun '11

Hi everyone,

I've just pushed some changes to the quad tree projection that should parallelize it automatically.

The old-style of projecting requires parallelization through a spatial decomposition in the 2D plane of the image. This results in two problems -- very poor load balancing in the current scheme and the inability to utilize this operation in situ, as it requires passing data around in a manner different from the simulation code's load balancing scheme. Furthermore, it can be slow.

About a year ago I implemented a quadtree projection mechanism that was about an order of magnitude faster for big datasets. Unfortunately, because of the more complicated nature of the data structures, I never parallelized it. This last week I figured out how to do this, and then implemented this parallelization in the quad_proj object in yt.

I've tested it and it gives very good results for both memory and speed; it's about an order of magnitude faster than the old-style projection for my datasets, and I have been unable to get it to scale since the time-to-completion is so low. It would be great if some other people could test it, to see how well it scales for them. It should perform best in parallel where the spatial-decomposition will give poor results -- this is often with deeply nested hierarchies, or with refinement regions that do not cover the entire box. Additionally, if you are interested in testing it in situ, this is a good idea as well.

To use it, you can simply do:

    pf.h.proj = pf.h.quad_proj

and do the normal PlotCollection, lightcone, etc etc operations, or you can manually create quad_proj objects:

    qp = pf.h.quad_proj(0, "Density")

(for instance) and examine those and the time for those.

I would like to replace the old-style projection with this for the 2.2 release, if we can go back and forth and make sure it is up to snuff, so your testing is GREATLY appreciated to avoid any hiccups along the way.

Thanks,
Matt
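One way to see why a quadtree-style projection needs no spatial decomposition of the image plane is the following toy sketch. It is an assumption-laden illustration and not the quad_proj implementation: the tree is flattened into a dictionary keyed by `(level, i, j)`, and the functions `project`, `merge` and `pixel_value` are hypothetical names. Each process accumulates value times path length for whatever cells it happens to hold, partial trees from different processes are combined by adding matching nodes, and a pixel's projected value is the sum of its own node and all of its ancestors.

```python
from collections import defaultdict


def project(cells):
    """Build a partial projection from cells given as (level, i, j, value, dl)."""
    tree = defaultdict(float)
    for level, i, j, value, dl in cells:
        # Integrate along the line of sight: value times path length.
        tree[(level, i, j)] += value * dl
    return tree


def merge(a, b):
    """Combine two partial projections by summing matching nodes."""
    out = defaultdict(float, a)
    for key, val in b.items():
        out[key] += val
    return out


def pixel_value(tree, level, i, j):
    """Sum the node at (level, i, j) and every coarser node that covers it."""
    total = 0.0
    while level >= 0:
        total += tree.get((level, i, j), 0.0)
        level, i, j = level - 1, i // 2, j // 2
    return total


# Process A holds one coarse cell; process B holds two fine cells stacked
# along the line of sight in the same image pixel.
part_a = project([(0, 0, 0, 2.0, 0.5)])
part_b = project([(1, 1, 0, 4.0, 0.25), (1, 1, 0, 4.0, 0.25)])
merged = merge(part_a, part_b)
```

Because `merge` is a plain element-wise sum, the result does not depend on which process held which cells or on the order in which partial trees are combined, so the data can stay wherever the simulation's own load balancing put it.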

Bitbucket Ticket conversion
by Matthew Turk 02 Jun '11

Hi all,

I've created a test repository with the tickets from the old Trac installation imported. Here are the issues:

https://bitbucket.org/MatthewTurk/yt-ticket-conversion/issues

You can browse around, see the components, etc, and filter as well. This is all the issues. The conversion is a bit lossy, in that old comments and whatnot had to be included in the main content of the issue. And I believe most of the InterTrac links have been broken. But otherwise I think I'm mostly happy with it.

Can I get +1/-1 on the current conversion process? I'll migrate the tickets once we're settled on how to convert them. After tickets have been migrated, I will probably close down Trac.

I'd also like to start encouraging people to report bugs on BB, possibly even through a "yt bugreport" command. It might also be a good idea to have new ticket reports echoed to yt-dev. Thoughts on these?

Best,
Matt

Re: [Yt-dev] QuadTree projection now works in parallel
by Matthew Turk 02 Jun '11

Hi Stephen,

Yes, this has been a long-standing thing that's bugged me. We could possibly change it in a 3.0, but I think I would rather simply move past the PlotCollection as the preferred interface, and leave its warts as is. Does that make sense?

-Matt

On Jun 2, 2011 12:03 PM, "Stephen Skory" <s(a)skory.us> wrote:
