Hi, this question is mainly for Stephen.
I was working on using the ellipsoid branch of yt to do some analysis on the
haloes found in the 3200 cube. I came across the same problem you described
earlier, where calculating tB_index sometimes gives nan, but this is
happening even after your fix. I dug through and found that this happens
because pHOP found some halos that have only 1 particle.
This becomes problematic for my geometric ellipsoid method, which requires at
least 4 particles distributed in different planes. When the ellipsoid method
tries to find a particle that satisfies the equation of an ellipse in 2D, it
goes through the list (of 1 element in this case) and gets nan for the length
it calculates. So even though nanargmax() is used to pick out the index of the
largest non-nan element, with a list containing only a single nan that element
still gets selected, and the method fails.
Currently my temporary solution is to tell the function that, if it
encounters a halo with fewer than 4 particles, it should return all 0s for the
ellipsoid attributes (A, B, C, e1x, e1y, e1z, tilt).
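In code, the guard looks roughly like this (a hedged sketch: get_size() and
get_ellipsoid_parameters() are stand-ins for whatever the ellipsoid branch
actually exposes, not the real API):

def safe_ellipsoid_parameters(halo):
    # Fewer than 4 particles cannot constrain an ellipsoid, so bail out
    # with all-zero attributes (A, B, C, e1x, e1y, e1z, tilt).
    if halo.get_size() < 4:
        return (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
    return halo.get_ellipsoid_parameters()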
I was under the impression from our previous conversation that there is a
minimum number of particles that pHOP requires to form a halo; I thought it
was at least 10. Am I just remembering things wrong? It has been a couple of
years...
From
G.S.
Hi All,
yesterday I had a brain wave that it should be fairly simple to create
a new kind of data container that is a hybrid of existing ones, where
the mixing/cutting is done with boolean logic. I think I'm most of the
way there. Let me give you some examples of what this means.
>>> re1 = pf.h.region([0.5]*3, [0.4]*3, [0.6]*3)
>>> re2 = pf.h.region([0.5]*3, [0.5]*3, [0.6]*3)
>>> sp1 = pf.h.sphere([0.5]*3, 0.1)
>>> sp2 = pf.h.sphere([0.1]*3, 0.1)
This will give you the exact same region as re1:
>>> re = pf.h.boolean([re1, "OR", re2])
This will give you the same as re2:
>>> re = pf.h.boolean([re1, "AND", re2])
This will give you the part of re1 that re2 does not cover, so re1
with a corner taken out:
>>> re = pf.h.boolean([re1, "NOT", re2])
Likewise, this will give you a box with a sphere cut out of it in the middle:
>>> re = pf.h.boolean([re1, "NOT", sp1])
This will combine the two disjoint spheres into one object:
>>> re = pf.h.boolean([sp1, "OR", sp2])
Nested logic should work using parentheses, where items in parentheses
are turned into new sub-hybrid regions before they are considered with
the others at a 'higher' level. This will give you a disjoint region:
sp2 plus re1 with a corner cut out of it.
>>> re = pf.h.boolean([sp2, "OR", "(", re1, "NOT", re2, ")"])
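As a quick sanity check for the first case (a hedged example, not part of the
original set above), the cell counts should agree, since re2 lies entirely
inside re1:

>>> re_or = pf.h.boolean([re1, "OR", re2])
>>> re_or['x'].shape == re1['x'].shape   # should be True, since re2 lies inside re1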
This is all being done with cut_masks, so it should work fairly
seamlessly. And so far, nearly everything seems to work, except that,
for some reason I don't fully grasp right now, I'm changing the
cut_masks of the starting regions, which is wrong. Here's an example
of what I mean. If I make a hybrid region:
In [1]: re1 = pf.h.region([0.5]*3, [0.4]*3, [0.6]*3)
In [2]: re2 = pf.h.region([0.5]*3, [0.5]*3, [0.6]*3)
In [3]: re = pf.h.boolean([re1, "AND", re2])
In [4]: re1['x'].shape
yt : [INFO ] 2011-10-29 10:00:25,002 Getting field x from 4
Out[4]: (2197,)
I'm seeing that re1 is being modified. Here's what I should see for
the shape of re1['x']:
In [1]: re1 = pf.h.region([0.5]*3, [0.4]*3, [0.6]*3)
In [2]: re1['x'].shape
yt : [INFO ] 2011-10-29 10:14:04,730 Getting field x from 4
Out[2]: (17576,)
re1 is being changed to the dimensions of re2, which is what re ends up
being once the boolean operation is done.
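One purely speculative sketch of the kind of fix (combine_cut_masks and its
arguments are made up for illustration; this is not the actual yt code): copy
the per-grid masks before combining them in place, so the source regions'
cached masks stay untouched.

import numpy as na

def combine_cut_masks(mask_a, mask_b, op):
    out = mask_a.copy()   # work on a copy, never on the source region's cached mask
    if op == "AND":
        na.logical_and(out, mask_b, out)
    elif op == "OR":
        na.logical_or(out, mask_b, out)
    elif op == "NOT":
        na.logical_and(out, na.logical_not(mask_b), out)
    else:
        raise ValueError("unknown operator %s" % op)
    return out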
I'm wondering if I could get a second pair of eyes on this from someone
who understands the subtleties of data_containers better than I do. I've
pushed the current state of things to my branch:
https://bitbucket.org/sskory/yt/changeset/148e930ce903
Thanks to anyone who can spare the time, or for any comments on this
idea! Let me know if you have any questions.
--
Stephen Skory
s(a)skory.us
http://stephenskory.com/
510.621.3687 (google voice)
Hi, all--
I've noticed that trailing comments are no longer ignored when parsing
parameter files. I have a parameter in many of my parameter files
that documents itself, as in
WriteBoundary = 0 //1=already written, 2=start writing
which recently started causing problems when the parameter file is parsed. Is
this an easy thing to fix in yt, or was this change made intentionally for
a broader reason, in which case I should write a sed script to pull this out?
I wouldn't bother asking if it weren't in an enormous number of files.
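For what it's worth, a hedged sketch of what such a fix might look like on the
yt side, assuming the comment marker is always "//" (this is not the actual
parameter parser):

def strip_trailing_comment(line):
    # Drop anything after the first "//" so self-documenting values still parse.
    return line.split("//", 1)[0].strip()

key, value = [s.strip() for s in
              strip_trailing_comment("WriteBoundary = 0 //1=already written").split("=", 1)]
# key == "WriteBoundary", value == "0"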
Thanks,
d.
--
Sent from my computer.
Hi all,
I think I have the column density thingy working. I've got slices,
projections, and volume renders working (http://vimeo.com/31035296). I
have a few questions.
- Where in yt should I put this? Keeping the file in my work directory,
I use it like the code below. yt/utilities? Better ideas?
from ColumnDensity import *
cdnumdens = ColumnDensity(pf, 'NumberDensity', [0.5, 0.5, 0.5])

def _CDNumberDensity(field, data, cd=cdnumdens):
    return cd._build_derived_field(data)

add_field('CDNumberDensity', _CDNumberDensity)
- The function _build_derived_field, used directly above, is slightly
odd. What happens in it is that I get the cell positions from data
(data['x'], etc.) and put those into the machinery that interpolates the
column density. I get a field based on these positions, and that's
what gets returned. However, and this shows what I don't understand,
for projections (as opposed to slices) one of the times this function gets
called as part of the add_field machinations, data is a FieldDetector,
and calling data['x'] on it is not what I want. So I've added an
"if" that basically returns na.ones when data is a FieldDetector. What
would be a better option than that? The source as it stands is pasted
below.
http://paste.enzotools.org/show/1892/
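For reference, the guard is roughly the following sketch (not the pasted
source; the FieldDetector import path is an assumption about the current
module layout):

import numpy as na
from yt.data_objects.field_info_container import FieldDetector

def _CDNumberDensity(field, data, cd=cdnumdens):
    if isinstance(data, FieldDetector):
        # The detector pass carries no real cell positions, so hand back a
        # placeholder field of the right shape instead of interpolating.
        return na.ones(data['x'].shape)
    return cd._build_derived_field(data)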
Thanks for your thoughts!
--
Stephen Skory
s(a)skory.us
http://stephenskory.com/
510.621.3687 (google voice)
Hi, Everybody--
In order to get units converted properly for fields that do not
have unit information in the parameter file, I've been subclassing
EnzoStaticOutput. However, for some reason this shorts out the load
function. For instance, merely subclassing EnzoStaticOutput as follows
causes load to return None instead of a parameter file. This changed
some time ago, but I've been ignoring it. Is there a way I can
continue subclassing EnzoStaticOutput, so that I have more control over
my units, without losing the load function?
Thanks,
d.
from yt.mods import *

class FileStaticOutputFourPi(EnzoStaticOutput):
    def __init__(self, *args, **kwargs):
        fpi = na.sqrt(4*na.pi)
        EnzoStaticOutput.__init__(self, *args,
            conversion_override={'Density': 1,
                                 'DivB': 1,
                                 'x-velocity': 1,
                                 'Total_Energy': 1},
            parameter_override={'LengthUnits': 1,
                                'LengthUnit': 1,
                                'Time': 1},
            **kwargs)
>>> pf = load(path)
>>> print pf
None
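A possible stopgap, offered as a hedged sketch only: constructing the subclass
directly from the path seems to sidestep whatever candidate selection load()
performs:

>>> pf = FileStaticOutputFourPi(path)   # bypasses load()
>>> print pf                            # should now print the parameter file, not None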
--
Sent from my computer.
Hi all,
A few of us worked this past week on a couple yt projects and made what we
think is significant progress. Two of the items we focused on were testing
and parallelism.
For testing, we've broadened the test suite to include many more functions
and derived quantities. We now have 548 tests that include (off- and
on-axis) slices, (off- and on-axis) projections, phase distributions, halo
finding, volume rendering, and geometrical region cuts such as rectangular
solids, spheres, and disks. We use both plain and derived fields for these
tests so that they cover as many bases as possible. With this framework, we
are now able to keep a gold standard of the test results for any dataset,
then test later changes against this standard. These tests can check for
bitwise identity or allow for some tolerance. For a full list of tests,
you can run python yt/tests/runall.py -l, and use --help to look at the
usage. We will soon be updating the documentation to provide more
information on how to set up the testing framework, but I think all of us
agree that this will make it much easier to test our changes and make sure
bugs have not crept in.
The second big change I'd like to talk about is the way we now handle
parallelism in yt. Previously, methods that employed parallelism through
MPI calls would first inherit from ParallelAnalysisInterface, which had
access to a ton of MPI functions that all worked off of MPI.COMM_WORLD. In
our revamp we wanted to accomplish two things: 1) merge duplicate MPI calls
that differed only in the type of values they operate on, and do an overall
cleanup; 2) allow for nested levels of parallelism, where two (or more)
separate communicators are able to use barriers and collective operations
such as allreduce. To do this, we worked in a two-step process. First we
took things like:
def _mpi_allsum(self, data):
def _mpi_Allsum_double(self, data):
def _mpi_Allsum_long(self, data):
def _mpi_allmax(self, data):
def _mpi_allmin(self, data):
and packed them into a single function:
def mpi_allreduce(self, data, dtype=None, op='sum'):
When a numpy array is passed to the new mpi_allreduce, the dtype is determined
from the array's properties. If the data is a dictionary, it is passed
to mpi4py's allreduce function, which handles dictionaries. This greatly
reduced the number of lines in parallel_analysis_interface (1376 ==> 915),
even after adding in additional functionality.
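Schematically, the dispatch works like the sketch below (an illustration of
the idea, not the actual yt implementation):

import numpy as np
from mpi4py import MPI

def mpi_allreduce(comm, data, dtype=None, op='sum'):
    # dtype is accepted for API compatibility; for arrays, mpi4py infers the
    # MPI datatype from the buffer itself.
    mpi_op = {'sum': MPI.SUM, 'min': MPI.MIN, 'max': MPI.MAX}[op]
    if isinstance(data, np.ndarray):
        # Fast buffer-based path for numpy arrays.
        recv = np.empty_like(data)
        comm.Allreduce(data, recv, op=mpi_op)
        return recv
    # Generic (pickling) path, used for dictionaries and other Python objects.
    return comm.allreduce(data, op=mpi_op)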
The second step was bundling all of these functions into a new class called
Communicator. A Communicator object is initialized with an MPI communicator
that is no longer restricted to COMM_WORLD. Using this as the fundamental MPI
object, we then built a CommunicationSystem object that manages these
communicators. A global communication_system instance is created and
initialized with COMM_WORLD at the top of the system if the environment is
mpi4py-capable. If not, an empty communicator is created that has
passthroughs for all of the MPI functions.
Using this new framework we are now able to take advantage of multiple
communicators. There are two use cases that we have implemented so far:
1) parallel_objects
parallel_objects is a method in parallel_analysis_interface.py for iterating
over a set of objects such that a group of processors works on each object.
This could be used, for example, to run N projections, each with M
processors, allowing for a parallelism of NxM (see the sketch after this list).
2) workgroups
workgroups allows users to set up multiple MPI communicators, with a
non-uniform number of processors in each, to work on separate tasks. This
capability lives within the ProcessorPool and Workgroup objects in
parallel_analysis_interface.py.
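For instance, the parallel_objects case might look like the following sketch
(the import path, call signature, and dataset name are assumptions rather than
documented usage): run with 6 MPI tasks and njobs=3 so that each projection
axis gets its own 2-processor communicator.

from yt.mods import *
from yt.utilities.parallel_tools.parallel_analysis_interface import \
    parallel_objects

pf = load("DD0040/DD0040")          # hypothetical dataset
for ax in parallel_objects([0, 1, 2], njobs=3):
    # Each axis is handled by its own sub-communicator of processors.
    proj = pf.h.proj(ax, "Density")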
These are just the first two that we tried out and we are very excited about
the new possibilities.
With these changes, there is one implementation change that has already
come up once on the mailing list. When you implement a new class that you'd
like to have access to the communication objects, it must first inherit from
ParallelAnalysisInterface, and then its __init__ must make a call to:
ParallelAnalysisInterface.__init__(self)
At that point, your new class will have access to the MPI calls through the
self.comm object. For example, to perform a reduction one would do:
self.comm.mpi_allreduce(my_data, op='sum')
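Put together, the pattern looks roughly like this minimal sketch (the class
and the quantity being reduced are made up for illustration):

import numpy as np
from yt.utilities.parallel_tools.parallel_analysis_interface import \
    ParallelAnalysisInterface

class MyAnalysis(ParallelAnalysisInterface):
    def __init__(self, local_values):
        # This call is what attaches self.comm to the instance.
        ParallelAnalysisInterface.__init__(self)
        self.local_values = np.asarray(local_values)

    def global_sum(self):
        # Reduce the local contribution across every processor in self.comm.
        return self.comm.mpi_allreduce(self.local_values, op='sum')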
As I said before, this will all be documented soon, but hopefully this will
help for now.
Sam, Britton, Cameron, Jeff, and Matt
Hi all,
I'm finding a problem with the new communicator stuff in
ParallelAnalysisInterface in r322fe24bb0cb. Specifically, .comm is
being initialized to None. With a short script like this:
~~~~~~~~~~~~
from yt.mods import *
from yt.utilities.parallel_tools.parallel_analysis_interface import \
ParallelAnalysisInterface
class humbug(ParallelAnalysisInterface):
    def __init__(self):
        pass

bah = humbug()
print bah.comm
~~~~~~~~~~~~
I get "None", which is not correct, I think. I've found that if I
modify things to look like this below in
parallel_analysis_interface.py, it works. But I don't think that's how
things are supposed to work. Could someone who understands what's
supposed to happen take a look at this? I suppose _grids and
_distributed would also suffer from the same problem. Thanks!
~~~~~~~~~~~~
class ParallelAnalysisInterface(object):
    comm = communication_system.communicators[-1]
    _grids = None
    _distributed = None

    def __init__(self):
        self.comm = communication_system.communicators[-1]
        self._grids = self.comm._grids
        self._distributed = self.comm._distributed
~~~~~~~~~~~~~~
--
Stephen Skory
s(a)skory.us
http://stephenskory.com/
510.621.3687 (google voice)
Hi YT devs, I was asked by the sys admin at NICS about a part of YT that I'm
not very familiar with:
"Is the particle IO in YT that calls h5py spawned by multiple processors or
is it doing it serially?"
We're trying to solve a memory issue we're seeing in the parallelHF, and
Stephen has been making modifications that I'm currently trying out, but we
wanted to see if we can identify the source of the problem. I told the
admin that parallel projection works with each processor reading in only a
portion of the field variables, but I couldn't be sure about the particle
data, so I wanted to verify with people on the dev list.
From
G.S.
Enzo 2.1
Oct. 17, 2011
We are proud to announce the public release of Enzo version 2.1. New with this release are an extensive answer testing facility, additional physics capabilities, and AMR performance enhancements. New physics capabilities include isotropic and anisotropic heat conduction, X-ray ionization/heating, massive black hole particles, a 4th-order accurate gravity solver, distributed stellar feedback, shock tracking, and improved molecular hydrogen chemistry. Performance improvements include Hilbert curve dynamic load balancing, faster ray tracing, and ray merging. We have improved Enzo’s inline analysis using the yt project, as well as user documentation for the entire Enzo codebase. We have also simplified the generation of multi-mesh cosmological initial conditions.
In addition to code upgrades, we have upgraded and reorganized our online repositories and documentation. The Enzo Project’s main website is now http://enzo-project.org/ . Links and information can be found here for both developers and general users. Our stable releases will continue to live on our Google code website: http://code.google.com/p/enzo/ . Users can clone a mercurial repository of our stable release code or download a tarball of the code (http://code.google.com/p/enzo/downloads/list ). A simple quick-start guide can also be found here: http://code.google.com/p/enzo/wiki/EnzoBootCamp . We will now support versioned documentation for all stable releases since Enzo 2.0 in addition to the development code on enzo-project.org. The documentation for Enzo 2.1 is located here: http://enzo-project.org/docs/2.1/.
Development will now move to our new Bitbucket repository where all users are encouraged to make contributions to the code by forking the main repository, located at https://bitbucket.org/enzo/enzo-dev . An explanation of new developer guidelines can be found here: http://enzo-project.org/docs/2.1/developer_guide/index.html . The current release of Enzo 2.1 features contributions, bug fixes and enhancements from an additional ten individuals (over the course of 2700 changesets) since Enzo 2.0, and through our new development infrastructure we are expanding our ability to solicit and review contributions.
Main website: http://enzo-project.org/
Mailing List: https://groups.google.com/forum/#!forum/enzo-users
Stable repository: http://code.google.com/p/enzo/
Tarball of Enzo 2.1.0: http://code.google.com/p/enzo/downloads/list
Enzo Boot Camp (a quick-start guide): http://code.google.com/p/enzo/wiki/EnzoBootCamp
Enzo 2.1.0 documentation: http://enzo-project.org/docs/2.1/
Developer’s repository: https://bitbucket.org/enzo/enzo-dev/overview
Developer’s guidelines: http://enzo-project.org/docs/2.1/developer_guide/index.html
The Enzo Development Team
Hi all,
I screwed up and pushed a bunch of changes to the main yt repository
that I didn't intend to. These include the following:
* Experimental changes to database handling. These were in the
branch 'yt'. Through a no-op merge I have prevented these from causing
problems. I hope that some day these changes will become part of the
main repository, but they definitely are not ready yet.
* Experimental changes in the named branch deliberate_fields
* Experimental changes in the named branch geometry_handling.
I apologize for this, but as long as you don't update to these named
branches, it shouldn't be too much of a problem. Hopefully they'll
all three get merged in eventually.
-Matt