Hi Matt and Stephen,
I was able to fork YT (hg clone https://firstname.lastname@example.org/gsiisg/el-yt)
and implemented the ellipsoid object
in data_containers.py, and right now I'm trying to put in the different
methods of finding the ellipsoid parameters from the halo parameters. I'm
going to put in the geometry method first, and that requires the halo's
- center of mass x, y, z
- particle positions x, y, z
- grid left and right edges (for checking periodicity)
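To make that concrete, here is a rough sketch of how I picture gathering
those inputs -- the attribute and parameter names are my guesses rather
than settled API, and the periodic wrap is just the obvious
nearest-image shift:

import numpy as np

def _ellipsoid_inputs(halo, pf):
    # Hypothetical helper: collect what the geometry method needs.
    com = halo.center_of_mass()                       # (3,) array
    pos = np.array([halo["particle_position_x"],
                    halo["particle_position_y"],
                    halo["particle_position_z"]]).T   # (N, 3) array
    # Grid/domain edges, for the periodicity check.
    width = pf.domain_right_edge - pf.domain_left_edge
    # Shift each particle to the periodic image nearest the center.
    rel = pos - com
    rel = np.where(rel > 0.5 * width, rel - width, rel)
    rel = np.where(rel < -0.5 * width, rel + width, rel)
    return com, rel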
From Matt's previous email:
"I think how this would work the best would be to make two separate
methods on the Halo base class."
I was wondering where to put the addition to the Halo base class: should
it go in halo_objects.py, or in a new file?
I still need to figure out (and this is where help would save me a ton of
time) how to access the two files from inside a class in halo_objects.py,
both of which are created by the existing dump mechanism.
One of the files (.out) has the center of mass, and the other (.h5) has
the particle positions. In my script they were either already in memory
when I ran the halo finder parallelHF, or loaded into memory by the
single function LoadHaloes(). I did an h5ls on the .h5 files and don't
see the center of mass information, so I assume LoadHaloes() opens both
the .out and the .h5 and loads them into memory.
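In case it helps, this is roughly how I have been poking at the two
files by hand (the filenames and layout here are guesses on my part --
I haven't confirmed what LoadHaloes() actually parses):

import h5py
import numpy as np

# Plain-text .out file: one row per halo; I'm assuming the center
# of mass occupies three of its columns.
halo_table = np.loadtxt("MergerHalos.out")

# HDF5 .h5 file: list what is actually stored before assuming a
# layout for the particle positions.
f = h5py.File("MergerHalos.h5", "r")
print(f.keys())
f.close()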
And last but not least, I need to make another "dump" mechanism, or add
to the existing one so that it outputs the ellipsoid parameters. Yes, I
did take Britton's advice and cut out e2xyz and e3xyz, replacing them
with a single tilt angle, which should cut out 5 columns of extra
information. The ellipsoid object will take *center_of_mass, A, B, C,
*e1vector, tilt_angle as input, where the ones marked with * are numpy
arrays.
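Concretely, I picture the extra columns in a dump line looking something
like this (a minimal sketch; the function name, ordering, and formatting
are just illustrative):

def ellipsoid_dump_columns(center_of_mass, A, B, C, e1vector, tilt_angle):
    # center_of_mass and e1vector are length-3 numpy arrays;
    # A, B, C, and tilt_angle are floats.  Returns the 10 ellipsoid
    # columns as one whitespace-separated string.
    fields = list(center_of_mass) + [A, B, C] + list(e1vector) + [tilt_angle]
    return " ".join(["%0.9e" % f for f in fields])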
I wanted to let you all know (Matt especially) that I have had no
problems installing and using yt on Lion using the standard install
script. Bootstrapping works fine, too.
As an upgrade over Snow Leopard, I'm not sure it's worth it yet. I've
actually undone several of the Lion "improvements": scroll direction,
the hidden user Library directory, and scroll bars, to name a few. But
since it came on my shiny new laptop (SSDs rock, by the
way), I saw no reason to downgrade from it.
This email turned out to be a bit long, but I would really appreciate
comments and suggestions. I need some help with thinking through how
to restructure parallelism, and some help with writing tests that
exercise it.
The parallelism in yt has gotten a bit out of control. It is largely
governed by inheriting from ParallelAnalysisInterface, which provides
a bunch of methods that assume MPI_COMM_WORLD and that do things
(mostly) to arrays. I went in today to make some changes and found
myself a bit overwhelmed by the number of methods, the number of
leftover chunks of code, and the overall length of the
ParallelAnalysisInterface. In fact, I'm a bit concerned simply
because we use PAI as a mixin in so many diverse situations -- this
leads to a pretty unwieldy "dir()" call on those classes that use it
as a mixin.
There are 61 methods defined, which I have appended to the end of this
email. More worrisome, though, is that it seems when a method is
needed, rather than examining the existing methods, a new one is
simply added. And, perhaps most challenging, it is not 100% clear how
to test all of these methods in isolation, or where a cascade effect
will cause problems if a given method is changed. I'm guilty of all
of these things, and of creating the bulk of the problem to begin
with. But what
is clear is that it's becoming challenging if not impossible to make
changes, because it's not clear how fragile different methods are or
how they might behave. And, as we move toward much bigger platforms
and co-scheduled and in-situ visualization, we will *need* to be more
flexible with parallelism; we'll need better and easier ways of
dispatching tasks, keeping processors busy, and sending/receiving data.
To that end, I'd like to propose we change the behavior -- and mindset
-- somewhat. I'm willing to accept that objects that are distributed
and in parallel should inherit from PAI (although I wonder if an
explicit 'communication' object might be more appropriate) but I would
like to propose that we make things a bit smarter. Right now we
largely interface with MPI as though we were writing in a language
with no introspection and no higher-level constructs. I'd like to see
us develop something of a higher-level, more straightforward interface.
As a first step, I'd like to see all of the functions named "_mpi_*"
be removed. Rather than prefixing with _mpi, then the operation, then
the type of the array (for instance), we should instead provide a
single method (_parallel_operation, for instance) that accepts
operands, opcodes, arguments, etc., and then dispatches to the
appropriate sub-function. One opportunity that this will bring up is
that the dispatch itself can determine the appropriate communicator to
use -- which would open up the ability to subdivide our parallelism.
The implicit next step after this is to avoid feature creep in the new
interface.
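To make the proposal concrete, here is a strawman of the dispatch I
have in mind -- assuming mpi4py, with made-up names, and with the
communicator owned by the object precisely so sub-communicators can be
handed to subsets of tasks later:

from mpi4py import MPI
import numpy as np

class Communicator(object):
    _opcodes = {"sum": MPI.SUM, "min": MPI.MIN, "max": MPI.MAX}

    def __init__(self, comm=None):
        # The object owns its communicator instead of assuming
        # MPI_COMM_WORLD everywhere.
        self.comm = comm if comm is not None else MPI.COMM_WORLD

    def parallel_operation(self, opcode, data):
        # Single entry point replacing the pile of _mpi_* methods:
        # dispatch on the opcode and the operand type.
        op = self._opcodes[opcode]
        if isinstance(data, np.ndarray):
            result = np.empty_like(data)
            self.comm.Allreduce(data, result, op=op)
            return result
        return self.comm.allreduce(data, op=op)

A call like comm.parallel_operation("sum", my_array) would then stand
in for the _mpi_* family, and subdividing our parallelism falls out of
simply constructing the object with a different communicator.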
However, before we can make *any* set of changes, we need a set of
tests. Right now there are a *large* number of operations that run in
parallel. There are tests for these operations in *serial*, but not in
parallel. In yt/tests I have added a new setup for tests based on the
enzo_test mechanism, which can operate in parallel. It covers a lot
of functionality, but notably no halo finding and no volume rendering.
I did not think I was qualified to test those items. The tests run
with every new changeset, at most once an hour. Additional tests of
parallel analysis can be added as well.
The two takeaways (tl;dr):
1) Would an opcode / function dispatch mechanism be an acceptable
replacement for a humongous proliferation of functions?
2) Is anyone willing to write tests in the framework, so that we can
make these kinds of changes to the code base? I would be willing to
lead a sprint on this. I think it would be a lot easier than, say,
testing Enzo -- you just need to write up some functions that run
routines (see the sketch below), and call it good. But there are so
many twiddles that I don't know that I could do it myself.
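By "functions that run routines" I mean something at about this level
of effort -- a sketch only, where how the returned value gets stored
and compared is whatever the framework ends up providing:

from yt.mods import *

def test_density_projection(pf):
    # Run one routine in parallel and hand back a single number the
    # framework can compare against a stored serial answer.
    proj = pf.h.proj(0, "Density")
    return proj["Density"].sum()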
Hi YT developers,
I'm sure some of you have two, or even many, installations of YT. I was
wondering what's the best way to keep track of which version of YT
you're using and to make sure the libraries are linked correctly.
I was thinking of using a soft link, and so the python installation I'll
call
But then the libraries they link against would both be specified in my
PATH and LD_LIBRARY_PATH; I was wondering if that would cause havoc when
I try to update something and screw things up. Or am I worrying too much?
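For what it's worth, the quick sanity check I use to see which
installation actually won is to ask python directly (nothing
yt-specific here):

import sys
import yt
print(sys.executable)   # which python is running
print(yt.__file__)      # which yt installation it imported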