Hi G.S.,

I will try to answer a few of these points.

1. I would recommend simply using the clump finder as is without trying to alter the contour spacing.  If you're concerned that you will miss some clumps with spacing that is too large, just make the spacing smaller.  The clump finder will continue to look for clumps within a parent object at increasingly higher values for the minimum contour.  I don't see any benefit to re-engineering this.

2. One thing to keep in mind is that any single clump object is just a 3D data object, like a sphere or region.  What that means is that you have available to you all of the derived quantities that come with any 3D data object.  You should be able to easily implement a clumping factor derived quantity if one doesn't already exist.
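
As a rough sketch of what such a quantity might compute, here is the usual volume-weighted definition C = <rho^2> / <rho>^2 in plain NumPy. This is not yt code: the function name clumping_factor is made up for illustration, and the arrays are stand-ins for what you might pull from a clump with something like clump["Density"] and clump["CellVolume"] (I am assuming those field names apply to your data).

```python
import numpy as np

def clumping_factor(density, cell_volume):
    """Volume-weighted clumping factor <rho^2> / <rho>^2 (hypothetical helper)."""
    density = np.asarray(density, dtype=float)
    cell_volume = np.asarray(cell_volume, dtype=float)
    total_volume = cell_volume.sum()
    # Volume-weighted means of rho and rho^2 over all cells in the object.
    mean_rho = (density * cell_volume).sum() / total_volume
    mean_rho_sq = (density ** 2 * cell_volume).sum() / total_volume
    return mean_rho_sq / mean_rho ** 2

# Stand-in data: four equal-volume cells, two at density 1 and two at density 3.
rho = np.array([1.0, 1.0, 3.0, 3.0])
vol = np.array([1.0, 1.0, 1.0, 1.0])
c_local = clumping_factor(rho, vol)
print(c_local)
```

A perfectly uniform region gives exactly 1, and anything clumpier gives a larger value. Weighting by cell volume means the same function should work for cells of different sizes.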

4. See point #2.  You should be able to do clump.derived_quantities['TotalQuantity']('CellVolume'), or something along those lines.
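
To show why summing per-cell volumes handles AMR as well as unigrid, here is a small NumPy sketch with made-up cell widths. It is not yt code: total_volume is a hypothetical helper, and box_size_mpc is an assumed box length used only to convert from code units.

```python
import numpy as np

def total_volume(dx, dy, dz):
    """Sum of per-cell volumes; cells may have different sizes (hypothetical helper)."""
    return float(np.sum(np.asarray(dx) * np.asarray(dy) * np.asarray(dz)))

# Stand-in clump: two coarse cells of width 0.5 and eight refined cells of
# width 0.25, all in code units where the box length is 1.
dx = np.concatenate([np.full(2, 0.5), np.full(8, 0.25)])
volume_code = total_volume(dx, dx, dx)

# Assumed box length in comoving Mpc; replace with the value for your run.
box_size_mpc = 20.0
volume_mpc3 = volume_code * box_size_mpc ** 3
print(volume_code, volume_mpc3)
```

Because each cell carries its own width, no special handling is needed for refined regions.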

5. I suspect it is not at all that simple, since a clump is a collection of cells that don't necessarily persist over any period of time, whereas a halo is made up of particles that have unique ids and are there at all times.  There may still be a way to do what you want, but I don't think the existing merger tree machinery is it.

6. I recommend the pickling option, since that will return to you the exact same object you made, allowing you to do whatever you want with it.  Provided you can save your simulation data, I think this will give you the most flexibility to do more analysis after the fact.
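
In plain Python the round trip looks roughly like the following. The dictionary is a hypothetical stand-in for a clump object and the file name is made up. I would expect the same dump/load calls to apply to a real clump, but I have not tested that here, and the simulation data would still need to be available when you load it back. On Python 2 the cPickle module offers the same dump and load functions.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a clump; a real clump object would go here instead.
clump_summary = {"n_cells": 128, "peak_index": 42, "min_contour": 1.0e-3}

# Write the object to disk (file name chosen only for illustration).
path = os.path.join(tempfile.mkdtemp(), "master_clump.pkl")
with open(path, "wb") as f:
    pickle.dump(clump_summary, f, protocol=pickle.HIGHEST_PROTOCOL)

# Later, possibly in a different session, read it back.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == clump_summary)
```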

7. You should be able to throw the clump object to the regular projection function, but as of right now, I don't believe you can give it to the volume renderer, although I think this may be coming in the future.


On Thu, Sep 2, 2010 at 10:57 PM, <gso@physics.ucsd.edu> wrote:
Hi YT users, over the past week I've spoken to a couple YT developers and
got some great feedback and suggestions, but I just want to outline my
plans/strategies/troubles to see if I can get more help from the

I'm currently trying to use YT to analyze data on ionized clumps of gas
from the FLD Radiation runs of Enzo I made.

My plan is to (if possible):
1) Create a hierarchy of clumps, based on their level of ionization
instead of topology like in the cookbook find_clumps()
2) Calculate a global clumping factor from regions inside these clumps
3) Find the location(x,y,z) of the peaks of HI_Density and HI/H (ionized
fraction) and the value of Density in those regions
4) Find the volume inside each clump region
5) From the hierarchy of clumps, create their merger history
6) Save the clump information in such a way that I can come back to it if
I find something else that's interesting to analyze about them
7) Volume render the clumps separately and/or together on the same picture

my strategy and the troubles I am running into for the corresponding
points are:

1) I want to avoid re-inventing the wheel and use the current machinery
inside YT to create the hierarchy of clumps.  I think I can use the
Clump() to find a master clump like in the cookbook, but instead of
calling on find_clump(), call find_children().  Inside find_children, I
can supply it a different minimum value for the amount of ionization I
want, because they may not be equally spaced or equal multiples of each
other.  So maybe for level 0 I can do find_children( min[0], max), then
level 1 do find_children(min[1], max) etc.  The problems I'm running into are:

- I haven't played around with this enough to see if it'll work for just 1
level down from the master clump
- If I don't use find_clumps then I'll need another way to make this
process recursive, it might be as simple as copying what's done in
find_clumps() without the checking if it is valid, but I am not sure.

2) I think a logical way of doing this is to calculate the local clumping
factor for each clump individually and get the global one from the
ionizing clumps by doing a volume weighted average.

- Should it be weighted by mass or something else?

3) I can get the index of the peak region by something like
for child in master_clump.children:
 HI_peak_index = na.argmax(child["HI_Density"])
 HI_peak_x = child["x"][HI_peak_index]
finding the Density value would be
 Density_at_Peak_HI = child["Density"][HI_peak_index]

- Don't think I'll run into trouble here.

4) I know the regular Clump's write_info() writes the number of cells from
set_default_clump_info(), but can it return the physical volume maybe in
mpc or co-moving mpc?

- It would be easy for unigrid simulations that I do now, but if there's a
more general way of getting this information for the AMR case I'd like to
know.  Right now I can just take the cell count and multiply it by the
volume per cell... harder for the AMR.

5) I know Stephen was able to create a Halo merger tree, I think it was
done with the SQLite database or he displayed the tree from a SQLite
database.  I was wondering if something similar can be done for ionized
clumps.  I think Matt mentioned it was doable with just using the YT
object without the need of the database.  There seems to be more attention
at sqlite lately, saw an email from Irina a couple of days ago, but not sure if
that issue was resolved.  I myself didn't encounter that problem at all I
just had to "import sqlite3" and everything worked, on my mac OSX 10.6.4,
Kraken, and Triton.

my question:
- Is it more straightforward one way or the other? (w/ or w/o database)
- I'm having a hard time coming up with a unique identifier that keeps
track of a clump.  Do I define it having the peak at a certain cell?  Dave
suggested this, but I've considered this previously and thought I'd get
into trouble if the peak moved from one data output to another, maybe he
has a workaround.  Or do I identify a clump by containing a certain star
particle?  Because the star particles are all uniquely defined with a
number, so this way there is no ambiguity.  But this is assuming that the
star particle will not move beyond the specific clump, which may or may
not be a valid assumption for all levels of ionization throughout the
entire simulation.  Or is there a much simpler built-in way that Python
identifies each object that I don't know about?

6) There's the write_info() for the Clump(), but I don't think that is
adequate for what I need.  Dave and Matt suggested cPickle() where I save
the location of the object, which I can later access if I have the data at
its original location.  An alternative is to save the data I want in a
database as mentioned before.

pros of cPickle:  little data is duplicated, everything is in python
cons of cPickle:  original simulation data has to be available, when
scratch disk fills up, big simulation data are usually the first to go.

pros of database:  can do a lot of types of retrieval that are already
pre-programmed, can access the clump data that's saved even if the big
simulation is on archive only.
cons of database:  Do not have any information about data that's not
previously saved, and duplicate some data redundantly.

7) I see that for slices of data, I can do a callback of .clump() to plot
the contour, but I was wondering if it's just as simple to plot clumps in
volume rendering.  Maybe sometimes have contour of different ionization on
the same picture, sometimes only the specific ionization I want.

I apologize for the email being kind of long and wordy, but any
help/suggestions on any of the points is appreciated, thanks :-)


yt-users mailing list