Hi Chris, (as a quick note, I've pushed some changes in 9f8fb27e2fc1 that I will reference here.)
I've been working on writing code (http://hg.enzotools.org/gadget_infrastructure/summary) that will transform gadget snapshots into the newly-minted gridded data format (GDF: http://yt.enzotools.org/wiki/GridDataFormat). The last few days I've been writing a very skeletal GDF reader in yt, and I've come across a few problems. I pushed changes to yt a few hours ago, so they should be up on the hg repository. I'm referring to the files in /frontends/gadget/* .
This is awesome! Congratulations on this, it's a huge step, and would be an amazing asset to the community. I am also delighted you're using the GDF!
1. When reading, say, an array of particle data like position_x of some grid (its shape is just N, the number of particles in that grid), I get array shape errors. It tries to multiply the array by a weight_data, but that's an array of the same shape as the active dimensions for the grid (which in my case is [2,2,2], since I'm doing an octree refinement), so it cries when it tries to multiply an N array with a (2,2,2)-shape array. To make that shape make sense, it seems like I want not the data for that particular grid, but all of its children's data. Or something, I'm a bit confused. Check out io.py.
Ah! Yup, I see what's up here -- it's actually because projecting particles isn't necessary, unless you're projecting a deposited particle. I've also changed it so that the particle_type has been set to true on the relevant fields, to avoid some of these issues.
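To illustrate the mismatch, here's a minimal NumPy sketch of what goes wrong and why flagging the field as a particle field sidesteps it. The helper apply_weight and its is_particle_field argument are made up for illustration; this is not the actual code path in yt, just the idea under the assumption that particle fields skip the cell-shaped weighting.

```python
import numpy as np

n_particles = 5                                    # N particles in this grid
position_x = np.linspace(0.0, 1.0, n_particles)    # per-particle data, shape (N,)
weight = np.ones((2, 2, 2))                        # cell-shaped, like the grid's active dimensions


def apply_weight(field, weight, is_particle_field):
    """Hypothetical helper, not yt's actual code."""
    if is_particle_field:
        return field           # particle data is left unweighted
    return field * weight      # cell data shares the weight's shape


# Multiplying the raw arrays reproduces the shape error from question 1.
try:
    position_x * weight
    mismatch = False
except ValueError:
    mismatch = True            # shapes (5,) and (2, 2, 2) cannot be broadcast

# With the particle flag set, the array passes through untouched.
unweighted = apply_weight(position_x, weight, is_particle_field=True)
```

Note that for N equal to 1 or 2 NumPy would silently broadcast instead of raising, so the error only shows up for grids with other particle counts.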
2. Is there going to be a problem with having parent grids with no particle data, i.e. grids that are almost empty? The position_x array for such a grid won't exist, so I'm not sure what to do in the _read_data_set() method. I get grids like this when I subdivide a grid into another 8 child grids and then all of the particles belonging to the parent grid are shuffled into the appropriate child bins. The parent is then left with none.
For the particles, this shouldn't be an issue. But if you have any fluid quantities, they have to be present in all the grids. Particles are treated independently, and it shouldn't even try to read grids that have no particles.
3. What does _read_data_slice() mean when a field like 'position' is already scalar-ified as position_x? I thought the slice 'axis' would've picked out the 'x' axis in position, but you start off already asking for position_x.
This should also be okay for particles. :) For fluid fields, it will be defined.
I've included an example GDF data file here (http://dl.dropbox.com/u/206140/decay_100.gyt.hdf5 - not guaranteed to be alive forever) and a pastebin script (http://paste.enzotools.org/show/1145/) that highlights my first problem.
Quick question -- are you sure this has been correctly created? I looked at the grid_left_index, and it's all 0's and 1's. I should have been more explicit in the documentation for GDF, but what this should be is the global index of the grid's starting position; i.e., for a level 0 grid, if it started in the upper left corner, this would be 0,0,0. But for a level one grid that occupies the bottom right octant, this would be 2,2,2...
Essentially, this is calculated by (left_edge - domain.left_edge) / dx, and for a given level it can take on values of 0 .. (refine_by^level * top_grid_dimensions).
Again, thanks for your hard work. I've almost been able to project a deposited particle field! This is exciting.
-Matt
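P.S. In case it helps, here's a rough sketch of the grid_left_index calculation described above. The function name and argument layout are just for illustration (they're not part of GDF or yt), and I'm assuming a unit domain with top grid dimensions of (2,2,2) and refine_by = 2, with dx at a given level taken as the domain width divided by (top_grid_dimensions * refine_by^level).

```python
def grid_left_index(left_edge, domain_left_edge, domain_right_edge,
                    top_grid_dimensions, refine_by, level):
    """Global index of a grid's starting cell, measured in cells of its own level."""
    index = []
    for le, dle, dre, n in zip(left_edge, domain_left_edge,
                               domain_right_edge, top_grid_dimensions):
        # Cell width at this level along this axis.
        dx = (dre - dle) / (n * refine_by ** level)
        # (left_edge - domain.left_edge) / dx, rounded to guard against
        # floating-point noise.
        index.append(int(round((le - dle) / dx)))
    return tuple(index)


domain_le = (0.0, 0.0, 0.0)
domain_re = (1.0, 1.0, 1.0)
top_dims = (2, 2, 2)   # assumed for this example

# Level 0 grid starting at the domain corner.
root = grid_left_index((0.0, 0.0, 0.0), domain_le, domain_re, top_dims, 2, 0)

# Level 1 grid whose left edge sits at the middle of the domain.
child = grid_left_index((0.5, 0.5, 0.5), domain_le, domain_re, top_dims, 2, 1)
```

With those assumptions the level 0 grid comes out at (0, 0, 0) and the level 1 octant at (2, 2, 2), so a file where every entry is 0 or 1 regardless of level is probably storing the index relative to the parent rather than the global one.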