Hi Matt et al.,
I removed most of the convo, as it got a bit long. I hope it's still clear what the different parts are about :)
I agree, it definitely would. Do you think you'd be at all interested in trying this out from the binary reader I wrote into yt? It might be possible to set up a flow of data into an OWLS-style format.
Yes, I think that would be worth my time for a bit. One issue is that not nearly all of the metadata for those simulations is stored in the binary output (that is at least true for the Oppenheimer & Dave simulations; I am not sure about others). Also, I bet different groups have stored different (numbers of) arrays and may have dealt with that in non-standardized ways.
Maybe the main benefit of having one routine that converts the data into a useful format is that others can copy and adapt it to their own output standards… Writing an HDF5 file with all metadata included also makes it super-easy to check whether the conversion went right.
like AREPO and PHURBAS.
My understanding is that in fact the tessellation structure is not stored, but I think we can use a similar (albeit reduced, since it needs < N_sph neighbors) technique to get the local density estimate for AREPO data. In fact, I think that might end up being easier and faster than the SPH data.
Do you, or anyone in this community, have access to AREPO? It seems that code is still fairly non-public. I am sure some of us (possibly me included) could get access for some specific science case, if you ask Volker or Lars. I actually don't quite know what they store: it could be particle-like data, grid-like data, or both. If they store particle-like data, I'm sure they output all the hydro quantities necessary to estimate the density everywhere in space, also where there is no particle (which in a Voronoi tessellation of space would just be the density of the nearest Voronoi cell center, i.e. particle, right?). So that needs knowledge of zero neighbors for particles, and one neighbor for points in space other than particle positions.
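The nearest-cell lookup I describe above is really just a few lines; here is a toy sketch (the function and array names are my own invention, not AREPO's actual on-disk layout, and a KD-tree would replace the brute-force distance computation in practice):

```python
import numpy as np

def voronoi_density(query_points, cell_centers, cell_densities):
    # Density at an arbitrary point is simply the density of the nearest
    # Voronoi mesh-generating point: the "zero/one neighbor" lookup above.
    # Brute-force O(N*M) pairwise distances, for clarity only.
    d2 = ((query_points[:, None, :] - cell_centers[None, :, :]) ** 2).sum(axis=-1)
    return cell_densities[d2.argmin(axis=1)]
```

So querying any point returns the density of whichever cell center is closest, which is exact for a piecewise-constant Voronoi representation.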
Yup, DM and stars only. For the variation in N_sph I think we can actually have considerable flexibility in how we define both the smoothing kernel and the free parameters for it.
If we want to be exact in our determination of hydro quantities, we will always have to use the same kernel, and the same smoothing length, as used in the code. Differences won't be huge if we use other kernel prescriptions or numbers of neighbors, but they will be non-zero, which might be undesirable for quantitative analysis.
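For reference, a sketch of one common choice; I'm assuming the Gadget-style M4 cubic spline with compact support at r = h here, which is only one of several conventions in use (some codes truncate at 2h instead, which is exactly the kind of mismatch that would give non-zero differences):

```python
import numpy as np

def w_cubic_spline(r, h):
    # M4 cubic spline kernel in 3D, Gadget convention (support r < h),
    # normalization 8 / (pi h^3) so that the kernel integrates to unity.
    q = np.asarray(r, dtype=float) / h
    sigma = 8.0 / (np.pi * h ** 3)
    return sigma * np.where(
        q < 0.5,
        1.0 - 6.0 * q ** 2 + 6.0 * q ** 3,
        np.where(q < 1.0, 2.0 * (1.0 - q) ** 3, 0.0),
    )
```

Swapping in another kernel (or another support convention) changes the estimate at every point, so for exact agreement we'd have to read the kernel choice from the simulation, not pick one ourselves.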
The minimum smoothing length setting can also help a bit in choosing which neighbors to examine for particles.
The minimum length means that sometimes many _more_ than N_sph neighbors need to be taken into account. What I seem to remember is that in the centers of massive halos, up to ~150 particles fall within the kernel of a given particle (while N_sph=64 was used).
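A toy illustration of that effect (purely synthetic data and a made-up floor value, just to show the mechanism): flooring the smoothing length at some h_min pulls far more than N_sph particles into the kernel in dense regions.

```python
import numpy as np

N_sph = 64
rng = np.random.default_rng(42)
pos = rng.random((2000, 3)) * 0.1          # a dense synthetic clump
center = pos[0]
r = np.sort(np.linalg.norm(pos - center, axis=1))

h_adaptive = r[N_sph - 1]   # radius enclosing exactly N_sph neighbors
h_min = 2.0 * h_adaptive    # pretend the minimum-length floor kicks in here
n_in_kernel = int((r <= h_min).sum())
# n_in_kernel exceeds N_sph: all of these particles enter the kernel sums.
```

So any neighbor-finding machinery has to tolerate the neighbor count overshooting N_sph wherever the floor is active, consistent with the ~150-out-of-64 numbers above.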
The biggest stumbling block for me, conceptually, is the best way to search 26 (or more) oct neighbors efficiently, without storing the particle data in memory, while minimizing hits to the disk. I'm exploring a few options now, but I think it's going to come down to some caching of particle data and clever ordering of octs in the tree.
I'm afraid I won't be of much help here; I would need a crash course in octree construction (plus being smart) first. Chris will likely have some ideas?
Right now my thought is we can do N_sph * 4 or so in each, and then inside each oct assume a grid size of say between 8 and 32 cells on a side, depending. I know I said earlier I wanted to avoid gridding, but I think as a first pass we may need to, after reading comments from both you and Nathan. (See below.)
N_sph*4 should in general be enough. Would there be a way to check that this is OK at runtime? I mean: if a situation arises where it is not quite enough, would there be a check simple and cheap enough to just do every time?
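One cheap check I can imagine (a sketch with made-up names, on top of whatever buffering you end up with): after finding the N_sph nearest particles within the buffered region, verify that the N_sph-th neighbor distance is smaller than the distance from the query point to the nearest face of that region. If it isn't, the true neighbors may lie outside the buffer and it needs enlarging.

```python
import numpy as np

def buffer_is_sufficient(query, buffered_pos, n_sph, region_lo, region_hi):
    # Hypothetical helper: True if the n_sph-th neighbor distance fits
    # inside the buffered region around `query`, so that no particle
    # outside the buffer can be closer than the neighbors already found.
    r = np.linalg.norm(buffered_pos - query, axis=1)
    if r.size < n_sph:
        return False                  # trivially too few particles buffered
    h = np.sort(r)[n_sph - 1]         # distance to the n_sph-th neighbor
    wall = min((query - region_lo).min(), (region_hi - query).min())
    return bool(h < wall)
```

It errs on the safe side: a query point close to the region edge can fail the check even when the found neighbors happen to be correct, but a sort plus a couple of min's per query seems cheap enough to run every time.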
The best way to do that is not obvious to me either. But I think the solution will draw on much of the same infrastructure as the gridding solution, since we will ultimately need to evaluate the smoothing kernel at many different locations anyway. So I am going to proceed with a gridding step, with the admission that we hope to get rid of it in the future. And unlike in the 2.x series, I don't think we will face as much technical resistance to that.
Sounds good to me.
Okay. Good point. This is ultimately the point that convinced me that we're not (yet) ready for gridless analysis, in that geometric selection simply works better when you can easily evaluate a value at every point. So for now, I am going to go down the gridding path.
Once I have an outline of how to do this (i.e., how to go from the Octree -- which we have -- to a set of fluid particles) I'll write back either here or in a YTEP to describe the process, request feedback, and set about implementing.
Yay! Unfortunately, the week of the yt developers thing in SC, I am already occupied elsewhere… I do have funds myself nowadays, so any future meeting like that (or smaller/shorter…) should be no problem!
- Be able to operate over multiple pfs / merge them. How should the particle octree interact with the hydro one when we have both in a single snapshot/instant?
Maybe they should not be created independently…?
Hm, this is also possible. As it stands, for RAMSES I am not yet creating particle octrees at all.
I think, when a grid cell somewhere wants to know the total mass contained in it, or something like that, the grids should be the same, right? So creating the octree for the hydro and collisionless components together seems to make some sense to me. I am not sure whether that would cause issues in regions where the density of one type of particle is much larger than that of another (again, octree n00b here)…
Thank you all for your input! I am generally quite excited about this, and I appreciate the thoughts and guidance.
Excitement seems contagious. Awesome work!