Hello,

I have some questions regarding the two-point function in yt-2.6.

1) If my volume is generic data on a rectangular grid, would the random point pairs lie on the regular grid, or would they be truly random points on a spherical grid, with values interpolated/averaged from the nearest neighbors on the rectangular grid?

2) If the former is true in 1), does it mean that you have to pick a total_values that does not exceed the number of point pairs within your distance?

-Piyanat
Hi Piyanat,

If memory serves (and it might not!):

1. The first point in each pair is chosen randomly within the volume. The second point is chosen by picking a random-ish distance away (within limits) from the first point, and the 3D angle is randomly chosen on a sphere (so all angles are equally likely).

2. I'm not sure I understand this question. The algorithm should not choose point pairs with distances greater than the size of your volume (periodicity considered). Can you rephrase your question?

Sorry I can't be more specific, but yt isn't part of my job anymore! :(

On Mon, Sep 15, 2014 at 3:02 PM, Piyanat Kittiwisit <piyanat.kittiwisit@asu.edu> wrote:
> I have some questions regarding the two point function in yt-2.6. [...]
_______________________________________________ yt-users mailing list yt-users@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
-- Stephen Skory s@skory.us http://stephenskory.com/ 510.621.3687 (google voice)
Hi Stephen,

Rephrasing my questions:

If the coordinates of my data volume are on a 3D rectangular grid (assuming pixel coordinates here), i.e.

0 0 0
0 0 1
0 0 2
...
0 1 0
0 1 1
0 1 2
...

Would the random points be chosen out of these coordinates or from arbitrary coordinates, e.g. (0.2, 3.1, 5.8) instead of (0, 3, 6)?

Now, if the random point coordinates are arbitrary, this implies an infinite number of points within a spherical volume for a given max separation. How does the TPF avoid overlapping voxels here, i.e. does it put a constraint on the lower limit of separation to avoid overlapping voxels?

If the random point coordinates are not arbitrary but chosen from the fixed rectangular coordinates, this implies a limited number of points for a given maximum separation, so how does total_values work here? Do I have to pre-estimate the number of pairs within a sphere of a given max separation and set total_values to this number to avoid overcounting the same pairs?

I am not sure if my use here makes sense. I have a temperature field in a big 3D numpy array, and I would like to calculate the absolute difference in temperature between pixel pairs, so I load the data into yt and run it with the TPF function. It seems to work, although I have not gotten mpirun to work with the script, and I am not sure if this is actually the right thing to do.

Any comment would be appreciated.

Thanks!
-Piyanat

On Sep 15, 2014, at 3:29 PM, Stephen Skory <s@skory.us> wrote:
> The first point in each pair is chosen randomly within the volume. [...]
Hi Piyanat,
> Would the random points be chosen out of these coordinates or from arbitrary coordinates, e.g. (0.2, 3.1, 5.8) instead of (0, 3, 6)?
The random points are random floats, but the field values used correspond to the cell in which they fall.
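A minimal sketch of this kind of sampling, combined with the pair selection described earlier in the thread, might look as follows. It is not yt's implementation: the helper name `sample_pairs`, its parameters, the uniform draw of the separation, and the assumption of a periodic box with unit-width cells are all illustrative choices.

```python
import numpy as np

def sample_pairs(field, n_pairs, r_min, r_max, rng):
    """Draw random point pairs in a periodic box and look up cell values.

    Hypothetical helper, not part of yt. `field` is a 3D array whose cells
    have unit width, so the box spans [0, M) along each axis.
    """
    shape = np.array(field.shape)
    # First point: random floats anywhere in the volume.
    p1 = rng.random((n_pairs, 3)) * shape
    # Separation: uniform between r_min and r_max (one possible choice).
    r = rng.uniform(r_min, r_max, size=n_pairs)
    # Direction: isotropic on the sphere (uniform in cos(theta) and phi).
    cos_t = rng.uniform(-1.0, 1.0, size=n_pairs)
    sin_t = np.sqrt(1.0 - cos_t**2)
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n_pairs)
    direction = np.column_stack(
        (sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t)
    )
    # Second point, wrapped periodically back into the box.
    p2 = np.mod(p1 + r[:, None] * direction, shape)
    # Field values come from the cell each float coordinate falls in.
    # The extra modulo guards against a coordinate rounding up to M.
    i1 = np.floor(p1).astype(int) % shape
    i2 = np.floor(p2).astype(int) % shape
    v1 = field[tuple(i1.T)]
    v2 = field[tuple(i2.T)]
    return p1, p2, v1, v2

rng = np.random.default_rng(42)
temperature = rng.random((16, 16, 16))  # stand-in for a real temperature cube
p1, p2, t1, t2 = sample_pairs(temperature, 5, r_min=1.0, r_max=4.0, rng=rng)
print(np.abs(t1 - t2))  # absolute temperature difference for each pair
```

Because the points are floats, two different pairs can still map to the same pair of cells, so distinct coordinates do not imply distinct field values.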
> Now, if the random point coordinates are arbitrary, this implies an infinite number of points within a spherical volume for a given max separation. How does the TPF avoid overlapping voxels here, i.e. does it put a constraint on the lower limit of separation to avoid overlapping voxels?
I think that it checks to make sure that the smallest separation is bigger than the smallest cells. This could present an issue if you have an AMR dataset, so you'd have to account for that if this is the case.
> If the random point coordinates are not arbitrary but chosen from the fixed rectangular coordinates, this implies a limited number of points for a given maximum separation, so how does total_values work here? Do I have to pre-estimate the number of pairs within a sphere of a given max separation and set total_values to this number to avoid overcounting the same pairs?
My thought is that if total_values is greater than N^2 where MxMxM=N and M is the side of your grid, you will end up repeating pairs. I do not think that any special care is taken for that case, so you may have to think about that.
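As a rough worked example of that bound, the short script below counts cells and cell pairs for a cubic grid of side M. The function name `pair_counts` is made up for illustration. The counts ignore the maximum separation, so they are only an upper limit on the distinct cell pairs available at any given distance, and random sampling can repeat a pair well before total_values reaches that limit.

```python
def pair_counts(m):
    """Count cells and cell pairs for an m x m x m grid."""
    n = m ** 3                    # number of cells, N = M^3
    ordered = n * n               # ordered pairs, including a cell with itself
    distinct = n * (n - 1) // 2   # unordered pairs of two different cells
    return n, ordered, distinct

for m in (4, 100, 1000):
    n, ordered, distinct = pair_counts(m)
    print(f"M={m}: N={n}, N^2={ordered}, distinct pairs={distinct}")
```

For M = 1000 this gives N = 10^9 cells and N^2 = 10^18 ordered pairs, far more than any practical total_values.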
> I am not sure if my use here makes sense. I have a temperature field in a big 3D numpy array, and I would like to calculate the absolute difference in temperature between pixel pairs, so I load the data into yt and run it with the TPF function. It seems to work, although I have not gotten mpirun to work with the script, and I am not sure if this is actually the right thing to do.
If your array is not too big, it may be easiest to do this with a simple double loop. And if your data is all numpy, you might look into numba (http://numba.pydata.org/) for some speedups.
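One way such a double loop could be written is sketched below. It assumes unit-width cells, non-periodic distances, and user-supplied separation bins; the name `brute_force_abs_diff` is hypothetical. The work grows as N^2 in the number of cells, so this is only practical for small arrays. Loops of this shape are the sort of code numba is aimed at, though that is not shown or tested here.

```python
import numpy as np

def brute_force_abs_diff(field, bin_edges):
    """Mean |T_i - T_j| per separation bin over all distinct cell pairs.

    Illustrative only: distances are in units of cell width and the box
    is treated as non-periodic.
    """
    # Integer cell coordinates, flattened in the same order as field.ravel().
    coords = np.indices(field.shape).reshape(field.ndim, -1).T.astype(float)
    values = field.ravel()
    nbins = len(bin_edges) - 1
    sums = np.zeros(nbins)
    counts = np.zeros(nbins, dtype=np.int64)
    for i in range(len(values) - 1):
        for j in range(i + 1, len(values)):
            dist = np.linalg.norm(coords[i] - coords[j])
            # Half-open bins: [edge_k, edge_k+1).
            b = np.searchsorted(bin_edges, dist, side="right") - 1
            if 0 <= b < nbins:
                sums[b] += abs(values[i] - values[j])
                counts[b] += 1
    with np.errstate(invalid="ignore", divide="ignore"):
        means = np.where(counts > 0, sums / counts, np.nan)
    return means, counts

small = np.array([[[0.0, 1.0], [2.0, 3.0]]])
means, counts = brute_force_abs_diff(small, np.array([0.5, 1.2, 2.0]))
print(means, counts)
```

Only the running sums and counts per bin are kept, so nothing per pair has to be stored.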
Hi Stephen,

Thank you for all the comments. I appreciate it.

I have tried numba before. It speeds up the distance and two-point function calculations quite a bit, but I cannot think of a way to avoid holding those values for binning later. Our array can be very big (>1000**3 pixels), making it impossible to hold the distances and TPF values from all pairs. Subdividing the array and using a queue should work, but that is exactly what the TPF in yt is doing. Do you have a better idea on this issue?

Btw, I also have an issue with parallelizing the TPF, which I am sending out in another email.

Thanks!
-Piyanat

Piyanat Kittiwisit
Ph.D. Candidate, Astrophysics
Arizona State University
piyanat.kittiwisit@asu.edu

On Sep 15, 2014, at 7:06 PM, Stephen Skory <s@skory.us> wrote:
> The random points are random floats, but the field values used correspond to the cell in which they fall. [...]
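On the question of holding every distance and value for later binning, one option is to bin as you go: process pairs in batches and keep only a running sum and count per separation bin. The sketch below illustrates the idea under stated assumptions (random cell-index pairs, unit-width cells, non-periodic distances). The name `streamed_abs_diff` and its parameters are hypothetical, and this is not how yt's TPF is implemented.

```python
import numpy as np

def streamed_abs_diff(field, bin_edges, n_pairs, batch_size, seed=0):
    """Mean |T_i - T_j| per separation bin, accumulated batch by batch."""
    rng = np.random.default_rng(seed)
    shape = np.array(field.shape)
    nbins = len(bin_edges) - 1
    sums = np.zeros(nbins)
    counts = np.zeros(nbins, dtype=np.int64)
    done = 0
    while done < n_pairs:
        n = min(batch_size, n_pairs - done)
        # Random cell indices for both ends of each pair in this batch.
        i1 = (rng.random((n, 3)) * shape).astype(int)
        i2 = (rng.random((n, 3)) * shape).astype(int)
        dist = np.linalg.norm((i1 - i2).astype(float), axis=1)
        diff = np.abs(field[tuple(i1.T)] - field[tuple(i2.T)])
        # Half-open bins; pairs outside the bin range are dropped.
        b = np.searchsorted(bin_edges, dist, side="right") - 1
        ok = (b >= 0) & (b < nbins)
        sums += np.bincount(b[ok], weights=diff[ok], minlength=nbins)
        counts += np.bincount(b[ok], minlength=nbins)
        done += n
    with np.errstate(invalid="ignore", divide="ignore"):
        means = np.where(counts > 0, sums / counts, np.nan)
    return means, counts

cube = np.random.default_rng(1).random((32, 32, 32))
means, counts = streamed_abs_diff(cube, np.linspace(1.0, 16.0, 6), 100_000, 10_000)
print(means, counts)
```

Peak memory here depends on `batch_size` and the number of bins, not on the total number of pairs, so the same accumulators work whether the pairs come from random sampling or from a subdivided sweep over the array.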
participants (2): Piyanat Kittiwisit, Stephen Skory