Hello Stephen,

This is part 2 of the ellipsoid parameter finding. The method is outlined mainly in the Dubinski and Carlberg 1991 paper. But since we are doing this in YT and we already know the particle list to begin with, I put in an option so the user can specify whether all the particles in the region should be taken into account, or only the particles found by the halo finder algorithm. This is an option that I think should also be added to the first ellipsoid parameter finder.

The script form is linked here: http://paste.enzotools.org/show/1782/ Once I have written out the methodology for myself, I'll work on the YT documentation; I will probably need some help in this area when it comes to it.

Doron Lemez has used this method in his paper and helped me test the accuracy of this script. I was wondering: do I need to ask him whether he wants his name in the credits, or should I just mention his name as a special thanks?

One thing he did in the paper that was especially useful is defining the starting sphere to have 2x the virial radius. Without that specification I sometimes run into non-convergence, where after many iterations I end up with no particles. And I want to mention again that there is no guarantee that any or all of the haloes will converge with this method.

I tried to adhere to PEP8 formatting, but old habits die hard, so I'm sure I made mistakes in the formatting; let me know if you see them.

From G.S.
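For readers following along, the iterative scheme from the Dubinski & Carlberg (1991) approach can be sketched roughly as below. This is an illustrative reconstruction, not the script under discussion: every name (`fit_ellipsoid`, `r_start`, the tolerance) is made up, and details such as the reduced-tensor weighting are omitted.

```python
import numpy as na  # NumPy, under the old yt alias


def fit_ellipsoid(pos, r_start, max_iterations=50, tol=1.0e-3):
    """Iteratively fit an ellipsoid to a particle distribution.

    Compute the second-moment (inertia-like) tensor of the selected
    particles, take its eigenvectors as the ellipsoid axes and the
    eigenvalue ratios as the axis ratios, reselect the particles inside
    that ellipsoid, and repeat until the axis ratios stop changing.
    Illustrative sketch only.
    """
    # Center on the center of mass of the initial selection.
    x = pos - pos.mean(axis=0)
    # Start from a sphere of radius r_start (the paper uses 2x R_vir).
    ratios = na.ones(2)          # (b/a, c/a); a is held fixed at r_start
    evecs = na.eye(3)
    for it in range(max_iterations):
        # Rotate into the current principal frame and keep only the
        # particles inside the current ellipsoid.
        y = na.dot(x, evecs)
        a, b, c = r_start, r_start * ratios[0], r_start * ratios[1]
        inside = (y[:, 0] / a)**2 + (y[:, 1] / b)**2 + (y[:, 2] / c)**2 <= 1.0
        if inside.sum() == 0:
            # This is the non-convergence failure mode described above.
            raise RuntimeError("no particles left; halo did not converge")
        sel = x[inside]
        # Second-moment tensor of the surviving particles.
        tensor = na.dot(sel.T, sel) / sel.shape[0]
        evals, evecs = na.linalg.eigh(tensor)
        order = evals.argsort()[::-1]            # sort so a >= b >= c
        evals, evecs = evals[order], evecs[:, order]
        new_ratios = na.sqrt(evals[1:] / evals[0])
        if na.abs(new_ratios - ratios).max() < tol:
            return r_start, new_ratios, evecs
        ratios = new_ratios
    return r_start, ratios, evecs
```

Starting from a sphere twice the virial radius, as the message notes, gives the iteration enough particles to avoid emptying out before the axis ratios settle.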
Hi Geoffrey,

Great to hear from you again on this project! I have several comments about the code interspersed; they are all relatively minor, but there are quite a few.

On Mon, Sep 5, 2011 at 10:23 PM, Geoffrey So <gsiisg@gmail.com> wrote:
Hello Stephen, This is part 2 of the ellipsoid parameter finding. The method is outlined mainly in the Dubinski and Carlberg 1991 paper. But since we are doing this in YT and we already know the particle list to begin with, I put in an option so the user can specify whether all the particles in the region should be taken into account, or only the particles found by the halo finder algorithm. This is an option that I think should also be added to the first ellipsoid parameter finder.
Is this the parameter halo_only? I would like to note that creation_time < 0 is not reliable, and will make your code Enzo-specific. I recommend for now using the boolean selector IsStar:

dm_index = dd["IsStar"]

Additionally, I would recommend not constructing a new array (which will be slower than leaving the array selections as they are) and instead setting ax (which should be renamed, as 'ax' is idiomatically used in yt to reference an axis) to a list.

I would go a bit further and suggest you not use the parameter halo_only at all, as it is implicit in the halo finding. If the halo finding subselected on dark matter, then you would want that in the ellipsoid; if it didn't subselect, then you probably also don't want to subselect in the ellipsoid finding.
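A frontend-agnostic version of that subselection might look like the sketch below; `is_star` stands in for whatever boolean star flag the data provides (the IsStar selector mentioned above, in the Enzo case), so the names here are assumptions rather than the actual yt API.

```python
import numpy as na


def select_dark_matter(positions, is_star):
    """Keep only the non-star particles with a boolean mask, instead of
    the Enzo-specific creation_time < 0 test.

    'is_star' is a boolean array with one entry per particle;
    'positions' is an (N, 3) array of particle coordinates.
    """
    dm = ~na.asarray(is_star, dtype=bool)   # invert the star flag
    return positions[dm]
```

The mask itself is the only temporary allocated here, which is the pattern the review comments below keep returning to.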
The script form is linked here: http://paste.enzotools.org/show/1782/ Once I have written out the methodology for myself, I'll work on the YT documentation; I will probably need some help in this area when it comes to it.
I have a few comments:

1) Use booleans for parameters such as use_halos. True and False, not 0 and 1.
2) Iterate over the halos: for halo in halos:
3) Index variables should be clearly index variables; "halo" is an object, but you use it as an index.
4) cm => com
5) You set ax to be an array twice, which will double the memory requirements. I don't think ax needs to be an array at all.
6) Line 109, just modify ax in place. Fancy-indexing in numpy will (I believe) implicitly make a copy when you modify:

In [1]: import numpy as na
In [2]: arr = na.random.random(100)
In [3]: ai = arr > 0.8
In [4]: bb = arr[ai]
In [5]: bb -= 1.0
In [6]: arr.max()
Out[6]: 0.99642788644174751
In [7]: bb.max()
Out[7]: -0.0035721135582524877

7) Line 113, use range or xrange, and use singular "axis" not "axes".
8) Line 114, you double the memory by making a new array again.
9) Line 117, abs is a numpy function.
10) Line 121, this makes four new arrays, each of which is the length of the number of particles. Construct a temporary array and do in-place math. You don't need "where" here; use booleans.
11) Your calculation of the inertia tensor is nice, but I would again recommend that you avoid creating four temporary arrays when calculating the magnitude of 'ax'.
12) I would encourage you to investigate the eigenvector code if you suspect it gives LH system coordinates.
13) "for iter in range(0, maxiterations)" should just be "for iter in xrange(maxiterations)". There is no need to specify 0.
14) Please use print without parentheses. Python 3 has transformed print into a function, but the formatting rules are also different.
15) Avoid using where!
16) Use .size instead of len() for arrays.
17) Feel free to throw an error. Define a subclass of Exception and then raise an instance of it.

One fun thing would also be to plot projections of "Ones" using these ellipsoids as sources. That would help us see them better, too!
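Points 10 and 15 above (boolean masks plus in-place math rather than where) can be demonstrated with a small sketch; the array values here are made up for illustration:

```python
import numpy as na

arr = na.array([-2.0, -0.5, 0.5, 2.0])

# With na.where, a full-size temporary for arr + 1.0 and a full-size
# result array are both built:
shifted_where = na.where(arr < 0.0, arr + 1.0, arr)

# With a boolean mask and in-place math, the mask is the only
# temporary, and only the selected elements are touched:
shifted = arr.copy()
neg = shifted < 0.0
shifted[neg] += 1.0      # fancy-indexed in-place update
```

Both forms produce the same values; the mask version simply avoids the extra full-length arrays, which matters once the arrays hold millions of particles.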
Doron Lemez has used this method in his paper and helped me test the accuracy of this script. I was wondering: do I need to ask him whether he wants his name in the credits, or should I just mention his name as a special thanks?
Which credits do you mean? The CREDITS file is reserved for code contributions. Feel free in your documentation to mention that he assisted you, if you feel it is warranted. Please be sure that your documentation includes the ".. sectionauthor" directive to indicate that you wrote it.

Please issue a pull request once you have included this routine in the halo finding module; we can directly review it then.

Additionally, for the documentation modifications, I would encourage you to fork yt-doc and make your changes in source/analysis_modules/ under the halo finding section. Include a few (small file size) images, too.

Thanks, GS!

-Matt
One thing he did in the paper that was especially useful is defining the starting sphere to have 2x the virial radius. Without that specification I sometimes run into non-convergence, where after many iterations I end up with no particles. And I want to mention again that there is no guarantee that any or all of the haloes will converge with this method. I tried to adhere to PEP8 formatting, but old habits die hard, so I'm sure I made mistakes in the formatting; let me know if you see them. From G.S.

_______________________________________________
Yt-dev mailing list
Yt-dev@lists.spacepope.org
http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
Matt, Stephen,

Thanks Matt for all the suggestions. I've made most of the changes, but some I wasn't quite sure what you meant:

1) Use booleans for parameters such as use_halos. True and False, not 0 and 1.
- Done

2) Iterate over the halos:
- Done (I assume this is faster because it doesn't try to find which halo every time I call it?)

3) Index variables should be clearly index variables; "halo" is an object, but you use it as an index.
- Done, switched to calling it halo_num to avoid ambiguity

4) cm => com
- Done

5) You set ax to be an array twice, which will double the memory requirements. I don't think ax needs to be an array at all.
- Switched to using a list []; not sure why an array of arrays is better than a list of arrays, but I take your word for it

6) Line 109, just modify ax in place. Fancy-indexing in numpy will (I believe) implicitly make a copy when you modify:

In [1]: import numpy as na
In [2]: arr = na.random.random(100)
In [3]: ai = arr > 0.8
In [4]: bb = arr[ai]
In [5]: bb -= 1.0
In [6]: arr.max()
Out[6]: 0.99642788644174751
In [7]: bb.max()
Out[7]: -0.0035721135582524877

- Not quite sure what was meant by the example

7) Line 113, use range or xrange, and use singular "axis" not "axes".
- Done

8) Line 114, you double the memory by making a new array again.
- Done, switched to a list

9) Line 117, abs is a numpy function.
- Done, switched to na.abs()

10) Line 121, this makes four new arrays, each of which is the length of the number of particles. Construct a temporary array and do in-place math. You don't need "where" here; use booleans.
- Old code left over from copy and paste during the testing phase. For most things used only once, I avoid using memory by not binding them to a name

11) Your calculation of the inertia tensor is nice, but I would again recommend that you avoid creating four temporary arrays when calculating the magnitude of 'ax'.
- Not changed, since this is the fastest way I can think of.
Since the memory requirement of particles is much less than that of fields, memory should not be the main concern here, but speed. But I'm all ears if you or Stephen have a faster/more efficient way.

12) I would encourage you to investigate the eigenvector code if you suspect it gives LH system coordinates.
- Not done

13) "for iter in range(0, maxiterations)" should just be "for iter in xrange(maxiterations)". There is no need to specify 0.
- Done

14) Please use print without parentheses. Python 3 has transformed print into a function, but the formatting rules are also different.
- Done

15) Avoid using where!
- Done

16) Use .size instead of len() for arrays.
- Done

17) Feel free to throw an error. Define a subclass of Exception and then raise an instance of it.
- Will look into how to write an error class in YT... very useful I think, but I want to see if the changes I have so far are adequate.

Updated script: http://paste.enzotools.org/show/1799/

So the ones I have trouble with are #6, 10, and 11, all involving the phrase "in place". I had to look the phrase up because I'm not familiar with it, and I think you mean "input is overwritten by output", as per http://en.wikipedia.org/wiki/In-place_algorithm

When I solve for the 3 different cases of the boundary condition, I had to double (1/3 axes * 3 cases = 1 extra) the particle position memory, because I only check 1 axis at a time, so it's not as bad as doing all 3 axes and 3 cases at once. I could paste the code bit of "cases" into the next line, where it is used twice by na.choose(). That would make the code "in place", but it would be a lot less readable, and since "cases" is used twice, having it in memory also makes things easier. I just wanted to clarify why it is left the way it is, but if you still think the in-place method is better, I can make the change easily.

Another point I didn't understand was the difference between using an array of arrays compared to a list of arrays.
From what I read at http://stackoverflow.com/questions/176011/python-list-vs-array-when-to-use people seem to prefer lists over arrays, and use arrays only when a single data type is absolutely required. Another point is that arrays are more memory efficient, because lists use extra bytes in case we want to append things, etc., but if we want to do math on the data or modify it, it is better to use lists. In my case I multiply the rotation matrix by the xyz positions, so I should use a list because of doing the math. Is that the correct understanding?

From G.S.
Hi Geoffrey,

Let me start with the end of the story: the basic code is ready for inclusion, so if you'd like, make the changes to your bitbucket repo and issue a pull request and we can bring it in. Thank you so much for your efforts! (And please also feel free to add your name to the CREDITS file when you do so. :)

There were a couple of things you mentioned, so I'll take the time to reply to those as well.

On Fri, Sep 9, 2011 at 10:13 PM, Geoffrey So <gsiisg@gmail.com> wrote:
Matt, Stephen, Thanks Matt for all the suggestions. I've made most of the changes, but some I wasn't quite sure what you meant:

1) Use booleans for parameters such as use_halos. True and False, not 0 and 1.
- Done

2) Iterate over the halos:
- Done (I assume this is faster because it doesn't try to find which halo every time I call it?)
It's just more ... "pythonic" if you will. It does speed things up marginally for the reason you note, but it's also conceptually a bit cleaner.
3) Index variables should be clearly index variables; "halo" is an object, but you use it as an index.
- Done, switched to calling it halo_num to avoid ambiguity
Cool. One fun thing you can do is avoid having index variables at all by using the python function enumerate: for halo_num, halo in enumerate(halos): ... This will automatically give you an index, halo_num. You don't need to worry about it here, though.
4) cm => com - Done 5) You set ax to be an array twice, which will double the memory requirements. I don't think ax needs to be an array at all. - Switched to using list [], not sure why an array of arrays is better than list of arrays, but I take your word for it
This came up a couple of times in your email, so I'll address it here. The issue comes down to 'when is an array useful'. Typically you want to use them when you are performing some operation in lockstep -- really, an array operation. However, there is a cost to creating them. For instance, when you execute this operation:

my_array = na.array([ arr1, arr2, arr3 ])

you are implicitly performing the following three actions in sequence:

1) Creating a list with three components: arr1, arr2, arr3.
2) Calling the na.array function, which will allocate space for an array of shape (3, arr1.size).
3) Copying the elements of arr1, arr2, arr3 into that new array, and binding it to my_array.

So at the end you have consumed at least twice the memory. The first operation will use pointers to the underlying objects; the last one will not. Now, if you take it one step further:

my_array = na.array([arr1**2, arr2**2, arr3**2])

you add in some additional steps:

1) Creating a temporary array of size arr1.size and filling it with arr1**2
2) Same, for arr2
3) Same, for arr3
4) Creating the list
5) Allocating (3, arr1.size)
6) Copying
7) De-allocating the temporary arrays

So we have now *tripled* our memory at peak usage. Nominally this is not usually a problem; however, if you do it many, many times with very large arrays, it will slow things down.

Returning to the question of when you want to use arrays: use them when you are doing something that *uses* the arrays. For instance, you need a 3D array if you want to do fast sums along an axis, or very fast iteration along all axes, or something like that. But in the code you presented, this isn't really done: almost everything is done one column at a time. So the cost, in time and peak memory usage, is not *necessarily* worth it. That being said, it's a small point.
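The pointer-versus-copy behavior described above is easy to verify directly; the following is just a demonstration of those construction steps, not code from the script:

```python
import numpy as na

arr1 = na.zeros(4)
arr2 = na.ones(4)

# A list only holds references to the existing arrays -- no copy:
as_list = [arr1, arr2]

# na.array() allocates a new (2, 4) block and copies the data in:
as_array = na.array([arr1, arr2])

# Mutating arr1 afterwards shows which container shares its memory:
arr1[0] = 99.0
```

The list sees the change because it still points at `arr1`; the stacked array does not, because its data was copied at construction time.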
6) Line 109, just modify ax in place. Fancy-indexing in numpy will (I believe) implicitly make a copy when you modify:
In [1]: import numpy as na
In [2]: arr = na.random.random(100)
In [3]: ai = arr > 0.8
In [4]: bb = arr[ai]
In [5]: bb -= 1.0
In [6]: arr.max()
Out[6]: 0.99642788644174751
In [7]: bb.max()
Out[7]: -0.0035721135582524877
- Not quite sure what was meant by the example
Ah, just that if you fancy index something, you can modify it in place and on the backend it will copy the changed values.
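Concretely, the difference between binding the fancy-indexed result to a new name and assigning through the index can be demonstrated like this (the values are made up):

```python
import numpy as na

arr = na.array([0.1, 0.5, 0.9, 0.95])
ai = arr > 0.8

# Binding the fancy-indexed result makes an independent copy;
# modifying bb leaves arr untouched:
bb = arr[ai]
bb -= 1.0

# Indexed in-place assignment writes the changes back into arr:
arr[ai] -= 1.0
```

This is the behavior comment 6 was pointing at: the copy only appears when you keep the fancy-indexed result around under its own name.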
7) Line 113, use range or xrange, and use singular "axis" not "axes".
- Done

8) Line 114, you double the memory by making a new array again.
- Done, switched to a list

9) Line 117, abs is a numpy function.
- Done, switched to na.abs()

10) Line 121, this makes four new arrays, each of which is the length of the number of particles. Construct a temporary array and do in-place math. You don't need "where" here; use booleans.
- Old code left over from copy and paste during the testing phase. For most things used only once, I avoid using memory by not binding them to a name

11) Your calculation of the inertia tensor is nice, but I would again recommend that you avoid creating four temporary arrays when calculating the magnitude of 'ax'.
- Not changed, since this is the fastest way I can think of. Since the memory requirement of particles is much less than that of fields, memory should not be the main concern here, but speed. But I'm all ears if you or Stephen have a faster/more efficient way.
Nah, this is fine.
12) I would encourage you to investigate the eigenvector code if you suspect it gives LH system coordinates.
- Not done

13) "for iter in range(0, maxiterations)" should just be "for iter in xrange(maxiterations)". There is no need to specify 0.
- Done

14) Please use print without parentheses. Python 3 has transformed print into a function, but the formatting rules are also different.
- Done

15) Avoid using where!
- Done

16) Use .size instead of len() for arrays.
- Done

17) Feel free to throw an error. Define a subclass of Exception and then raise an instance of it.
- Will look into how to write an error class in YT... very useful I think, but I want to see if the changes I have so far are adequate.
They're adequate. :) We can add in an error at the end after it comes in. Time to bring the code into yt and test it out in the wild.
Updated script: http://paste.enzotools.org/show/1799/

So the ones I have trouble with are #6, 10, and 11, all involving the phrase "in place". I had to wiki the phrase because I'm not familiar with it, and I think you mean "input is overwritten by output", as per http://en.wikipedia.org/wiki/In-place_algorithm
Ah, in-place being something like:

a += 1

But more than that, if you want to do it extra-quickly, you can do operations like:

na.add(a, 1, a)

where you specify input, operand, output. But don't worry about it.
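A minimal demonstration of the two in-place forms mentioned above; both lines update `a` without allocating a new result array:

```python
import numpy as na

a = na.zeros(5)

a += 1.0             # in-place augmented assignment
na.add(a, 1.0, a)    # explicit (input, operand, output) ufunc form
```

The second form is the general pattern: every numpy ufunc accepts an output array, so chains of arithmetic can reuse one buffer instead of creating a temporary per operation.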
When I solve for the 3 different cases of the boundary condition, I had to double (1/3 axes * 3 cases = 1 extra) the particle position memory, because I only check 1 axis at a time, so it's not as bad as doing all 3 axes and 3 cases at once. I could paste the code bit of "cases" into the next line, where it is used twice by na.choose(). That would make the code "in place", but it would be a lot less readable, and since "cases" is used twice, having it in memory also makes things easier. I just wanted to clarify why it is left the way it is, but if you still think the in-place method is better, I can make the change easily.

Another point I didn't understand was the difference between using an array of arrays compared to a list of arrays. From what I read at http://stackoverflow.com/questions/176011/python-list-vs-array-when-to-use people seem to prefer lists over arrays, and use arrays only when a single data type is absolutely required. Another point is that arrays are more memory efficient, because lists use extra bytes in case we want to append things, etc., but if we want to do math on the data or modify it, it is better to use lists. In my case I multiply the rotation matrix by the xyz positions, so I should use a list because of doing the math. Is that the correct understanding?
You are correct, yes; lists in general are actually less memory efficient. For instance, we would never want our entire particle arrays in lists. But the difference I was trying to get at is that lists can operate with pointers -- arrays cannot. If you create a new array out of old arrays, they will be copied into a new location in memory; with lists, only pointers to those objects are used. Does that make sense?

Anyway, thanks very much -- for your work, and your research on the topic. :) Issue a pull request when you're ready; this will be a great addition.

-Matt
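The "cases" plus na.choose() pattern discussed earlier in the thread might look something like the minimal periodic-wrapping sketch below; the function name and the exact case logic are assumptions for illustration, not the script's actual code:

```python
import numpy as na


def wrap_periodic(x, center, box_size):
    """Map 1D coordinates to the periodic image nearest 'center'.

    Three candidate cases per particle: the coordinate as-is, shifted
    down by one box length, and shifted up by one box length; na.choose
    then picks the case with the smallest offset from the center.
    """
    dx = x - center
    cases = na.array([dx, dx - box_size, dx + box_size])
    pick = na.abs(cases).argmin(axis=0)     # index of the nearest image
    return na.choose(pick, cases) + center
```

As the thread notes, keeping the `cases` array bound to a name costs one extra copy of the coordinates per axis, but it keeps the na.choose() call readable.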
Hello Stephen, Matt,

I think I'm going to have to ask Stephen to insert the changes into halo_object.py again, this time for the Tensor method. The latest script is accessible at: http://paste.enzotools.org/show/1802/

I've made some small changes to halo_object.py in the ellipsoid fork of YT so that the calculation of particle positions is only done once for the Geometry method (it used to be done twice: once for the furthest particle, once for tB_index). It also uses the same syntax as the Tensor method, with boundary conditions taken into account. Let me know if there's something obviously wrong with the changes.

For the Tensor method I've added two checks; when an exceptional situation occurs, the code will stop and ask the user to modify the input. Stephen and I talked about having the ellipsoid object also include the particle position xyz like the sphere object in YT, but that can wait until both ellipsoid-parameter-finding methods are integrated.

From G.S.

On Sat, Sep 10, 2011 at 4:45 AM, Matthew Turk <matthewturk@gmail.com> wrote:
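Combining these checks with point 17 from earlier in the thread (subclass Exception and raise an instance of it), a custom error might be sketched like this; the class name and message are illustrative only:

```python
class EllipsoidConvergenceError(Exception):
    """Raised when the iterative ellipsoid fit runs out of particles."""

    def __init__(self, halo_num, iteration):
        self.halo_num = halo_num
        self.iteration = iteration
        Exception.__init__(
            self,
            "halo %d lost all particles at iteration %d; "
            "try a larger starting sphere" % (halo_num, iteration))
```

Raising a dedicated class lets callers catch the non-convergence case specifically, instead of matching on a generic error message.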
Hi Geoffrey,
Let me start with the end of the story: the basic code is ready for inclusion, so if you'd like, make the changes to your bitbucket repo and issue a pull request and we can bring it in. Thank you so much for your efforts! (And please also feel free to add your name to the CREDITS file when you do so. :)
There were a couple things you mentioned so I'll take the time to reply to those, as well.
On Fri, Sep 9, 2011 at 10:13 PM, Geoffrey So <gsiisg@gmail.com> wrote:
Matt, Stephen, Thanks Matt for all the suggestions. I've made most of the changes, but some I wasn't quite sure what you meant: 1) Use booleans for parameters such as use_halos. True and False, not 0 and 1. - Done 2) Iterate over the halos: - Done (I assume this is faster because it doesn't try to find which halo every time I call it?)
It's just more ... "pythonic" if you will. It does speed things up marginally for the reason you note, but it's also conceptually a bit cleaner.
3) Index variables should be clearly index variables; "halo" is an object, but you use it as an index. - Done, switched to calling it halo_num to avoid ambiguity
Cool. One fun thing you can do is avoid having index variables at all by using the python function enumerate:
for halo_num, halo in enumerate(halos): ...
This will automatically give you an index, halo_num. You don't need to worry about it here, though.
4) cm => com - Done
5) You set ax to be an array twice, which will double the memory requirements. I don't think ax needs to be an array at all. - Switched to using a list [], not sure why a list of arrays is better than an array of arrays, but I take your word for it
This came up a couple times in your email, so I'll address it here. The issue comes down to 'when is an array useful'. Typically you want to use them when you are performing some operation in lockstep -- really, an array operation. However, there is a cost to creating them. For instance, when you execute this operation:
my_array = na.array([ arr1, arr2, arr3 ])
You are implicitly performing the following three actions in sequence:
1) Creating a list with three components: arr1, arr2, arr3.
2) Calling the na.array function, which will allocate space for an array of shape (3, arr1.size).
3) Copying the elements of arr1, arr2, arr3 into that new array, and binding it to my_array.
So at the end you have consumed at least twice the memory. The first operation will use pointers to the underlying objects; the last one will not. Now, if you take it one step further:
my_array = na.array([arr1**2, arr2**2, arr3**2])
You add in some additional steps:
1) Creating a temporary array of size arr1.size and filling it with arr1**2
2) Same, for arr2
3) Same, for arr3
4) Creating the list
5) Allocating (3, arr1.size)
6) Copying
7) De-allocating our temporary arrays
So we have now *tripled* our memory at peak usage. Nominally, this is not usually a problem; however, if you do this many many times with very large arrays, it will slow things down.
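To make the copy concrete, here is a small standalone check of the behavior Matt describes (plain numpy, using the `na` alias from this thread; the array sizes are arbitrary):

```python
import numpy as na  # the alias this thread uses for numpy

arr1 = na.zeros(1000)
arr2 = na.ones(1000)
arr3 = na.arange(1000.0)

# Building an array out of existing arrays allocates a fresh
# (3, 1000) block and copies every element into it:
my_array = na.array([arr1, arr2, arr3])
doubled = my_array.nbytes == arr1.nbytes + arr2.nbytes + arr3.nbytes

# Proof that the data was copied rather than referenced:
arr1 += 5.0
still_zero = (my_array[0] == 0.0).all()
```

Both flags come out True: the stacked array carries its own full-size buffer, and later changes to the originals never reach it.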
Returning to the question of when you want to use arrays: when you are doing something that *uses* them as arrays. For instance, you need a 3D array if you want to do fast sums along an axis, very fast iteration along all axes, or something like that. But in the code you presented, this isn't really done: almost everything is done one column at a time. So the cost, in time and peak memory usage, is not *necessarily* worth it. That being said, it's a small point.
6) Line 109, just modify ax in place. Fancy-indexing in numpy will (I believe) implicitly make a copy when you modify:
In [1]: import numpy as na
In [2]: arr = na.random.random(100)
In [3]: ai = arr > 0.8
In [4]: bb = arr[ai]
In [5]: bb -= 1.0
In [6]: arr.max()
Out[6]: 0.99642788644174751
In [7]: bb.max()
Out[7]: -0.0035721135582524877
- Not quite sure what was meant by the example
Ah, just that if you fancy index something, you can modify it in place and on the backend it will copy the changed values.
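A minimal standalone demonstration of that copy-versus-write-back distinction (plain numpy; the values are arbitrary):

```python
import numpy as na

arr = na.linspace(0.0, 1.0, 9)
mask = arr > 0.5

# Fancy indexing hands back a *copy*; changing it leaves `arr` alone:
bb = arr[mask]
bb -= 1.0
untouched = arr.max() == 1.0

# Assigning *through* the mask writes back into `arr` itself:
arr[mask] -= 1.0
modified = abs(arr.max() - 0.5) < 1e-12
```

Both flags are True: `bb` is detached from `arr`, while `arr[mask] -= 1.0` modifies the original buffer in place.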
7) Line 113, use range or xrange, and use singular "axis" not "axes" - Done
8) Line 114, you double the memory by making a new array again. - Done, switched to a list
9) Line 117, abs is a numpy function. - Done, switched to na.abs()
10) Line 121, this makes four new arrays, each of which is the length of the number of particles. Construct a temporary array and do in-place math. You don't need "where" here; use booleans. - Old code left over from copy-and-paste during the testing phase. For most things used only once, I avoid using extra memory by not binding them to a name
11) Your calculation of the inertial tensor is nice, but I would again recommend that you avoid creating four temporary arrays when calculating the magnitude of 'ax'. - Not changed, since this is the fastest way I can think of. Since I realized the memory requirement of particles is much less than that of fields, memory should not be the main concern here, but speed. I'm all ears, though, if you or Stephen have a faster/more efficient way.
Nah, this is fine.
12) I would encourage you to investigate the eigenvector code if you suspect it gives LH system coordinates. - Not done
13) "for iter in range(0, maxiterations)" should just be "for iter in xrange(maxiterations)". There is no need to specify 0. - Done
14) Please use print without parentheses. Python 3 has transformed print into a function, but the formatting rules are also different. - Done
15) Avoid using where! - Done
16) Use .size instead of len() for arrays - Done
17) Feel free to throw an error. Define a subclass of Exception and then raise an instance of it. - Will look into how to write an error class in YT; very useful, I think, but I want to see if the changes I have so far are adequate.
They're adequate. :) We can add in an error at the end after it comes in. Time to bring the code into yt and test it out in the wild.
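For item 17, a minimal sketch of what such an error class might look like; the name `EllipsoidConvergenceError` and its attributes are made up for illustration here, not an actual yt exception:

```python
# Hypothetical error class; the name and attributes are illustrative,
# not part of yt itself.
class EllipsoidConvergenceError(Exception):
    """Raised when the iteration ends with no particles left."""
    def __init__(self, halo_num, iterations):
        self.halo_num = halo_num
        self.iterations = iterations
        msg = "Halo %d did not converge after %d iterations" \
              % (halo_num, iterations)
        Exception.__init__(self, msg)

# Raising and catching it:
try:
    raise EllipsoidConvergenceError(42, 100)
except EllipsoidConvergenceError as err:
    caught = err.halo_num == 42 and "42" in str(err)
```

The attributes let calling code inspect which halo failed instead of parsing the message string.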
updated script: http://paste.enzotools.org/show/1799/

So the ones I have trouble with are #6, 10, 11, all involving the phrase "in place". I had to wiki the phrase because I'm not familiar with it, and I think you mean "input is overwritten by output" as per http://en.wikipedia.org/wiki/In-place_algorithm
Ah, in-place being something like:
a += 1
But more than that, if you want to do it extra-quickly, you can do operations like:
na.add(a, 1, a)
where you specify input, operand, output. But don't worry about it.
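A quick standalone check that the out-parameter form really does reuse the same buffer (plain numpy; the values are arbitrary):

```python
import numpy as na

a = na.arange(4, dtype="float64")
original_id = id(a)

# In-place increment: no temporary array is allocated.
a += 1.0

# The ufunc form with an explicit output writes straight back into `a`:
na.add(a, 1.0, a)

same_buffer = id(a) == original_id
values_ok = (a == na.array([2.0, 3.0, 4.0, 5.0])).all()
```

Both flags are True: `a` is still the same array object after both operations, with the expected values.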
When I solve for the 3 different cases of boundary condition, I had to double (1/3 axes * 3 cases = 1 extra) the particle position memory, because I only check 1 axis at a time, so it's not as bad as if I did all 3 axes and 3 cases at once. I could paste the code for "cases" into the next line, where it is used twice by na.choose(); that would make the code "in place", but a lot less readable, and since "cases" is used twice, having it in memory also makes things easier. I just wanted to clarify why it is left the way it is, but if you still think the in-place method is better, I can make the change easily.

Another point I didn't understand was the difference between an array of arrays compared to a list of arrays. From what I read (http://stackoverflow.com/questions/176011/python-list-vs-array-when-to-use), people seem to prefer lists over arrays, and use arrays only when a single data type is absolutely required. Another point is that arrays are more memory efficient, because lists use extra bytes in case we want to append things to them, etc., but if we want to do math on the data or modify it, it is better to use lists. In my case I multiply the rotation matrix into the xyz positions, so I should use a list because of the math. Is that the correct understanding?
You are correct, yes, lists in general are actually less memory efficient. For instance, we would never want our entire particle arrays in lists. But the difference that I was trying to get at is that lists can operate with pointers -- arrays cannot. If you create a new array out of old arrays, they will be copied into a new location in memory; with lists, only pointers to those objects are used. Does that make sense?
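The pointer-versus-copy distinction in a few lines of standalone numpy (arbitrary toy values):

```python
import numpy as na

arr1 = na.zeros(4)
arr2 = na.ones(4)

# A list merely stores references to the existing arrays...
pair = [arr1, arr2]
arr1 += 7.0
list_sees_change = (pair[0] == 7.0).all()

# ...while na.array() copied the data at creation time.
stacked = na.array([arr1, arr2])
arr2 += 7.0
array_is_frozen = (stacked[1] == 1.0).all()
```

Both flags come out True: the list element tracks later changes to `arr1`, while the stacked array keeps the values `arr2` had when it was built.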
Anyway, thanks very much -- for your work, and your research on the topic. :) Issue a pull request when you're ready; this will be a great addition.
-Matt
From G.S.

---------- Forwarded message ----------
From: Matthew Turk <matthewturk@gmail.com>
Date: Tue, Sep 6, 2011 at 4:11 AM
Subject: Re: [Yt-dev] ellipsoid part 2
To: yt-dev@lists.spacepope.org
Hi Geoffrey,
Great to hear from you again on this project! I have several comments about the code interspersed; they are all relatively minor, but there are several.
On Mon, Sep 5, 2011 at 10:23 PM, Geoffrey So <gsiisg@gmail.com> wrote:

Hello Stephen, This is part 2 of the ellipsoid parameter finding. The method is outlined mainly in the Dubinski and Carlberg 1991 paper. But since we are doing this in YT and we already know the particle list to begin with, I put in the option that the user can specify if they want all the particles in the region to be taken into account or only the particles found by the halo finder algorithm. This is an option that I think should be amended to the first ellipsoid parameter finder.
Is this the parameter halo_only? I would like to note that creation_time < 0 is not reliable, and will make your code Enzo-specific. I recommend for now using the boolean selector IsStar:
dm_index = dd["IsStar"]
Additionally, I would recommend not constructing a new array (which will be slower than allowing the array selections to be left as is) and to instead set ax (which should be renamed, as 'ax' is idiomatically used in yt to reference an axis) to a list.
I would go a bit further and suggest you not use the parameter halo_only as it is implicit in the halo finding. If the halo finding subselected on dark matter, then you would want that in the ellipsoid. If it didn't subselect, then you probably also don't want to subselect in the ellipsoid finding.
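For readers following along, the selection pattern under discussion might look like the sketch below; the names (`px`, `py`, `pz`, `is_star`) are placeholders standing in for `dd["..."]` fields, not actual yt field names:

```python
import numpy as na

# Placeholder particle data standing in for dd["..."] fields:
px = na.array([0.1, 0.2, 0.3, 0.4])
py = na.array([0.5, 0.6, 0.7, 0.8])
pz = na.array([0.9, 1.0, 1.1, 1.2])
is_star = na.array([True, False, False, True])

# Dark matter is everything that is not a star, selected by mask:
dm = ~is_star

# One selected array per axis, collected in a plain list -- this
# avoids the extra copy that na.array([...]) would trigger:
positions = [px[dm], py[dm], pz[dm]]
```

Here two of the four particles survive the mask, and each axis stays its own array inside the list.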
The script form is linked here: http://paste.enzotools.org/show/1782/ Once I have written out the methodology for myself, I'll work on the YT documentation, will probably need some help in this area when it comes to it.
I have a few comments:
1) Use booleans for parameters such as use_halos. True and False, not 0 and 1.
2) Iterate over the halos:

   for halo in halos:

3) Index variables should be clearly index variables; "halo" is an object, but you use it as an index.
4) cm => com
5) You set ax to be an array twice, which will double the memory requirements. I don't think ax needs to be an array at all.
6) Line 109, just modify ax in place. Fancy-indexing in numpy will (I believe) implicitly make a copy when you modify:
In [1]: import numpy as na
In [2]: arr = na.random.random(100)
In [3]: ai = arr > 0.8
In [4]: bb = arr[ai]
In [5]: bb -= 1.0
In [6]: arr.max()
Out[6]: 0.99642788644174751
In [7]: bb.max()
Out[7]: -0.0035721135582524877
7) Line 113, use range or xrange, and use singular "axis" not "axes"
8) Line 114, you double the memory by making a new array again.
9) Line 117, abs is a numpy function.
10) Line 121, this makes four new arrays, each of which is the length of the number of particles. Construct a temporary array and do in-place math. You don't need "where" here; use booleans.
11) Your calculation of the inertial tensor is nice, but I would again recommend that you avoid creating four temporary arrays when calculating the magnitude of 'ax'.
12) I would encourage you to investigate the eigenvector code if you suspect it gives LH system coordinates.
13) "for iter in range(0, maxiterations)" should just be "for iter in xrange(maxiterations)". There is no need to specify 0.
14) Please use print without parentheses. Python 3 has transformed print into a function, but the formatting rules are also different.
15) Avoid using where!
16) Use .size instead of len() for arrays
17) Feel free to throw an error. Define a subclass of Exception and then raise an instance of it.
One fun thing would also be to plot projections of "Ones" using these ellipsoids as sources. That would help us see them better, too!
Doron Lemez has used this method in his paper and helped me test the accuracy of this script. I was wondering: do I need to ask him if he wants his name in the credits, or just mention his name as a special thanks?
Which credits do you mean? The CREDITS file is reserved for code contributions. Feel free in your documentation to mention that he assisted you, if you feel it is warranted. Please be sure that your documentation includes the ".. sectionauthor" directive to indicate that you wrote it.
Please issue a pull request once you have included this routine into the halo finding module; we can directly review it then.
Additionally, for the documentation modifications, I would encourage you to fork yt-doc and make your changes in source/analysis_modules/ under the halo finding section. Include a few (small file size) images, too.
Thanks, GS!
-Matt
One thing he did in the paper that was especially useful was defining the starting sphere to have 2x the virial radius. Without that specification, I sometimes run into non-convergence, where after many iterations I end up with no particles. And I want to mention again that there's no guarantee that any or all of the haloes will converge with this method. I tried to adhere to PEP8 formatting, but old habits die hard, so I'm sure I made mistakes; let me know if you see them.

From G.S.
Hey Geoffrey,

On Tue, Sep 13, 2011 at 8:00 PM, Geoffrey So <gsiisg@gmail.com> wrote:
Hello Stephen, Matt, I think I'm going to have to ask Stephen to insert the changes into halo_object.py again, this time for the Tensor method. The latest script is accessible at: http://paste.enzotools.org/show/1802/ I've made some small changes to halo_object.py in the ellipsoid fork of YT so that the calculation of particle positions is only done once for the Geometry method (used to be twice, once for furthest particle, once for tB_index). It also uses the same syntax as the Tensor method with boundary conditions taken into account. Let me know if there's something obviously wrong with the changes.
The new script looks good. Geoffrey, please issue a pull request when you have made the necessary changes to your fork.
For the Tensor method I've added two checks for when an exception occurs, it'll stop the program and ask the user to modify the input if the situation occurs. Stephen and I talked about having the ellipsoid object also include the particle position xyz like the sphere object in YT, but that can wait until both ellipsoid-parameter-find methods are integrated.
Ellipsoid objects will find the particles if you have implemented the cut mask and grid identification methods. An optimized IO method will make it faster, but they will already work using the FakeGridForParticles proxy.

-Matt
Geoffrey,
I think I'm going to have to ask Stephen to insert the changes into halo_object.py again, this time for the Tensor method.
To be clear, is this a complementary method to what we did before, or a replacement?

--
Stephen Skory
s@skory.us
http://stephenskory.com/
510.621.3687 (google voice)
This is a complementary method; in the not-so-immediate future I may put in more. Currently I call the method you already put in "geometry", because it involves simply drawing ellipses in 3D. The latest script is what I call the "tensor" method, because it involves calculating a reduced moment of inertia tensor. Maybe in halo_object.py you can have get_geo_ellipsoid_param and get_tens_ellipsoid_param, or have the user input method='geometry' or 'tensor'. What do you think? Let me know if you have any other ideas on how to integrate them. I have a few other things I may do to the geometry method, but they are pending the results after the integration.

From G.S.

Sent from my iPhone

On Sep 15, 2011, at 6:38 AM, Stephen Skory <s@skory.us> wrote:
Geoffrey,
I think I'm going to have to ask Stephen to insert the changes into halo_object.py again, this time for the Tensor method.
To be clear, is this a complementary method to what we did before, or a replacement?
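As an aside for archive readers: the "tensor" method discussed in this thread can be sketched in a few lines of plain numpy. This is a toy, single-pass illustration of the reduced-inertia-tensor idea only; the actual routine iterates until the axis ratios converge and handles boundary conditions, all of which is omitted here, and the toy positions are invented:

```python
import numpy as na

# Toy particle positions relative to the center of mass, shape (N, 3),
# drawn from an anisotropic Gaussian so the cloud has a definite shape:
rng = na.random.RandomState(0)
pos = rng.normal(size=(1000, 3)) * na.array([3.0, 2.0, 1.0])

# Reduced moment of inertia tensor: each particle is weighted by
# 1/r^2 so the outermost particles do not dominate the shape.
r2 = (pos ** 2).sum(axis=1)
tensor = na.dot(pos.T, pos / r2[:, na.newaxis]) / pos.shape[0]

# The eigen-decomposition of the symmetric tensor gives the principal
# axes (eigenvectors) and relative axis lengths (eigenvalues):
evals, evecs = na.linalg.eigh(tensor)   # eigenvalues sorted ascending
```

In the full method, the eigenvectors orient the ellipsoid, the eigenvalue ratios set the axis ratios, and the particle selection is then redone inside the new ellipsoid until the ratios stop changing.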
Participants (3):
- Geoffrey So
- Matthew Turk
- Stephen Skory