Hi all,
I've spent a bit of time today comparing the current tip of 3.0 to the current unitrefactor, in an attempt to see whether field accesses on data objects will return answers that are consistent with what we were returning before.
The specific comparison I'm doing is based on the IsolatedGalaxy dataset. This dataset has a large number of on-disk fields and is a good exercise of yt's field detection and derived fields machinery. I'm looking at whether field accesses on data objects return results that are identical to the results we get in the current 3.0 tip.
The full testing script has been pasted here: http://paste.yt-project.org/show/4250
The results (just what the script printed to my terminal) are pasted here: http://paste.yt-project.org/show/4251
There appear to be three things that happened in this test:
1. The field access returned bitwise identical results.
2. The field access returned results that are slightly different (~1e-6 level).
3. The field access returned results that are order-unity different.
Obviously the first case isn't an issue. I believe all instances of the third case are due to comparing quantities that have different units (for example, particle_position_[x,y,z] used to return data in code units, but now does so in CGS).
Case 2 is a little bit more involved and I'm not sure what to do about it in general -- thus this message ;)
I'll take the density field as an example to illustrate what is happening.
In the parameter file for this dataset, I see the following:
MassUnits = 8.11471e+43
DensityUnits = 2.76112e-30
TimeUnits = 2.32946e+18
LengthUnits = 3.086e+24
Right now, unitrefactor is using the values of MassUnits and LengthUnits to construct the YTArray that contains the field data for the ('enzo', 'Density') field, since this field has units of `code_mass/code_length**3`. The conversion factors from `code_mass` and `code_length` to CGS are exactly the MassUnits and LengthUnits variables from the dataset parameter file. So, the CGS conversion factor for the array is:
MassUnits / LengthUnits**3 = 2.761197257954595e-30 g / cm**3
A careful reader will note that this is only equal to the DensityUnits in the parameter file up to rounding in the fifth decimal place. This difference, it turns out, is the source of the differences I saw in the comparison.
I haven't gone through the rest of the fields in detail, but I suspect that all the remaining fields showing small differences have issues of this sort. My guess is that the fields that compare exactly either have a CGS conversion factor of unity, or have a conversion factor that was derived directly from MassUnits, LengthUnits, and TimeUnits in the data file.
I'm not sure what the proper way to handle this is.
Right now unitrefactor is ignoring the on-disk CGS conversion factor for Density. I don't think this choice should matter in principle, since mass and density units should be algebraically related via the LengthUnits. Unfortunately, rounding errors in the parameter file mean there will likely be a small amount of disagreement.
Should we attempt to be more fastidious about adopting the proper conversion factors in the enzo frontend, if a conversion factor is available on disk? Should we not worry about these minor differences in field accesses that unitrefactor is introducing? Should we have units like `code_density` for all on-disk fields with available CGS conversion factors? If so, how do we deal with the fact that the conversion from code_density to code_mass/code_length**3 is not straightforward?
Thanks for any help or advice anyone who has gotten this far can provide :)
Cheers,
Nathan
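The density comparison described in the message can be checked with a few lines of plain Python. This is only a minimal sketch: the variable names are illustrative and are not yt or Enzo API, and the numbers are the parameter-file values exactly as printed in the message. It prints the factor derived from MassUnits and LengthUnits, the on-disk DensityUnits, and their relative difference.

```python
# Unit values copied from the parameter file quoted in the message.
mass_units = 8.11471e+43      # MassUnits, in g
density_units = 2.76112e-30   # DensityUnits, in g/cm**3 (as written on disk)
length_units = 3.086e+24      # LengthUnits, in cm

# CGS factor implied by a field with units of code_mass / code_length**3.
derived_density_units = mass_units / length_units**3

# Fractional disagreement between the derived and the on-disk factor.
relative_difference = (
    abs(derived_density_units - density_units) / density_units
)

print(f"derived  = {derived_density_units:.10e} g/cm**3")
print(f"on disk  = {density_units:.10e} g/cm**3")
print(f"relative = {relative_difference:.2e}")
```

With the values as printed, the two factors are not equal as floats but agree once the derived one is rounded to five decimal places, which is the kind of rounding mismatch the message describes.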
Hi Nathan,
On Mon, Jan 27, 2014 at 7:32 PM, Nathan Goldbaum <nathan12343@gmail.com> wrote:
I'm not sure what the proper way to handle this is.
Right now unitrefactor is ignoring the on-disk CGS conversion factor for Density. I don't think this choice should matter in principle, since mass and density units should be algebraically related via the LengthUnits. Unfortunately, rounding errors in the parameter file mean there will likely be a small amount of disagreement.
Should we attempt to be more fastidious about adopting the proper conversion factors in the enzo frontend, if a conversion factor is available on disk? Should we not worry about these minor differences in field accesses that unitrefactor is introducing? Should we have units like `code_density` for all on-disk fields with available CGS conversion factors? If so, how do we deal with the fact that the conversion from code_density to code_mass/code_length**3 is not straightforward?
Thanks for any help or advice anyone who has gotten this far can provide :)
I've done some thinking about this and also talked to Greg about it. For the specific case of Enzo, I think using the primitive units output in the file and then computing our own Density, etc, is the way to go. The computed versions inside the parameter file are just computed from the primitive units, and so I think we are safe there.
It sounds like the Enzo units are all working properly? If so, that's a big +1 from me.
-Matt
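The approach suggested here, building composite conversion factors from the primitive units rather than reading them from disk, might look roughly like the sketch below. The helper `derive_cgs_factors` and the dictionary keys other than `code_density` are hypothetical names chosen for illustration; this is not how the enzo frontend is actually written.

```python
def derive_cgs_factors(mass_units, length_units, time_units):
    """Build CGS conversion factors from the three primitive units only.

    Hypothetical helper for illustration; not part of yt.
    """
    velocity = length_units / time_units
    return {
        "code_density": mass_units / length_units**3,  # g / cm**3
        "code_velocity": velocity,                     # cm / s
        "code_specific_energy": velocity**2,           # erg / g
    }


# Primitive units as printed in the IsolatedGalaxy parameter file.
factors = derive_cgs_factors(
    mass_units=8.11471e+43,
    length_units=3.086e+24,
    time_units=2.32946e+18,
)

for name, value in factors.items():
    print(f"{name}: {value:.6e}")
```

Because every composite factor comes from the same three numbers, the factors stay algebraically consistent with one another, at the cost of differing slightly from any rounded composite value stored on disk.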
_______________________________________________
yt-dev mailing list
yt-dev@lists.spacepope.org
http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
On Tue, Jan 28, 2014 at 10:20 AM, Matthew Turk <matthewturk@gmail.com> wrote:
I've done some thinking about this and also talked to Greg about it. For the specific case of Enzo, I think using the primitive units output in the file and then computing our own Density, etc, is the way to go. The computed versions inside the parameter file are just computed from the primitive units, and so I think we are safe there.
It sounds like the Enzo units are all working properly? If so, that's a big +1 from me.
I looked over this in detail last night. It looks like all the remaining mismatches were due to the issues I highlighted in my last e-mail. So yes, Enzo units are all working properly!
It may be worthwhile going through this exercise with the rest of the test datasets to see if there are any more frontend-specific issues. The testing script I used (http://paste.yt-project.org/show/4250/) should be useful for that purpose.
I'm going to focus on other things for now, but we should come back to this before we merge unitrefactor into the main codebase.
-Nathan
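For anyone repeating the exercise on other datasets, the three outcomes seen in the original test (identical, slightly different, order-unity different) could be sorted automatically along the following lines. This is a sketch only and is not the pasted testing script: the function name, the plain-list inputs, and the `small_tol` threshold are all assumptions, and it presumes finite values.

```python
def classify_field_difference(old_values, new_values, small_tol=1e-4):
    """Sort one field comparison into one of three cases.

    Hypothetical helper; small_tol is an assumed threshold separating
    rounding-level mismatches from unit-level mismatches.
    """
    if len(old_values) != len(new_values):
        raise ValueError("both inputs must have the same length")

    pairs = list(zip(old_values, new_values))

    # Value equality element by element. This is close to, but not
    # strictly the same as, a bitwise comparison (0.0 == -0.0 is True).
    if all(old == new for old, new in pairs):
        return "identical"

    # Largest relative difference over the elements that disagree.
    # The denominator is nonzero because the two values differ.
    worst = max(
        abs(new - old) / max(abs(old), abs(new))
        for old, new in pairs
        if old != new
    )
    if worst <= small_tol:
        return "slightly different"
    return "order-unity different"


print(classify_field_difference([1.0, 2.0], [1.0, 2.0]))
print(classify_field_difference([1.0, 2.0], [1.0, 2.0 * (1 + 1e-6)]))
print(classify_field_difference([0.5], [0.5 * 3.086e+24]))
```

The last call mimics a field returned in code units in one branch and in CGS in the other, which is the situation suspected for the particle position fields.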
participants (2)
- Matthew Turk
- Nathan Goldbaum