How to change values of yt fields

Hi everyone,

The problem: I create a derived field filled with zeros:

    def _accel_x(field, data):
        return data['relative_x'] * 0.0 / yt.units.s**2

    ds.add_field(('gas', 'accel_x'), units='cm/s**2', function=_accel_x,
                 sampling_type='cell')

I then fetch all data and check that the 'accel_x' field is indeed all zeros:

    ad = ds.all_data()
    ad['accel_x']
    YTArray([-0., -0., -0., ..., 0., 0., 0.]) cm/s**2

I figured I could set the values of the derived fields in the following way:

    ad['accel_x'] = a_gas[:, 0]
    ad['accel_y'] = a_gas[:, 1]
    ad['accel_z'] = a_gas[:, 2]

where a_gas is a numpy array of shape (number of cells in ad, 3). I need this because I don't think there is a way to define a_gas directly in a derived field. Indeed, 'accel_x' is no longer zeros:

    ad['accel_x']
    YTArray([ 1.41986331e-11, 1.46817279e-11, 1.50849113e-11, ...,
             -3.98769430e-11, -4.35244006e-11, -4.74626863e-11]) cm/s**2

Unfortunately, when I use PhasePlot, ProjectionPlot, or create_profile, the results I get look as if 'accel_x' were still all zeros. So is there a correct, definitive way to change the values of a derived field?

Thank you,
Nina
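[Editor's note: for reference, here is a minimal, self-contained sketch of the situation described above. The sample dataset path is only a stand-in, and a_gas is filled with random numbers here because the real array is not shown in the thread; the point is that assigning to ad only changes the copy cached on that data object, while plots and profiles re-run the derived field.]

    import numpy as np
    import yt

    ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # stand-in sample dataset

    def _accel_x(field, data):
        # a zero-valued field with acceleration units
        return data['index', 'x'] * 0.0 / yt.units.s**2

    ds.add_field(('gas', 'accel_x'), units='cm/s**2', function=_accel_x,
                 sampling_type='cell')

    ad = ds.all_data()
    a_gas = np.random.random((ad['gas', 'accel_x'].size, 3))  # stand-in for the real array

    # Overwriting the cached values only changes the copy stored on `ad`:
    ad['gas', 'accel_x'] = a_gas[:, 0] * yt.units.cm / yt.units.s**2
    print(ad['gas', 'accel_x'])  # shows the new values

    # Plots and profiles re-evaluate _accel_x on fresh chunks, so they see zeros:
    prof = yt.create_profile(ad, ('gas', 'density'), ('gas', 'accel_x'))
    print(prof[('gas', 'accel_x')])  # back to zeros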

Hi Nina,

I think what we're seeing is an interaction between the way that derived fields are computed and the way they are set on data objects. I'm not sure where you're getting a_gas, so it's *entirely* possible that the rest of this email is overly complicated -- there may be a *much* simpler solution. I think the main issue we're running into here is that:

1. The derived field will be called on smaller chunks than just the ad -- this is done so the computation can run on small-memory machines. So the derived field might be called on one file at a time, for instance.
2. Setting the field on ad won't cascade down to the sub-chunks.

If there were only one chunk, then in the derived field you could just do return a_gas[:, 0], for instance. But with "sub-chunking", that array won't be the same size and won't start at the same place.

What we need is a way to check that you always get the same values, which I think we can do with grid datasets (particle datasets will require a slightly different check). What we can try is verifying that we have unique morton index values for every value in ad:

    mi = ad["index", "morton_index"].view('u8')
    print(mi.size, np.unique(mi).size)

and checking that the two numbers are the same. If they *are*, that means we have unique index values (we should). In that case, we can figure out the right starting point in our derived field and just grab those values -- the field values should be in the same order:

    ad = ds.all_data()  # or whatever data object you want
    mi_base = ad["index", "morton_index"].view('u8')
    a_gas = wherever_you_got_it_from()

    def match_gas(field, data):
        mi = data["index", "morton_index"].view('u8')
        starting_point = np.nonzero(mi_base == mi[0])[0][0]
        return a_gas[starting_point:starting_point + mi.size, 0]

and then add match_gas as a field. I *think* this should work?

The trick with np.nonzero(mi_base == mi[0])[0][0] is to ask where the boolean relationship is true (i.e., where mi[0], the first morton index of the chunk, is found in mi_base, the morton index for the full dataset -- this relies on the two being in the same order, which they will be), and then the [0][0] is an annoying numpy unpacking. (You can see this in action with v = np.arange(100); np.nonzero(v == 10) -- the output is a tuple containing an array, and we want the first element of each.)

Can you give this a shot and let us know how it turns out for you?
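[Editor's note: to make the "add match_gas as a field" step concrete, here is a hedged sketch of what the registration might look like. The field name, the units, and the data.ds.arr wrapping are assumptions of this sketch, not part of Matt's message; a_gas is assumed to be a plain numpy array whose values are in cm/s**2.]

    # Sketch only: register the match_gas function from the snippet above.
    def match_gas_with_units(field, data):
        values = match_gas(field, data)
        # wrap the plain numpy slice in the field's units so yt doesn't complain
        return data.ds.arr(values, 'cm/s**2')

    ds.add_field(('gas', 'accel_x_matched'), function=match_gas_with_units,
                 units='cm/s**2', sampling_type='cell')

    # Anything that re-evaluates the field now sees the a_gas values:
    prof = yt.create_profile(ad, ('gas', 'density'), ('gas', 'accel_x_matched'))

If the chunks arrive as 3D arrays (as in the follow-up below), mi would need to be flattened first; the version posted next handles that.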

Thank you very much Matthew! Your idea did work, but I had to change it a bit. I'm leaving the code here so that anyone can use it:

    ad = ds.all_data()  # or whatever data object you want
    mi_base = ad["index", "morton_index"].view('u8')
    a_gas = wherever_you_got_it_from()

    def match_gas(field, data):
        mi = data["index", "morton_index"].view('u8')
        mi = np.reshape(mi, mi.size)
        if np.count_nonzero(mi_base == mi[0]) != 0:
            starting_point = np.nonzero(mi_base == mi[0])[0][0]
            return a_gas[starting_point:starting_point + mi.size, 0]
        else:
            # fall back to zeros (with the right units) when the chunk's
            # morton indexes are not found in mi_base
            return data['relative_x'] * 0.0 / yt.units.s**2

The first problem was that inside the derived field the data come in a different shape: mi_base.shape gives (35026307,) while mi.shape gives (16, 16, 16), so I had to reshape mi into a 1d array. I guess the data are given in chunks here.

Then, even though mi_base included all data in the dataset, some morton indexes wouldn't match in

    starting_point = np.nonzero(mi_base == mi[0])[0][0]

and I would get an empty array (array([], dtype=int64),). I took that into account by checking np.count_nonzero and filling those cells with zeros when the indexes don't match.

Again, many thanks,
Nina
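[Editor's note: building on the version above, here is a hedged sketch of registering all three components of a_gas without copy-pasting the function three times. It assumes mi_base and a_gas are defined as in the snippet above; the field names, the reshape back to the chunk's shape, and the data.ds.arr wrapping are choices of this sketch rather than anything from the thread.]

    import numpy as np

    def make_component_field(axis):
        def _component(field, data):
            mi = data["index", "morton_index"].view('u8')
            flat_mi = np.reshape(mi, mi.size)
            hits = np.nonzero(mi_base == flat_mi[0])[0]
            if hits.size > 0:
                start = hits[0]
                values = a_gas[start:start + flat_mi.size, axis]
            else:
                # chunk's indexes not found in mi_base: fall back to zeros
                values = np.zeros(flat_mi.size)
            # give the values the chunk's shape and the field's units
            return data.ds.arr(values.reshape(mi.shape), 'cm/s**2')
        return _component

    for axis, name in enumerate(('matched_accel_x', 'matched_accel_y', 'matched_accel_z')):
        ds.add_field(('gas', name), units='cm/s**2', sampling_type='cell',
                     function=make_component_field(axis))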

Hi Nina,

Oh! I'm glad it worked, but now I think there may be an easier solution. Since the data arrive in the derived field chunked as 3D arrays (does it have a ValidateSpatial somewhere?), you might be able to match things up by comparing the values of the field ["index", "grid_indices"], which is uniform within each grid. If there are any child zones or refined zones, the morton_index values might get a bit out of order, so the grid-index approach might be somewhat more reliable. I don't have the specifics of how to do it off hand, but I think using grid indices would work for this case.

Also, great catch with the zero checking! And thank you for posting your solution.

-Matt
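[Editor's note: the thread contains no code for the grid-index matching Matt suggests, so the following is a speculative sketch; every name and detail here is an assumption. It assumes a_gas is ordered the same way as the cells in ad, and that each grid's cells are contiguous in ad's flat ordering.]

    import numpy as np

    ad = ds.all_data()
    gi_base = np.asarray(ad["index", "grid_indices"], dtype='int64')

    # For each grid, record where its cells start in ad's flat ordering
    # and how many cells it contributes.
    starts, counts = {}, {}
    for gid in np.unique(gi_base):
        hits = np.nonzero(gi_base == gid)[0]
        starts[gid], counts[gid] = hits[0], hits.size

    def match_gas_by_grid(field, data):
        gi = np.asarray(data["index", "grid_indices"], dtype='int64')
        gid = int(gi.ravel()[0])  # grid_indices is uniform within a grid chunk
        if gid in starts and counts[gid] == gi.size:
            values = a_gas[starts[gid]:starts[gid] + gi.size, 0]
        else:
            # sizes disagree (e.g. refined zones were masked out of ad): use zeros
            values = np.zeros(gi.size)
        return data.ds.arr(values.reshape(gi.shape), 'cm/s**2')

    ds.add_field(('gas', 'accel_x_by_grid'), function=match_gas_by_grid,
                 units='cm/s**2', sampling_type='cell')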

participants (2):
- Matthew Turk
- Nina Akerman