Mailman 3 basic question - yt-users

basic question


Elizabeth Tasker

8 Nov 2011

8:33 a.m.

Hi,

I have a basic question about the way yt handles data (or maybe the way python functions handle data).

If I have a function of the form:

    def _MyFunction(field, data):
        new_data = na.zeros(data["x"].shape, dtype='float64')
        ..... find xpos, ypos and zpos values of cells to be flagged as part of new_data
        for n in range(nflaggedcells):
            new_data[[data["x"] == xpos and data["y"] == ypos and data["z"] == zpos]] = 1.0
        return new_data

    add_field("MyFunction", function=_MyFunction, validators=[ValidateSpatial(1, ["x", "y", "z"])])

    pf = load('data')
    dd = pf.h.all_data()
    regions = dd["MyFunction"]

Does yt call _MyFunction for each grid? So it runs through that routine many times, once per grid?

If so, is there an alternative? One sweep through the function will --using Sam's KD tree-- find the positions on every grid that I want to mark. The only reason to put it in a function, as opposed to the main body, is so I can then use yt's slice and projection tools on the data set.

I'd really like to create a new field I can use with add_slice that only is looped over a single time. I appreciate this line:

    for n in range(nflaggedcells):
        new_data[[data["x"] == xpos and data["y"] == ypos and data["z"] == zpos]] = 1.0

then won't work, but I can just store a list of the grids and cell numbers and mark up new_data using those instead.

Does that make sense?

Thanks!

Elizabeth
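A side note on the flagging line in the message above: with NumPy arrays, Python's `and` tries to reduce each comparison to a single truth value and raises `ValueError`, so the three comparisons need the element-wise `&` instead. The sketch below is only an illustration under assumptions: `x`, `y`, `z` and `flagged` are made-up stand-ins rather than yt data, and `np.isclose` is used in place of `==` on the assumption that the positions are floating-point cell centres.

```python
import numpy as np

# Illustrative stand-ins for data["x"], data["y"], data["z"] on a 4x4x4 grid
# of cell centres in the unit cube (not real yt output).
centres = (np.arange(4) + 0.5) / 4.0
x, y, z = np.meshgrid(centres, centres, centres, indexing="ij")

# Hypothetical list of positions to flag.
flagged = [(0.125, 0.375, 0.625), (0.875, 0.875, 0.125)]

new_data = np.zeros(x.shape, dtype="float64")
for xpos, ypos, zpos in flagged:
    # "&" combines the boolean arrays element by element; "and" would raise
    # ValueError because an array has no single truth value.
    mask = np.isclose(x, xpos) & np.isclose(y, ypos) & np.isclose(z, zpos)
    new_data[mask] = 1.0

# Show the failure mode of the "and" form on arrays.
try:
    bad = (x == 0.125) and (y == 0.375)
    and_failed = False
except ValueError:
    and_failed = True
```

Each pass of the loop still scans the whole array, so for many flagged cells it is likely cheaper to store cell indices and assign to them directly, as suggested at the end of the message.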


Matthew Turk

8 Nov 2011

12:56 p.m.

Hi Elizabeth,

On Tue, Nov 8, 2011 at 3:33 AM, Elizabeth Tasker wrote:

> Does yt call _MyFunction for each grid? So it runs through that routine many times, once per grid?

When you mandate ValidateSpatial, with a single ghost zone, it actually doesn't exactly call it for every grid. It calls it for every *covering* grid. So it's worse than that: it fills in a buffer zone of one cell through cascading interpolation, then it calculates your data field, then it returns that portion to the grid. It then masks them, concatenates them, and sets dd.field_data["MyFunction"] to the resulting flattened array; the dd["MyFunction"] call then returns this.
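The sequence described above (pad each grid with a ghost zone, evaluate the field on the padded block, hand the interior back to the grid, then mask and concatenate) can be mimicked with plain NumPy. This is a toy illustration only, not yt's implementation: `np.pad` with edge replication stands in for the cascading interpolation, `_my_function` is a placeholder field, and the grids and masks are invented.

```python
import numpy as np

calls = 0

def _my_function(padded):
    """Placeholder field: mean of the two x-neighbours, so it needs one ghost zone."""
    global calls
    calls += 1
    out = np.zeros_like(padded)
    out[1:-1, 1:-1, 1:-1] = 0.5 * (padded[2:, 1:-1, 1:-1] + padded[:-2, 1:-1, 1:-1])
    return out

# Two toy "grids", each with a mask marking the cells that belong to the selection.
grids = [np.arange(8, dtype="float64").reshape(2, 2, 2),
         np.ones((2, 2, 2))]
masks = [np.ones((2, 2, 2), dtype=bool), np.ones((2, 2, 2), dtype=bool)]
masks[0][0, 0, 0] = False   # pretend this cell is covered by a finer grid

pieces = []
for grid, mask in zip(grids, masks):
    padded = np.pad(grid, 1, mode="edge")   # fill a one-cell buffer zone
    field = _my_function(padded)            # evaluate on the padded block
    interior = field[1:-1, 1:-1, 1:-1]      # keep only the grid's own cells
    pieces.append(interior[mask])           # masking also flattens

flat = np.concatenate(pieces)               # analogue of the flattened result
```

The point of the sketch is the call count: the field function runs once per padded block, so any expensive search inside it is repeated that many times.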

> If so, is there an alternative? One sweep through the function will --using Sam's KD tree-- find the positions on every grid that I want to mark. The only reason to put it in a function, as opposed to the main body, is so I can then use yt's slice and projection tools on the data set.

Yes, you should do this then. The problem will be taking the values and turning them back into grid values, as they'll be vertex-centered and in partitioned grids. Sam can likely describe how to do that.

> I'd really like to create a new field I can use with add_slice that only is looped over a single time. I appreciate this line:
>
>     for n in range(nflaggedcells):
>         new_data[[data["x"] == xpos and data["y"] == ypos and data["z"] == zpos]] = 1.0
>
> then won't work, but I can just store a list of the grids and cell numbers and mark up new_data using those instead.

Hm, slices also have "x", "y", "z" values. If you don't mandate ValidateSpatial, accessing a slice field should only read the field dependencies from disk (as slices) and then calculate. For instance, if you ask for H2I_Fraction from a slice, it will only read H2I_Density and Density in that slice (not in each grid, but in the appropriate subselection of a grid that intersects a slice) and then do the derived field function on the field_data values, not on the grid values.

-Matt
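To make the slice case concrete, here is a rough sketch with a plain dictionary standing in for a slice's field_data. Only the two dependencies are present, holding just the cells the slice intersects, and the derived function runs once over them. The function name, the numbers, the dictionary, and the assumption that the fraction is the simple ratio of the two fields are all illustrative, not yt internals.

```python
import numpy as np

def _h2i_fraction(field, data):
    """Derived field in the same (field, data) style used in this thread."""
    return data["H2I_Density"] / data["Density"]

# Stand-in for the cells a slice intersects: flat 1-D arrays, one entry per cell.
slice_field_data = {
    "Density": np.array([1.0, 2.0, 4.0, 8.0]),
    "H2I_Density": np.array([0.5, 0.5, 1.0, 2.0]),
}

# One call covers every cell in the slice; no per-grid loop is involved.
h2i_fraction = _h2i_fraction(None, slice_field_data)
```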

_______________________________________________
yt-users mailing list
yt-users@lists.spacepope.org
http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org

Elizabeth Tasker

1:42 p.m.

Hi Matt,

So the repeated calls of the function come from the ValidateSpatial part of the function call?

> Yes, you should do this then. The problem will be taking the values and turning them back into grid values, as they'll be vertex-centered and in partitioned grids. Sam can likely describe how to do that.

I'm really sorry, but I don't understand. Sam has designed a routine that takes a position and gives back the grid and cell number of the neighbours (which is what I happen to need). I then access their data via

    grids[n]["Density"][cis[n]]

I think I can then use grids[n] and cis[n] to set the value in my new_data array to be what I want?

But basically, I can't create a function that I can plug into slice or projection without it running through the function a gazillion times?

Elizabeth

Matthew Turk

2:19 p.m.

Hi Elizabeth,

On Tue, Nov 8, 2011 at 8:42 AM, Elizabeth Tasker wrote:

> Hi Matt,
>
> So the repeated calls of the function come from the ValidateSpatial part of the function call?
>
>> Yes, you should do this then. The problem will be taking the values and turning them back into grid values, as they'll be vertex-centered and in partitioned grids. Sam can likely describe how to do that.
>
> I'm really sorry, but I don't understand. Sam has designed a routine that takes a position and gives back the grid and cell number of the neighbours (which is what I happen to need). I then access their data via
>
>     grids[n]["Density"][cis[n]],
>
> I think I can then use grids[n] and cis[n] to set the value in my new_data array to be what I want?

Oh, yes. You can.

> But basically, I can't create a function that I can plug into slice or projection without it running through the function a gazillion times?

You definitely can for slices. If you define a derived field that is not ValidateSpatial, it will only be called once, on the slice. The flow looks like this: inside get_data, for every field requested (in this case a derived field), it checks to see if it can generate that field. If it can (i.e., it's not a grid-requiring or spatial-requiring field) it will do so. If the field exists in the grid already, it will not be regenerated.

Unfortunately projections do require touching every grid.

-Matt
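The flow described above amounts to generate-on-demand with caching. The toy container below sketches that idea only; the class, its attributes, and the `needs_grids` flag are hypothetical and do not mirror yt's actual get_data.

```python
class ToyContainer:
    """Hypothetical data container that generates derived fields on demand."""

    def __init__(self, on_disk, derived):
        self.field_data = dict(on_disk)   # fields already available
        self.derived = derived            # name -> (function, needs_grids)
        self.calls = {}                   # how often each function ran

    def __getitem__(self, name):
        if name not in self.field_data:   # existing fields are never regenerated
            self.get_data(name)
        return self.field_data[name]

    def get_data(self, name):
        function, needs_grids = self.derived[name]
        if needs_grids:
            # A spatial field would have to be built grid by grid instead.
            raise NotImplementedError("per-grid generation is not sketched here")
        self.calls[name] = self.calls.get(name, 0) + 1
        self.field_data[name] = function(self)


def _double_density(data):
    """Illustrative derived field with a single dependency."""
    return [2.0 * value for value in data["Density"]]


sl = ToyContainer({"Density": [1.0, 2.0, 3.0]},
                  {"DoubleDensity": (_double_density, False)})
first = sl["DoubleDensity"]    # generated once, on the whole selection
second = sl["DoubleDensity"]   # served from field_data, not recomputed
```

Under this sketch, the second access returns the stored result, which is the behaviour the reply attributes to fields that already exist.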


Elizabeth Tasker

2:29 p.m.

>> But basically, I can't create a function that I can plug into slice or projection without it running through the function a gazillion times?
>
> You definitely can for slices. If you define a derived field that is not ValidateSpatial, it will only be called once, on the slice. The flow looks like this: inside get_data, for every field requested (in this case a derived field), it checks to see if it can generate that field. If it can (i.e., it's not a grid-requiring or spatial-requiring field) it will do so. If the field exists in the grid already, it will not be regenerated.
>
> Unfortunately projections do require touching every grid.

Hmm, OK, I shall go away and think about this. Perhaps I could calculate my data set just once and then write a routine that simply checks which values are set on each grid, rather than perform the actual calculation...

Thanks!

Elizabeth
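The idea in this last message, doing the expensive search once and letting the per-grid routine only look up which cells are set, might look roughly like the sketch below. Everything in it is assumed for illustration: `find_flagged_cells` stands in for the KD-tree search, grids are identified by integer ids, and `grid_id` and `shape` are passed explicitly rather than taken from a yt data object.

```python
import numpy as np
from collections import defaultdict

search_calls = 0

def find_flagged_cells():
    """Stand-in for the expensive search: returns (grid id, cell index) pairs."""
    global search_calls
    search_calls += 1
    return [(0, (0, 1, 1)), (0, (1, 0, 0)), (2, (1, 1, 1))]

_flag_table = None

def flagged_cells_by_grid():
    """Run the search on first use only and group the result by grid."""
    global _flag_table
    if _flag_table is None:
        _flag_table = defaultdict(list)
        for grid_id, cell in find_flagged_cells():
            _flag_table[grid_id].append(cell)
    return _flag_table

def my_field_for_grid(grid_id, shape):
    """Cheap per-grid step: mark the stored cells, do no searching."""
    new_data = np.zeros(shape, dtype="float64")
    for cell in flagged_cells_by_grid().get(grid_id, []):
        new_data[cell] = 1.0
    return new_data

# Called once per grid, but the search itself runs a single time.
fields = [my_field_for_grid(g, (2, 2, 2)) for g in range(3)]
```

The per-grid function is still called for every grid, which matches the remark that projections touch every grid, but each call is reduced to a dictionary lookup and a few assignments.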
