I think Matt is right on this. The piter function is powerful and simple. It gives you full control over parallel functionality and also just hands users each pf, which makes writing new analysis much easier. It may be counterproductive to continue to maintain the AnalysisTask functionality.
Sorry for being a bit out of touch the last couple days.
So this is in place for exactly the reasons Britton outlines -- it's
because in theory, the time series data is meant to have a flexible
set of return values. For instance, while John notes that the
*default* behavior is to cycle through the list multiple times, you
can also call "eval" on TimeSeries to make it operate much more in
sequence on a series of operators.
In practice, I think that this method is not necessarily as useful
as the more generic .piter() method. Initially I thought, well, let's
make it so that you can swap out a pf for a ts and still expect to get
similar or identical return values. In retrospect, this is overly
clever. Reading the source, I can't imagine anyone dealing with an
actual AnalysisTask, or anything like that, when the much more
convenient .piter() is available, with the storage keyword.
The reason that the list is of additional dimensionality is to ensure
that during the par_combine_object, the lists are concatenated
correctly and mapped back in the right order to the original items.
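To make the ordering point concrete, here is a rough sketch in plain Python. It does not use yt at all: worker_output and combine are made-up names, and the round-robin split is only an assumption about how items might be handed out, not a description of what par_combine_object actually does.

```python
def worker_output(rank, nworkers, items, tasks):
    # Hypothetical round-robin split: worker `rank` handles items
    # rank, rank + nworkers, rank + 2 * nworkers, ...
    out = []
    for index in range(rank, len(items), nworkers):
        # Keep each item's return values in their own list so they stay
        # grouped when the per-worker lists are concatenated.
        out.append((index, [task(items[index]) for task in tasks]))
    return out


def combine(outputs):
    # Concatenate the per-worker lists, then sort by original index so
    # the results line up with the order of the input items.
    merged = []
    for out in outputs:
        merged.extend(out)
    merged.sort(key=lambda pair: pair[0])
    return [results for _, results in merged]


items = [1, 2, 3, 4, 5]
tasks = [lambda x: x * 10, lambda x: -x]
outputs = [worker_output(rank, 2, items, tasks) for rank in range(2)]
combined = combine(outputs)

# With a single task, every item still comes back as a one-element list.
single = combine([worker_output(rank, 2, items, tasks[:1]) for rank in range(2)])
```

The single-task case at the end is the "list of one object" that John describes: the wrapping that keeps multi-task results grouped is still there when only one value comes back.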
So, I guess what I've talked myself into proposing is that we strip
off a lot of the overly clever stuff, and reduce TimeSeries back to
being just a convenient way of addressing multiple objects, ditching
the AnalysisTask stuff and retaining .piter(). Once we have that, we
can also start adding on more interesting things, like inter-timestep
correlations and whatnot.
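For anyone who hasn't used the storage pattern, here is a toy, serial stand-in for it. ToySeries and ResultSlot are hypothetical classes written just for this message and are not yt code; the real .piter() also distributes the work across processors, which this sketch ignores.

```python
class ResultSlot:
    # Hypothetical holder for one output's result.
    def __init__(self, result_id):
        self.result_id = result_id
        self.result = None


class ToySeries:
    # Hypothetical stand-in for a time series: just a list of outputs.
    def __init__(self, outputs):
        self.outputs = list(outputs)

    def piter(self, storage=None):
        for i, output in enumerate(self.outputs):
            if storage is None:
                yield output
                continue
            slot = ResultSlot(i)
            yield slot, output
            # Runs when the caller asks for the next output, so whatever
            # the loop body put in slot.result is stored exactly as-is.
            storage[slot.result_id] = slot.result


ts = ToySeries(["DD0000", "DD0001", "DD0002"])
storage = {}
for slot, output in ts.piter(storage=storage):
    # Placeholder "analysis": double the output number.
    slot.result = int(output[2:]) * 2
```

The user decides what goes into each slot, so there is no extra list wrapped around a single return value.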
On Wed, Jul 11, 2012 at 8:48 AM, John ZuHone <firstname.lastname@example.org> wrote:
> Hi Britton,
> The place to look is the eval function for TimeSeriesData. On line 125 of
> yt/data_objects/time_series.py, store.result is initialized to a list and
> all return values are appended to that list on line 141. This looks to me
> to be handling tasks called with piter that have multiple return values. If
> you add something like:
> if len(store.result) == 1: store.result = store.result[0]
> just outside of the "for task in tasks" loop, it produces the behavior that
> we discussed. However, now that I understand why this was done, I'm not so
> sure it should be changed. Having tasks dispatched with piter send back
> their return values in a list is a desired feature I think, and so I think
> that generality should be preserved. Perhaps there is a solution that could
> only affect getting parameters through params, but I think we should let
> Matt chime in on this, since he is the most knowledgeable about this area of
> the code.
> However, this also seems to affect things not in params in an undesirable
> way. For example:
> from yt.mods import *
> all_files = glob.glob("*/*.hierarchy")
> ts = TimeSeries.from_filenames(all_files)
> sphere = ts.sphere("max", (1.0, "pc"))
> L_vecs = sphere.quantities["AngularMomentumVector"]()
> L_vecs gets returned as something like:
> Where once again you have lists of one object, namely in this case the NumPy
> arrays which are the angular momentum vector. So, generically speaking you
> are always getting an extra list in there you don't need, it seems.
> yt-dev mailing list