[yt-dev] Re: seeking input: supporting unrecognized fields on disk

1 Feb 2019


I am a big proponent of this fix, as someone who has dealt with lots of
different codebases and a seemingly ever-changing set of field
definitions (e.g., Gadget).  Nathan's ideas sound pretty interesting.  One
could then just keep a config file alongside certain datasets, which would
make this solution pretty modular and easy for new users working with
someone's old dataset.  Good ideas!

Cameron

On Fri, Feb 1, 2019 at 1:37 PM Nathan Goldbaum wrote:
Hi Britton,
I have long-term plans to fix this as part of a general reworking of the
field system. My idea is to replace known_other_fields and
known_particle_fields with a declarative file in JSON or TOML format
listing the known fields for that frontend. Then, for people who wish to
override the default behavior, they could supply their own configuration
file.
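
A minimal sketch of what such a declarative file and a user override might look like, using JSON and only the standard library. The schema (the keys "name", "units", "aliases", "display_name") and the helper names parse_declaration and merge_declarations are assumptions made for illustration, not an existing yt format or API. The output mirrors the (name, (units, aliases, display_name)) layout that, as far as I understand it, known_other_fields and known_particle_fields use today.

```python
import json

# Hypothetical declaration file contents. The schema (keys "name", "units",
# "aliases", "display_name") is an assumption, not an existing yt format.
DEFAULT_DECLARATION = """
{
  "known_other_fields": [
    {"name": "Density", "units": "code_mass/code_length**3",
     "aliases": ["density"]}
  ],
  "known_particle_fields": [
    {"name": "particle_mass", "units": "code_mass", "aliases": []}
  ]
}
"""

# Hypothetical user-supplied file that adds one extra on-disk field.
USER_DECLARATION = """
{
  "known_other_fields": [
    {"name": "Dust_Density", "units": "code_mass/code_length**3",
     "aliases": ["dust_density"], "display_name": "Dust Density"}
  ]
}
"""

SECTIONS = ("known_other_fields", "known_particle_fields")


def parse_declaration(text):
    """Return {section: {field name: (units, aliases, display_name)}}."""
    raw = json.loads(text)
    parsed = {}
    for section in SECTIONS:
        parsed[section] = {
            entry["name"]: (
                entry.get("units", ""),
                list(entry.get("aliases", [])),
                entry.get("display_name"),
            )
            for entry in raw.get(section, [])
        }
    return parsed


def merge_declarations(defaults, overrides):
    """Build the tuple-of-tuples layout, letting user entries win by name."""
    merged = {}
    for section in SECTIONS:
        fields = dict(defaults[section])
        fields.update(overrides[section])
        merged[section] = tuple(fields.items())
    return merged


known = merge_declarations(
    parse_declaration(DEFAULT_DECLARATION),
    parse_declaration(USER_DECLARATION),
)
for name, (units, aliases, display_name) in known["known_other_fields"]:
    print(name, units, aliases, display_name)
```

Merging by field name means a user file can both add new fields and correct the units or aliases of a field the frontend already knows about; whether overriding existing entries should be allowed is a design choice this sketch does not settle.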
We had some discussion about this back in late 2016, but unfortunately
that didn't go anywhere.
It sounds like you're motivated to get a solution implemented though. If
you'd like to help out with pushing this effort forward I'd be happy to
discuss further.
-Nathan
On Fri, Feb 1, 2019 at 3:31 PM Britton Smith wrote:
Hi everyone,
Something that's come up for me a lot lately is hearing from people who
have datasets that are mostly recognized by yt, but have a few extra fields
that aren't normally present. I think that in some cases, these can be read
in, but are returned as dimensionless. Even then, this produces a number of
cumbersome side-effects for using them in analysis that I won't bother to
get into unless someone is interested.
Unless I'm mistaken, the only current solution to this is to modify the
known_other_fields and known_particle_fields tuples in that frontend's
fields.py file. I think there are a lot of benefits to creating a means for
users to "declare" additional on-disk fields for a dataset.
Here are some of the big ones, for me at least:
1. There are certain simulation codes with a large number of slight
variations. Keeping up with them all is probably unfeasible, especially
when some of these will have relatively short life spans.
2. The current solution requires either that someone maintain a separate
fork (or uncommitted changes!) or that experienced developers intervene,
plus the delay associated with propagating changes upstream.
3. Perhaps such a system could make it easier for long-lasting variants
to ultimately become officially supported by providing the user a simpler
way to debug issues relating to new fields (like units, proper aliases,
etc.).
In fairness, some potential cons:
1. This gives people an excuse to not add official support for new
variants and results in a separate ecosystem of frontends only supported
through this method.
2. It could be quite difficult to do for some datasets.
As for how to do this, my impression is that something needs to be
provided at load time before the field_list is assembled, something like
handing yt.load some sort of FieldDeclaration object or a series of
dicts/tuples to pass into the FieldInfoContainer.
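
To make the load-time idea concrete, here is a rough sketch of what a FieldDeclaration object could look like and how a set of declarations might be folded into a frontend's known fields before the field_list is assembled. Everything here is hypothetical: the class layout, the as_known_field method, and the extend_known_fields helper are assumptions for illustration, and nothing in this sketch calls into yt or reflects an existing yt.load argument.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FieldDeclaration:
    """Hypothetical user-side description of one extra on-disk field."""

    name: str
    units: str = ""  # empty string stands for dimensionless
    aliases: Tuple[str, ...] = ()
    display_name: Optional[str] = None

    def as_known_field(self):
        # Same (name, (units, aliases, display_name)) shape as the entries
        # in a frontend's known_other_fields / known_particle_fields.
        return (self.name, (self.units, list(self.aliases), self.display_name))


def extend_known_fields(known_fields, declarations):
    """Return known_fields plus the declared fields, rejecting name clashes."""
    seen = {name for name, _ in known_fields}
    extra = []
    for declaration in declarations:
        if declaration.name in seen:
            raise ValueError(f"field {declaration.name!r} is already known")
        seen.add(declaration.name)
        extra.append(declaration.as_known_field())
    return tuple(known_fields) + tuple(extra)


# Stand-in for a frontend's existing tuple; the entry is illustrative only.
frontend_known = (
    ("Density", ("code_mass/code_length**3", ["density"], None)),
)

declared = [
    FieldDeclaration(
        name="Dust_Density",
        units="code_mass/code_length**3",
        aliases=("dust_density",),
    ),
]

all_known = extend_known_fields(frontend_known, declared)
for name, info in all_known:
    print(name, info)
```

This version refuses to redeclare a field the frontend already knows; allowing overrides instead would be an equally reasonable choice and only changes the clash handling.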
Anyway, I think this could be very helpful to a lot of people, but I'm
very interested to hear any thoughts on whether this is a good idea and how
we might do it.
Thanks!
Britton
_______________________________________________
yt-dev mailing list -- yt-dev@python.org
To unsubscribe send an email to yt-dev-leave@python.org
-- 
Cameron Hummels
NSF Postdoctoral Fellow
Department of Astronomy
California Institute of Technology
http://chummels.org