Something that's come up a lot for me lately is hearing from people whose
datasets are mostly recognized by yt but have a few extra fields that
aren't normally present. I think that in some cases these can be read in,
but they are returned as dimensionless. Even then, the missing units cause
a number of cumbersome side effects in analysis that I won't get into
unless someone is interested.
Unless I'm mistaken, the only current solution to this is to modify the
known_other_fields and known_particle_fields tuples in that frontend's
fields.py file. I think there are a lot of benefits to creating a means for
users to "declare" additional on-disk fields for a dataset.
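For anyone who hasn't looked inside a frontend's fields.py, the entries in
those tuples look roughly like the sketch below: an on-disk name paired with
(units, aliases, display name). The field names and units here are invented
for illustration, and units_for is a hypothetical helper of mine, not
something in yt. It only shows that a field missing from the tuple has no
units to attach.

```python
# Sketch of the entry layout used in a frontend's fields.py:
#   (on-disk name, (units, [aliases], display_name))
# The specific fields below are invented for illustration.
known_other_fields = (
    ("Density", ("code_mass/code_length**3", ["density"], None)),
    ("Temperature", ("K", ["temperature"], None)),
)


def units_for(on_disk_name, known_fields):
    """Hypothetical helper: return the declared units, or None if unknown."""
    for name, (units, aliases, display_name) in known_fields:
        if name == on_disk_name:
            return units
    # A field that is not listed has no units associated with it.
    return None


density_units = units_for("Density", known_other_fields)
extra_units = units_for("MyExtraField", known_other_fields)
```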
Here are some of the big ones, for me at least:
1. There are certain simulation codes with a large number of slight
variations. Keeping up with them all is probably infeasible, especially
when some of these will have relatively short life spans.
2. The current solution requires either that someone maintain a separate
fork (or uncommitted changes!) or that experienced developers intervene,
with the added delay of propagating changes upstream.
3. Perhaps such a system could make it easier for long-lasting variants to
ultimately become officially supported by providing the user a simpler way
to debug issues relating to new fields (like units, proper aliases, etc.).
In fairness, some potential cons:
1. This gives people an excuse to not add official support for new variants
and results in a separate ecosystem of frontends only supported through
2. It could be quite difficult to do for some datasets.
As for how to do this, my impression is that the declarations need to be
provided at load time, before the field_list is assembled: for example, by
handing yt.load some sort of FieldDeclaration object or a series of
dicts/tuples to pass into the FieldInfoContainer.
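To make that a bit more concrete, here is a minimal sketch of what a
declaration could look like. Everything in it is hypothetical:
FieldDeclaration and extend_known_fields do not exist in yt, yt.load takes
no such argument today, and I'm assuming the declarations would be merged
into the same (name, (units, aliases, display_name)) layout the frontends
already use. It's plain Python with no yt import, so it only illustrates
the data flow, not where the hook would actually live.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FieldDeclaration:
    """Hypothetical user-side declaration of one extra on-disk field."""

    name: str
    units: str = ""  # empty string means dimensionless
    aliases: Tuple[str, ...] = ()
    display_name: Optional[str] = None

    def as_known_field(self):
        # Convert to the (name, (units, aliases, display_name)) layout.
        return (self.name, (self.units, list(self.aliases), self.display_name))


def extend_known_fields(known_fields, declarations):
    """Return a new tuple with the declarations appended.

    Raises ValueError if a declared name clashes with an existing entry,
    so a user cannot silently override a field the frontend already knows.
    """
    seen = {name for name, _ in known_fields}
    extra = []
    for decl in declarations:
        if decl.name in seen:
            raise ValueError(f"field {decl.name!r} is already known")
        seen.add(decl.name)
        extra.append(decl.as_known_field())
    return tuple(known_fields) + tuple(extra)


# Invented example entries, for illustration only.
known = (("Density", ("code_mass/code_length**3", ["density"], None)),)
declared = [
    FieldDeclaration(
        "DustDensity", "code_mass/code_length**3", aliases=("dust_density",)
    ),
]
merged = extend_known_fields(known, declared)
```

Rejecting clashes rather than overriding is just one possible choice; letting
users override units on an existing field might be worth allowing too.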
Anyway, I think this could be very helpful to a lot of people, but I'm very
interested to hear any thoughts on whether this is a good idea and how we
might do it.