[Matplotlib-devel] Units discussion...
Antony Lee
antony.lee at berkeley.edu
Thu Feb 8 12:37:15 EST 2018
The problem here (as you mentioned) is that close to everything
is a public API in Matplotlib, and I believe quite strongly that it is
unreasonable to make every function check for unitized data (and what about
attributes? are bboxes and transforms supposed to handle units too?). For
example, this leads to Line2D.get_data having the orig=True|False kwarg and
the class needs to internally keep both unitized and deunitized data
around; duplicating this support throughout all artists would be a *lot* of
code.
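To make the duplication concrete, here is a toy sketch in plain Python. It is
not Matplotlib code: `ToyLine` and `to_float` are made-up names, and the
date-to-ordinal conversion is only an assumed stand-in for a real converter.
It shows an artist that has to keep both copies of its data around just to
serve an `orig` switch:

```python
from datetime import date


def to_float(d):
    """Hypothetical converter: map a date to its ordinal as a float."""
    return float(d.toordinal())


class ToyLine:
    """Toy artist (not Line2D) that stores its data twice."""

    def __init__(self, x):
        self.set_data(x)

    def set_data(self, x):
        self._xorig = list(x)                # unitized copy, as passed in
        self._x = [to_float(v) for v in x]   # deunitized copy used for drawing

    def get_data(self, orig=True):
        # The orig flag selects which of the two stored copies is returned.
        return list(self._xorig) if orig else list(self._x)


line = ToyLine([date(2018, 2, 7), date(2018, 2, 8)])
unitized = line.get_data(orig=True)
deunitized = line.get_data(orig=False)
```

Every setter and getter on every artist would need the same two-copy
bookkeeping, which is the cost I am objecting to.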
While Axes methods can reasonably support units out of the box, I think it
is more reasonable to have (up to bikeshedding) `Axes.unitize` and
`Axes.deunitize` and then have people who need to play with the artists
themselves do e.g. `artist.set_data(artist.axes.deunitize(unitized_data))`
and `artist.axes.unitize(artist.get_data())`. Yes, I realize this may be
more work for you, but it's also a tradeoff of less work for us :-) With
this design, another possibility (which I guess Tom is not going to like,
but I actually think is reasonable) would be for you to patch all the
Artist classes yourself to support unitized data in all methods you want
(using the proper wrapper methods).
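A minimal sketch of what that could look like, using stand-in classes only.
`unitize` and `deunitize` are the names proposed above and do not exist in
Matplotlib; `Km`, `ToyAxes` and `ToyArtist` are hypothetical names made up for
illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Km:
    """Stand-in unit type; real code would use a proper unit library."""
    value: float


class ToyAxes:
    """Stand-in for an Axes that owns the conversion in both directions."""

    def deunitize(self, data):
        # Unit objects -> plain floats.
        return [float(d.value) for d in data]

    def unitize(self, floats):
        # Plain floats -> unit objects.
        return [Km(f) for f in floats]


class ToyArtist:
    """Stand-in artist that only ever stores plain floats."""

    def __init__(self, axes):
        self.axes = axes
        self._data = []

    def set_data(self, floats):
        self._data = [float(f) for f in floats]

    def get_data(self):
        return list(self._data)


artist = ToyArtist(ToyAxes())
artist.set_data(artist.axes.deunitize([Km(1.0), Km(2.5)]))
raw = artist.get_data()
roundtrip = artist.axes.unitize(raw)
```

The artist stays unit-free, and the conversion lives in exactly one place
that callers reach through `artist.axes`.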
The "moving target" part is basically that there has never been complete
support for units everywhere in the code base, and because things are added
in a piecemeal fashion rather than with a well thought-out design, I'm a
bit tired of the constant stream of "oh, this function doesn't support
datetimes, we need to fix it". Again, I believe rethinking the design in a
comprehensive fashion would help with that.
Antony
2018-02-08 18:13 GMT+01:00 Drain, Theodore R (392P) <
theodore.r.drain at jpl.nasa.gov>:
> Sorry - that's not what I meant. The unit conversions API that's in place
> works fine. I can't think of a better way to describe the use cases than
> the basic ones that seem (at least to me) to be obvious. Numbers with
> units (5*km) and time classes (datetime or some other time class like we
> use) are the primary use case. Another way to say it is that users have
> data where the normal representation is not float and they want to plot it,
> control how the transformation to float is done (plot in km or miles, in
> UTC or GPS time) and manipulate the plot after it's plotted (get bounds,
> change bounds, change units, move artists, edit data, etc) in the non-float
> representation that their data is already in.
>
> I realize that units are "a pain", but they're hugely useful. Just
> plotting datetimes is going to be a pain without units (and was a huge pain
> before the unit system). The proposal that only Axes supports units is
> going to cause us a massive problem as that's rarely everything that we do
> with a plot. I could do a survey to find all the interactions we use (and
> that doesn't even touch the 1000's of lines of code our users have written)
> if that would help but anything that's part of the public api (axes,
> artists, patches, etc) is probably being used - i.e. pretty much anything
> that's in the current user's guide is something that we use/want/need to
> work with unitized data.
>
> This is kind of what I meant in my previous email about use cases. Saying
> "just Axes has units" is basically saying the only valid unit use case is
> create a plot one time and look at it. You can't manipulate it, edit it,
> or build any kind of plotting GUI application (which we have many of) once
> the plot has been created. The Artist classes are one of the primary API's
> for applications. Artists are created, edited, and manipulated if you want
> to allow the user to modify things in a plot after it's created. Even
> the most basic cases like calling Line2D.set_data() wouldn't be allowed
> with units if only Axes has unit support.
>
> I'm not sure I understand the statement that units are a moving target.
> The reason it keeps popping up is that code gets added without someone
> considering units, which then triggers bug reports that require fixing.
> If there was a clearer policy and new code was required to have test cases
> that cover non-unit and unit inputs, I think things would go much
> smoother. We'd be happy to help with submitting new test cases to cover
> unit cases in existing code once a policy is decided on. Maybe what's
> needed is better documentation for developers who don't use units so they
> can easily write a test case with units when adding/modifying functionality.
>
> Ted
>
> ________________________________________
> From: anntzer.lee at gmail.com <anntzer.lee at gmail.com> on behalf of Antony
> Lee <antony.lee at berkeley.edu>
> Sent: Thursday, February 8, 2018 8:09 AM
> To: Drain, Theodore R (392P)
> Cc: matplotlib development list
> Subject: Re: [Matplotlib-devel] Units discussion...
>
> I'm momentarily a bit away from Matplotlib development due to real life
> piling up, so I'll just keep this short.
>
> One major point (already mentioned by others) that led, I think, to some
> devs (including myself) being relatively dismissive about unit support is
> the lack of a well-defined use case, other than "it'd be nice if we supported
> units" (i.e., especially from the point of view of devs who *don't* use
> units themselves, it ends up being an ever moving target). In particular,
> tests on unit support ("unit unit tests"? :-)) currently only rely on the
> old JPL unit code that ended up integrated into Matplotlib's test suite,
> but does not test integration with the two major unit packages I am aware
> of (pint and astropy.units).
>
> From the email of Ted it appears that these are not sufficient to
> represent all kinds of relevant units. In particular, I was at some point
> hoping to completely work in deunitized data internally, *including the
> plotting*, and rely on the fact that the deunitized and the unitized
> data are usually linked by an affine transform, so the plotting part
> doesn't need to convert back to unitized data and we only need to place and
> label the ticks accordingly; however Ted mentioned relativistic units,
> which imply the use of a non-affine transform. So I think it would also be
> really helpful if JPL could release some reasonably documented unit library
> with their actual use cases (and how it differs from pint & astropy.units),
> so that we know better what is actually needed (I believe carrying the JPL
> unit code in our own code base is a mistake).
>
> As for the public vs private, or rather unitized vs deunitized API
> discussion, I believe a relatively simple and consistent line would be to
> make Axes methods unitized and everything else deunitized (but with clear
> ways to convert to and from unitized data when not using Axes methods).
>
> Antony
>
> 2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <
> theodore.r.drain at jpl.nasa.gov>:
> That sounds fine to me. Our original unit prototype API actually had
> conversions for both directions but I think the float->unit version was
> removed (or really moved) when the ticker/formatter portion of the unit API
> was settled on.
>
> Using floats/numpy arrays internally is going to be easier and faster so I
> think that's a plus. The biggest issue we're going to run into is what's
> defined as "internal" vs part of the unit API. Some things are easy like
> the Axes/Axis API. But we also use low level API's like the patches. Are
> those unitized? This is the pro and con of using something like Python
> where basically everything is public. It makes it possible to do lots of
> things, but it's much harder to define a clear library with a specific
> public API.
>
> Somewhere in the process we should write a proposal that outlines which
> classes/methods are part of the unit api and which are going to be
> considered internal. I'm sure we can help with that effort.
>
> That also might help clarify/influence code structure - if internal
> implementation classes are placed in a sub-package inside MPL 3.0, it
> becomes clearer to people later on what is the "official" public API vs what
> can be optimized to just use floats. Obviously the dev's would need to
> decide if that kind of restructuring is worth it or not.
>
> Ted
>
> ________________________________________
> From: David Stansby <dstansby at gmail.com>
> Sent: Wednesday, February 7, 2018 3:42 AM
> To: Jody Klymak
> Cc: Drain, Theodore R (392P); matplotlib development list
> Subject: Re: [Matplotlib-devel] Units discussion...
>
> Practically, I think what we are proposing is that for unit support the
> user must supply two functions for each axis:
>
> * A mapping from your unit objects to floating point numbers
> * A mapping from those floats back to your unit objects
>
> As far as I know function 2 is new, and doesn't need to be supplied at the
> moment. Doing this would mean we can convert units as soon as they enter
> Matplotlib, only ever have to deal with floating point numbers internally,
> and then use the second function as late as possible when the user requests
> stuff like e.g. the axis limits.
>
> Also worth noting that any major change like this will go into Matplotlib
> 3.0 at the earliest, so will be Python 3 only.
>
> David
>
> On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
> Dear Ted,
>
> Thanks so much for engaging on this.
>
> Don’t worry, nothing at all is changing w/o substantial back and forth,
> and OK from downstream users. I actually don’t think it’ll be a huge
> change, probably just some clean up and better documentation.
>
> FWIW, I’ve not personally done much programming w/ units, just been a bit
> perplexed by their inconsistent and (to my simple mind) convoluted
> application in the codebase. Having experience from people who try to use
> them everyday will be absolutely key.
>
> Cheers, Jody
>
> > On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <
> theodore.r.drain at jpl.nasa.gov> wrote:
> >
> > We use units for everything in our system (in fact, we funded John
> Hunter originally to add in a unit system so we could use MPL) so it's a
> crucial system for us. In our system, we have our own time classes (which
> handle relativistic time frames as well as much higher precision
> representations) and a custom unit system for floating point values.
> >
> > I think it's important to talk about these changes in concrete terms. I
> understand the words you're using, but I'm not really clear on what the
> real proposed changes are. For example, the current unit API returns a
> units.AxisInfo object so the converter can set the formatter and locators
> to use. Is that what you mean in the 2nd paragraph about ticks and
> labels? Or is that changing?
> >
> > The current unit api is pretty simple and in units.ConversionInterface.
> Are any of these changes going to change the conversion API? (note - I'm
> not against changing it - I'm just not sure if there are any changes or
> not).
> >
> > Another thing to consider: many of the examples people use are scripts
> which make a plot and stop. But there are other use cases which are more
> complicated and stress the system in different ways. We write several GUI
> applications (in PyQt) that use MPL for plotting. In these cases, the user
> is interacting with the plot to add and remove artists, change styles,
> modify data, etc etc. So having a good object oriented API for modifying
> things after construction is important for this to work. So when units are
> involved, it can't be a "convert once at construction" and never touch
> units again. We are constantly adjusting limits, moving artists, etc in
> unitized space after the plot is created.
> >
> > So in addition to the ConversionInterface API, I think there are other
> items that would be useful to explicitly spell out. Things like which
> API's in MPL should accept units and which won't and which methods return
> unitized data and which don't. It would be nice if there was a clear
> policy on this. Maybe one exists and I'm not aware of it - it would be
> helpful to repeat it in a discussion on changing the unit system.
> Obviously I would love to have every method accept and return unitized data
> :-).
> >
> > I bring this up because I was just working on a hover/annotation class
> that needed to move a single annotation artist with the mouse. To move the
> annotation box the way I needed to, I had to set one private member
> variable, call two set methods, use attribute assignment for one value, and
> set one semi-public member variable - some of which work with units and
> some of which didn't. I think having a clear "this kind of method
> accepts/returns units" policy would help when people are adding new
> accessors/methods/variables to make it more clear what kind of data is
> acceptable in each.
> >
> > Ted
> > ps: I may be able to help with some resources to work on any unit
> upgrades, but to make that happen I need to get a clear statement of what
> problem is being solved and the scope of the work so I can explain to our
> management why it's important.
> >
> > ________________________________________
> > From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=
> jpl.nasa.gov at python.org> on behalf of
> Jody Klymak <jklymak at uvic.ca>
> > Sent: Saturday, February 3, 2018 9:25 PM
> > To: matplotlib development list
> > Subject: [Matplotlib-devel] Units discussion...
> >
> > Hi all,
> >
> > To carry on the gitter discussion about unit handling, hopefully to lead
> to more stringent documentation and implementation….
> >
> > In response to @anntzer I thought about the units support a bit - it
> seems that rather than a transform, a more straightforward approach is to
> have the converter map to float arrays in a unique way. This float mapping
> would be completely analogous to `date2num` in `dates`, in that it doesn’t
> change and is perfectly invertible without matplotlib ever knowing about
> the unit information, though the axis could store it for the tick
> locators and formatters. It would also have an inverse that would supply
> data back to the user in unit-aware data (though not necessarily in the
> unit that the user supplied, e.g. if they supply 8*in and the
> converter converts everything to meter floats, then the returned unitized
> inverse would be 0.203*m, or whatever convention the converter wants to
> supply.).
> >
> > User “unit” control, i.e. making the plot in inches instead of m, would
> be accomplished with ticks locators and formatters. Matplotlib would never
> directly convert between cm and inches (any more than it converts from days
> to hours for dates), the downstream-supplied tick formatter and labeller
> would do it.
> >
> > Each axis would only get one converter, set by the first call to the
> axis. Subsequent calls to the axis would pass all data (including bare
> floats) to the converter. If the converter wants to pass bare floats then
> it can do so. If it wants to accept other data types then it can do so.
> It should be possible for the user to clear or set the converter, but then
> they should know what they are doing and why.
> >
> > What's missing? I don’t think this is wildly different than what we
> have, but maybe a bit more clear.
> >
> > Cheers, Jody
> >
> >
> >
> >
> > _______________________________________________
> > Matplotlib-devel mailing list
> > Matplotlib-devel at python.org
> > https://mail.python.org/mailman/listinfo/matplotlib-devel