[Matplotlib-devel] Units discussion...

Drain, Theodore R (392P) theodore.r.drain at jpl.nasa.gov
Thu Feb 8 12:13:53 EST 2018


Sorry - that's not what I meant.  The unit conversion API that's in place works fine.  I can't think of a better way to describe the use cases than the basic ones that seem (at least to me) to be obvious.  Numbers with units (5*km) and time classes (datetime or some other time class like the ones we use) are the primary use cases.  Another way to say it is that users have data whose normal representation is not float, and they want to plot it, control how the transformation to float is done (plot in km or miles, in UTC or GPS time), and manipulate the plot after it's plotted (get bounds, change bounds, change units, move artists, edit data, etc.) in the non-float representation that their data is already in.
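
For reference, a minimal sketch of a converter written against the existing matplotlib.units.ConversionInterface API.  The Length class, its conversion table and the LengthConverter name are hypothetical and made up for illustration; they are not the JPL unit code, and the choice of "km" as the default unit is an assumption.

```python
from dataclasses import dataclass

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.units as munits
import numpy as np

# Hypothetical conversion table: meters per display unit.
_METERS_PER = {"m": 1.0, "km": 1000.0, "mi": 1609.344}


@dataclass(frozen=True)
class Length:
    """Hypothetical unitized value, stored internally in meters."""
    meters: float

    def to(self, unit):
        return self.meters / _METERS_PER[unit]


class LengthConverter(munits.ConversionInterface):
    @staticmethod
    def convert(value, unit, axis):
        # Turn Length objects into floats expressed in the axis unit.
        if np.iterable(value):
            return [v.to(unit) for v in value]
        return value.to(unit)

    @staticmethod
    def axisinfo(unit, axis):
        # Tell the axis how to format ticks and label itself for this unit.
        return munits.AxisInfo(
            majfmt=mticker.FormatStrFormatter(f"%g {unit}"),
            label=f"distance [{unit}]",
        )

    @staticmethod
    def default_units(x, axis):
        return "km"  # assumed default for this sketch


munits.registry[Length] = LengthConverter()

fig, ax = plt.subplots()
xs = [Length(1000.0 * i) for i in range(5)]
(line,) = ax.plot(xs, [0, 1, 2, 3, 4])

in_km = list(line.get_xdata(orig=False))  # floats in the default unit
ax.xaxis.set_units("m")                   # user changes the display unit
in_m = list(line.get_xdata(orig=False))   # same data, re-converted
```

The user's data stays as Length objects throughout; only the converter decides which floats Matplotlib sees.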

I realize that units are "a pain", but they're hugely useful.  Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system).  The proposal that only Axes supports units is going to cause us a massive problem, as that's rarely everything that we do with a plot.  I could do a survey to find all the interactions we use (and that doesn't even touch the thousands of lines of code our users have written) if that would help, but anything that's part of the public API (axes, artists, patches, etc.) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

This is kind of what I meant in my previous email about use cases.  Saying "just Axes has units" is basically saying the only valid unit use case is to create a plot one time and look at it.  You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created.  The Artist classes are one of the primary APIs for applications.  Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created.  Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.
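
A minimal sketch of that kind of post-creation edit, using plain datetime as a stand-in for a custom time class (the data values are made up).  As far as I know, Line2D.set_data() and Axes.set_xlim() accept datetimes today, while get_xlim() hands back floats that have to be converted by hand with matplotlib.dates.num2date:

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
t0 = dt.datetime(2018, 2, 8)
times = [t0 + dt.timedelta(hours=h) for h in range(4)]
values = [0, 1, 4, 9]
(line,) = ax.plot(times, values)

# Edit the artist after creation, still in the datetime representation.
later = [t + dt.timedelta(days=1) for t in times]
line.set_data(later, values)
ax.set_xlim(later[0], later[-1])

# Reading the limits back gives floats, not datetimes ...
lo, hi = ax.get_xlim()
# ... so the caller converts them back explicitly (num2date returns
# timezone-aware UTC datetimes; the tzinfo is dropped for comparison).
lo_dt = mdates.num2date(lo).replace(tzinfo=None)
hi_dt = mdates.num2date(hi).replace(tzinfo=None)
```

If set_data() only took floats, every call like the one above would need a manual date2num() on the application side.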

I'm not sure I understand the statement that units are a moving target.  The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing.  If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother.  We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on.  Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.
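
A sketch of what such a paired test could look like, using datetime as the unitized input since Matplotlib ships a converter for it.  The helper and test names are hypothetical; the idea is only that the same call is exercised once with unitized data and once with the equivalent floats, and the results must agree:

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np


def converted_xdata(x, y):
    """Plot x/y and return the float x-data the line actually draws."""
    fig, ax = plt.subplots()
    (line,) = ax.plot(x, y)
    xdata = np.asarray(line.get_xdata(orig=False), dtype=float)
    plt.close(fig)
    return xdata


def test_plot_units_and_floats_agree():
    dates = [dt.datetime(2018, 2, day) for day in (6, 7, 8)]
    floats = mdates.date2num(dates)  # the non-unit equivalent
    y = [1.0, 2.0, 3.0]
    np.testing.assert_allclose(
        converted_xdata(dates, y), converted_xdata(floats, y)
    )
```

The same pattern would apply to any new accessor or setter: one case with floats, one with a unitized type.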

Ted

________________________________________
From: anntzer.lee at gmail.com <anntzer.lee at gmail.com> on behalf of Antony Lee <antony.lee at berkeley.edu>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of a well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever-moving target).  In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, and do not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units.  In particular, I was at some point hoping to work completely in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform.  So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).
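
To illustrate the affine assumption, a small sketch in plain Python (no Matplotlib).  The affine() helper is hypothetical, and the square-root map is just an arbitrary non-affine stand-in, not a model of relativistic time.  With an affine link, evenly spaced tick values in the display unit stay evenly spaced in the internal floats, so only tick placement and labels need to know about units; with a non-affine link that no longer holds:

```python
import math


def affine(scale, offset):
    """Return (to_display, to_internal) for display = scale * internal + offset."""
    def to_display(x):
        return scale * x + offset

    def to_internal(d):
        return (d - offset) / scale

    return to_display, to_internal


def gaps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


# Internal floats in kelvin, ticks requested in degrees Fahrenheit.
k_to_f, f_to_k = affine(9 / 5, -459.67)
display_ticks = [0.0, 50.0, 100.0]

# Affine: evenly spaced display ticks map to evenly spaced internal positions.
affine_positions = [f_to_k(d) for d in display_ticks]
affine_gaps = gaps(affine_positions)

# Non-affine stand-in (display = internal ** 2): the spacing is distorted,
# so the drawing itself, not just the tick labels, depends on the mapping.
nonaffine_positions = [math.sqrt(d) for d in display_ticks]
nonaffine_gaps = gaps(nonaffine_positions)
```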

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov>:
That sounds fine to me.  Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster, so I think that's a plus.  The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API.  Some things are easy, like the Axes/Axis API.  But we also use low-level APIs like the patches.  Are those unitized?  This is the pro and con of using something like Python where basically everything is public.  It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit API and which are going to be considered internal.  I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats.  Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

  *   A mapping from your unit objects to floating point numbers
  *   A mapping from those floats back to your unit objects

As far as I know the second function is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible, when the user requests things like the axis limits.
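
A rough sketch of the two mappings in plain Python, to make the idea concrete.  The Meters and Inches classes and the FloatAxis container are hypothetical and are not Matplotlib API; FloatAxis only stands in for "the inside of Matplotlib", which would see floats and nothing else:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meters:
    value: float


@dataclass(frozen=True)
class Inches:
    value: float


def to_float(quantity):
    """First mapping: unit objects -> floats (internally always meters)."""
    if isinstance(quantity, Meters):
        return quantity.value
    if isinstance(quantity, Inches):
        return quantity.value * 0.0254
    raise TypeError(f"unsupported quantity: {quantity!r}")


def from_float(x):
    """Second mapping: floats -> unit objects (by convention, meters)."""
    return Meters(x)


class FloatAxis:
    """Hypothetical axis that stores only floats internally."""

    def __init__(self, to_float, from_float):
        self._to_float = to_float
        self._from_float = from_float
        self._lim = (0.0, 1.0)

    def set_lim(self, lo, hi):
        # Convert as soon as the data enters.
        self._lim = (self._to_float(lo), self._to_float(hi))

    def get_lim(self):
        # Convert back as late as possible, only when the user asks.
        return tuple(self._from_float(v) for v in self._lim)


axis = FloatAxis(to_float, from_float)
axis.set_lim(Inches(0.0), Inches(8.0))
lo, hi = axis.get_lim()  # Meters objects; hi is roughly 0.2032 m
```

Note that the value handed back is unitized but not necessarily in the unit the user supplied, which is the convention described in the original proposal.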

Also worth noting that any major change like this will go into Matplotlib 3.0 at the earliest, so it will be Python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don’t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users.   I actually don’t think it’ll be a huge change, probably just some clean up and better documentation.

FWIW, I’ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase.  Having experience from people who try to use them everyday will be absolutely key.

Cheers,   Jody

> On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:
>
> We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us.  In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.
>
> I think it's important to talk about these changes in concrete terms.  I understand the words you're using,  but I'm not really clear on what the real proposed changes are.  For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use.  Is that what you mean in the 2nd paragraph about ticks and labels?  Or is that changing?
>
> The current unit api is pretty simple and in units.ConversionInterface.  Are any of these changes going to change the conversion API?  (note - I'm not against changing it - I'm just not sure if there are any changes or not).
>
> Another thing to consider:  many of the examples people use are scripts which make a plot and stop.  But there are other use cases which are more complicated and stress the system in different ways.  We write several GUI applications (in PyQt) that use MPL for plotting.  In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc.  So having a good object oriented API for modifying things after construction is important for this to work.  So when units are involved, it can't be a "convert once at construction" and never touch units again.   We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.
>
> So in addition to the ConversionInterface API, I think there are other items that would be useful to have explicitly spelled out.  Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't.  It would be nice if there was a clear policy on this.  Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system.  Obviously I would love to have every method accept and return unitized data :-).
>
> I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse.  To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't.  I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables, to make it clearer what kind of data is acceptable in each.
>
> Ted
> ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.
>
> ________________________________________
> From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org> on behalf of Jody Klymak <jklymak at uvic.ca>
> Sent: Saturday, February 3, 2018 9:25 PM
> To: matplotlib development list
> Subject: [Matplotlib-devel] Units discussion...
>
> Hi all,
>
> To carry on the gitter discussion about unit handling, hopefully leading to more stringent documentation and implementation…
>
> In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way.  This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn’t change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters.  It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied; e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).
>
> User “unit” control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters.  Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
>
> Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter.  If the converter wants to pass bare floats then it can do so.  If it wants to accept other data types then it can do so.  It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
>
> What's missing?  I don’t think this is wildly different from what we have, but maybe a bit more clear.
>
> Cheers,   Jody
>
>
>
>
> _______________________________________________
> Matplotlib-devel mailing list
> Matplotlib-devel at python.org
> https://mail.python.org/mailman/listinfo/matplotlib-devel



