[Matplotlib-devel] Units discussion...

David Stansby dstansby at gmail.com
Thu Feb 8 17:11:53 EST 2018


I agree with everything you've said there.

I plan to have a go at implementing what I proposed in the next few
weeks - on the surface it seems to me like it will simplify things a lot,
but I guess I'll see as I go how hard it actually is! If it works it will
be a bit of an upheaval for 3rd parties who use units at the moment, but
should be worth it in the long run.

David

On 8 February 2018 at 21:47, Drain, Theodore R (392P) <
theodore.r.drain at jpl.nasa.gov> wrote:

> FYI for anyone interested - we already submitted (around the time of the
> first unit code submission in 2009) a mock-up of our unit and time classes,
> with converters and tickers, which is located in
> matplotlib/testing/jpl_units/.  It doesn't appear to be used in any tests
> anymore, but it's there if anyone wants to look at it, and it was used in
> the original unit API testing.
>
> I don't think everyone is in as much disagreement as it appears.  The way
> MPL works right now, it's easy for devs who aren't familiar with units to
> write code that works and appears correct, but fails for some cases like
> units.  And they won't know that until a user runs into that case.  So we
> should work to improve this situation.  The solution will most likely be
> some combination of code changes, clearer dev docs, and more and better
> test cases.
>
> I think a big problem is that the plots have no defined internal data
> representation.  Since Python is dynamically typed, it's easy to write code that
> works for one test case but fails others you might not think of.  It also
> means that inside a plot method, a developer really doesn't know what
> functions they're allowed to use.  Is the data variable a list?  Is it
> unitized?  Is it integers? floats? a numpy array?
>
> That's why I'd propose that for any numeric data type, the unit converter
> must return a numpy array of floats.  Then the plot code (and dev docs) can
> be very explicit about what functionality can be used and you can be sure
> that after the external->internal converter is run, you know what the data
> type is.  If done properly, I think this actually makes the existing code
> simpler.  We can have a sequence of converters that try to run on the input
> which would include "standard" types like lists and integers.  So whether
> a user passes in a Python list of integers or floats, a numpy array, or
> their own type, the developer knows that once the converter at the top of
> the method runs, they have a numpy array of floats to work with, and there
> is no guessing about which functions will or won't work.
>
> If this works, then it can be "the one way" to write a plot function for
> numeric data and every method can have the conversion as the first step.
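
A minimal sketch of the "convert first" idea above, under stated
assumptions: the Length class, the converter functions, and to_internal()
are hypothetical names made up for illustration and are not part of
matplotlib's units API.  Each converter either returns a float64 numpy
array or declines by returning None, so plotting code can rely on one
internal type.

```python
import numpy as np


class Length:
    """Hypothetical unitized external type: magnitudes plus a unit name."""

    def __init__(self, values, unit):
        self.values = list(values)
        self.unit = unit


# Metres per unit; an illustrative table, not taken from any library.
_TO_METRES = {"m": 1.0, "km": 1000.0}


def convert_length(data):
    """Handle Length objects; return None for anything else."""
    if isinstance(data, Length):
        scale = _TO_METRES[data.unit]
        return np.asarray(data.values, dtype=np.float64) * scale
    return None


def convert_plain(data):
    """Handle "standard" inputs: scalars, lists, tuples, numpy arrays."""
    try:
        return np.atleast_1d(np.asarray(data, dtype=np.float64))
    except (TypeError, ValueError):
        return None


# Converters are tried in order; the first one that accepts the input wins.
CONVERTERS = [convert_length, convert_plain]


def to_internal(data):
    """External -> internal conversion: always a float64 numpy array."""
    for converter in CONVERTERS:
        result = converter(data)
        if result is not None:
            return result
    raise TypeError(f"no converter accepts {type(data).__name__}")


def data_limits(x):
    """Stand-in for a plot method: conversion is the first step."""
    x = to_internal(x)
    return x.min(), x.max()
```

With this shape, data_limits([3, 1, 2]) and
data_limits(Length([1, 2], "km")) take the same code path once
to_internal() has run.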
>
> Ted
> ps: I think this dev list is the best forum for this discussion unless you
> can arrange a conference where we can all meet up.  I find gitter is too
> hard to follow unless you're watching it in real time.  A forum thread
> would be better IMO, but we don't have that.
>
> ________________________________________
> From: Nathan Goldbaum <nathan12343 at gmail.com>
> Sent: Thursday, February 8, 2018 12:13 PM
> To: Drain, Theodore R (392P)
> Cc: matplotlib development list
> Subject: Re: [Matplotlib-devel] Units discussion...
>
> On Thu, Feb 8, 2018 at 1:08 PM, Drain, Theodore R (392P) <
> theodore.r.drain at jpl.nasa.gov> wrote:
> Does numpy subclassing really matter?  If the docs say the unit converter
> must convert from the external type to the internal type, then as long as
> the converter does that, it doesn't matter what the external type is or
> what it inherits from right?  The point is that the converter class is the
> only class manipulating the external data objects - MPL shouldn't care what
> they are or what they inherit from.
>
> To make my statement more concrete, here's a matplotlib pull request that
> fixed a bug that only triggered for astropy and yt but not for pint:
>
> https://github.com/matplotlib/matplotlib/pull/6622
>
> In this case it was an issue because of difference in how NumPy's masked
> array deals with ndarray subclasses versus array wrapper classes.
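
To illustrate the subclass-versus-wrapper distinction mentioned above
(this is not a reproduction of the bug fixed in that pull request), here
is a small sketch.  SubclassQuantity and WrapperQuantity are hypothetical
classes written for this example; they are not the astropy, yt, or pint
types.

```python
import numpy as np


class SubclassQuantity(np.ndarray):
    """Hypothetical unit type that IS an ndarray (subclass approach)."""

    def __new__(cls, values, unit):
        obj = np.asarray(values, dtype=float).view(cls)
        obj.unit = unit
        return obj

    def __array_finalize__(self, obj):
        # Propagate the unit to views and slices of this array.
        self.unit = getattr(obj, "unit", None)


class WrapperQuantity:
    """Hypothetical unit type that HOLDS an ndarray (wrapper approach)."""

    def __init__(self, values, unit):
        self.magnitude = np.asarray(values, dtype=float)
        self.unit = unit

    def __array__(self, dtype=None, copy=None):
        # Lets numpy convert the wrapper to a plain array on request.
        return np.asarray(self.magnitude, dtype=dtype)


sub = SubclassQuantity([1.0, 2.0], "km")
wrapped = WrapperQuantity([1.0, 2.0], "km")

# The subclass passes isinstance checks; the wrapper does not.
print(isinstance(sub, np.ndarray), isinstance(wrapped, np.ndarray))

# asanyarray passes the subclass through untouched, while asarray
# returns a plain ndarray.
print(np.asanyarray(sub) is sub, type(np.asarray(sub)).__name__)

# The wrapper becomes a plain ndarray either way.
print(type(np.asanyarray(wrapped)).__name__)
```

Code that branches on isinstance(x, np.ndarray) or uses asanyarray can
therefore treat the two kinds of unit type differently.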
>
> I think one issue is that data types are malleable in the API right now.
> Lists, tuples, numpy, ints, floats, etc are all possible inputs in
> many/most cases.  IMO, the unit API should not be malleable at all.  The
> unit converter API should say that the return type of external->internal
> conversion is always a specific value type (e.g. list of float, numpy
> float64 array).
>
> Jody: IMO, your example should plot the data in inches in the first plot
> call, then convert the second input to inches and plot that.  The plot
> call supports the xunits keyword argument, which tells the converter what
> floating point unit conversion to apply.  If that keyword is not specified,
> then it defaults to the type of the input.  The case that needs to be
> clearer is if I do this:
>
> ax.plot( x1, y1, xunits="km" )
> ax.plot( x2, y2, xunits="miles" )
>
> IMO, the floats are either km or miles, not both.  So either the first
> call locks the converter to km and the second xunits is ignored, or the
> second input overrides the first and requires that the first artists go
> back through a conversion to miles.  Either is a reasonable choice of
> behavior (but the first is much easier to implement).
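
A rough sketch of the first option (the first call fixes the axis unit
and later xunits requests are ignored).  AxisUnits and its convert()
method are hypothetical and only model the policy; they are not
matplotlib code.

```python
# Kilometres per unit (1 mile is exactly 1.609344 km).
_KM_PER_UNIT = {"km": 1.0, "miles": 1.609344}


class AxisUnits:
    """Hypothetical per-axis state: the first unit requested wins."""

    def __init__(self):
        self.unit = None

    def convert(self, values, data_unit, xunits=None):
        """Return values as floats in the unit this axis is locked to."""
        if self.unit is None:
            # First call: lock to the requested unit, or else to the
            # unit of the incoming data.
            self.unit = xunits if xunits is not None else data_unit
        # Later calls: any xunits argument is ignored.
        factor = _KM_PER_UNIT[data_unit] / _KM_PER_UNIT[self.unit]
        return [float(v) * factor for v in values]


axis = AxisUnits()
first = axis.convert([1.0, 2.0], "km", xunits="km")
second = axis.convert([1.0], "miles", xunits="miles")  # still plotted in km
```

Here the second call's data is converted to km even though miles was
requested.  The second option would instead need to revisit every value
already converted on that axis.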