On Tue, Nov 7, 2017 at 1:20 PM, Chris Barker <chris.barker@noaa.gov> wrote:
On Mon, Nov 6, 2017 at 4:28 PM, Stephan Hoyer <shoyer@gmail.com> wrote:

What's needed, though, is not just a single ABC. Some thought and design need to go into segmenting the ndarray API to declare certain behaviors, just as was done for collections:


You don't just have a single ABC declaring a collection, but rather "I am a mapping" or "I am a mutable sequence". It's more of a pain for developers to properly specify things, but actually giving the code some thought is not a bad thing.

I agree, it would be nice to nail down a hierarchy of duck-arrays, if possible. Although, there are quite a few options, so I don't know how doable this is.

Exactly -- there are an exponential number of options...
 
Well, to get the ball rolling a bit, the key thing that matplotlib needs to know is whether `shape`, `reshape`, `size`, broadcasting, and logical indexing are respected. So, I see three possible ABCs here: one for attribute access (things like `shape` and `size`) and another for shape manipulations (broadcasting and reshape, and assignment to `.shape`).

I think we're going to get into a string of ABCs:

ArrayLikeForMPL_ABC

etc, etc.....

Only if you try to provide perfectly-sized options for every occasion--but that's not how we do things in (sane) software development. You provide a few options that optimize the common use cases, and you don't try to cover everything--let client code figure out the right combination from the primitives you provide. One can always just inherit/register *all* the ABCs if need be. The status quo is that we have 1 interface that covers everything from multiple dims and shape to math and broadcasting to the entire __array__ interface. Even breaking that up into the 3 "obvious" chunks would be a massive improvement. 

I just don't want to see this effort bog down into "this is so hard". Getting it perfect is hard; getting it useful is much easier.

It's important to note that we can always break up/combine existing ABCs into other ones later.
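As a rough sketch of what the three "obvious" chunks could look like, here is a minimal example using only the standard library. The class names (`ShapedArray`, `ReshapableArray`, `IndexableArray`, `NetCDFLikeVariable`) are made up for illustration and are not part of NumPy or any proposal; a real version would presumably also register `numpy.ndarray` with all three.

```python
from abc import ABC


class ShapedArray(ABC):
    """Hypothetical ABC: exposes `shape`, `size`, and `ndim`."""


class ReshapableArray(ShapedArray):
    """Hypothetical ABC: also supports `reshape` and broadcasting."""


class IndexableArray(ABC):
    """Hypothetical ABC: supports ndarray-style (including logical) indexing."""


class NetCDFLikeVariable:
    """Toy duck array that knows its shape but makes no indexing promises."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    @property
    def ndim(self):
        return len(self.shape)

    @property
    def size(self):
        n = 1
        for dim in self.shape:
            n *= dim
        return n


# Opt in to only the behaviors the class actually supports.
ShapedArray.register(NetCDFLikeVariable)

var = NetCDFLikeVariable((3, 4))
print(isinstance(var, ShapedArray))      # True
print(isinstance(var, ReshapableArray))  # False
print(isinstance(var, IndexableArray))   # False
```

Client code such as a plotting library could then test for just the piece it relies on, and a duck type that supports everything can simply register with all of the ABCs.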
 
And then a third abc for indexing support, although, I am not sure how that could get implemented...

This is the really tricky one -- all ABCs really check is the existence of methods -- making sure they behave the same way is up to the developer of the ducktype.

which is OK, but will require discipline.
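To make that limitation concrete, here is a small sketch. `Reshapable` and `BadDuck` are hypothetical names invented for this example; the ABC uses the standard `__subclasshook__` mechanism to accept any class that defines a `reshape` method, and it has no way of noticing that the method does the wrong thing.

```python
from abc import ABC


class Reshapable(ABC):
    """Hypothetical ABC that accepts any class defining `reshape`."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Reshapable:
            # Only the *presence* of the method is checked, never its behavior.
            if any("reshape" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented


class BadDuck:
    def __init__(self, shape):
        self.shape = tuple(shape)

    def reshape(self, *shape):
        # Wrong semantics: ignores the request and returns itself unchanged.
        return self


duck = BadDuck((2, 3))
print(isinstance(duck, Reshapable))  # True
print(duck.reshape(3, 2).shape)      # (2, 3)
```

The `isinstance` check passes even though `reshape` never changes the shape, which is exactly the gap that documentation and discipline have to cover.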

But indexing, specifically fancy indexing, is another matter -- I'm not sure if there is even a way with an ABC to check for what types of indexing are supported, but we'd still have the problem of whether the semantics are the same!

For example, I work with netcdf variable objects, which are partly duck-typed as ndarrays, but I think n-dimensional fancy indexing works differently... how in the world do you detect that with an ABC???

Even documenting expected behavior as part of these ABCs would go a long way towards helping standardize behavior.

Another idea would be to put together a conformance test suite as part of this effort, in lieu of some kind of run-time checking of behavior (which would be terrible). That would help developers of other "ducks" check that they're doing the right things. I'd imagine the existing NumPy test suite would largely cover this.
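A conformance check along those lines might look something like the sketch below. Everything here is an assumption for illustration: `check_shape_conformance`, `GoodDuck`, and `BadDuck` are invented names, and the handful of checks stands in for what would really be a much larger suite, probably derived from the existing NumPy tests.

```python
import math


def check_shape_conformance(make_duck):
    """Run behavioral checks on a duck array built by `make_duck(shape)`.

    Returns a list of failure messages; an empty list means every check passed.
    """
    failures = []
    shape = (2, 3)
    duck = make_duck(shape)

    if tuple(duck.shape) != shape:
        failures.append("shape does not match the requested shape")
    if duck.size != math.prod(shape):
        failures.append("size is not the product of shape")

    flat = duck.reshape((6,))
    if tuple(flat.shape) != (6,):
        failures.append("reshape did not produce the requested shape")
    if tuple(duck.shape) != shape:
        failures.append("reshape modified the original object")
    return failures


class GoodDuck:
    """Toy duck array whose reshape returns a new object with the new shape."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    @property
    def size(self):
        return math.prod(self.shape)

    def reshape(self, shape):
        return GoodDuck(shape)


class BadDuck(GoodDuck):
    def reshape(self, shape):
        return self  # has the method, but ignores the requested shape


print(check_shape_conformance(GoodDuck))  # []
print(check_shape_conformance(BadDuck))   # ['reshape did not produce the requested shape']
```

Developers of other "ducks" would run such checks in their own test suites rather than paying for them at run time.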

Ryan

--
Ryan May