[Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray
Chris Barker
chris.barker at noaa.gov
Sat Mar 10 17:39:40 EST 2018
On Sat, Mar 10, 2018 at 1:27 PM, Matthew Rocklin <mrocklin at gmail.com> wrote:
> I'm very glad to see this discussion.
>
me too, but....
> I think that coming up with a single definition of array-like may be
> difficult, and that we might end up wanting to embrace duck typing instead.
>
exactly -- I think there is a clear line between "uses the numpy memory
layout" and the Python API. But the Python API is pretty darn big, and many
"array_ish" classes implement only part of it, and may even implement some
parts a bit differently. So it's really hard to have "one" definition, except
"Python API exactly like an ndarray" -- and I'm wondering how useful that is.
> It seems to me that different array-like classes will implement different
> mixtures of features. It may be difficult to pin down a single definition
> that includes anything except for the most basic attributes (shape and
> dtype?).
>
or a minimum set -- but again, how useful??
> Storage objects like h5py (support getitem in a numpy-like way)
>
Exactly -- I don't know about h5py, but netCDF4 variables support a
useful subset of ndarray, but do "fancy indexing" differently -- so are they
ndarray_ish? -- sorry to coin yet another term :-)
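To make the "fancy indexing" difference concrete, here is a small sketch using only numpy. It assumes (from my reading of the netCDF4 docs, not verified here) that netCDF4 variables treat a pair of index lists as orthogonal, i.e. as an outer selection, whereas ndarray pairs them element-wise. np.ix_ stands in for the orthogonal behaviour:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
rows = [0, 2]
cols = [1, 3]

# NumPy "fancy" indexing pairs the index lists element-wise:
# it picks a[0, 1] and a[2, 3].
pointwise = a[rows, cols]

# Orthogonal (outer) indexing selects the full rows x cols block.
# On a plain ndarray, np.ix_ builds that selection explicitly.
outer = a[np.ix_(rows, cols)]

print(pointwise.shape)  # (2,)
print(outer.shape)      # (2, 2)
```

So the same index expression can return differently shaped results depending on which convention the object follows, even though both objects "support __getitem__".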
> I can imagine authors of both groups saying that they should qualify as
> array-like because downstream projects that consume them should not convert
> them to numpy arrays in important contexts.
>
indeed. My solution so far is to define my own duck-typing helper, "asarraylike",
that checks for the actual attributes I need:
https://github.com/NOAA-ORR-ERD/gridded/blob/master/gridded/utilities.py
which has:
import numpy as np

# attributes an object must have to be treated as array-like
must_have = ['dtype', 'shape', 'ndim', '__len__', '__getitem__',
             '__getattribute__']


def isarraylike(obj):
    """
    tests if obj acts enough like an array to be used in gridded.

    This should catch netCDF4 variables and numpy arrays, at least, etc.

    Note: these won't check if the attributes required actually work right.
    """
    for attr in must_have:
        if not hasattr(obj, attr):
            return False
    return True


def asarraylike(obj):
    """
    If it satisfies the requirements of pyugrid the object is returned as is.
    If not, then numpy's array() will be called on it.

    :param obj: The object to check if it's like an array
    """
    return obj if isarraylike(obj) else np.array(obj)
It's possible that we could come up with semi-standard "groupings" of
attributes to produce "levels" of compatibility, or maybe not levels, but
independent groupings, so you could specify which groupings you need in this
instance.
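A rough sketch of what independent groupings might look like. The grouping names, their contents, the supports() helper, and the StorageOnly class are all made up for illustration; none of this is an existing numpy API. Like isarraylike above, it only checks that the attributes exist, not that they behave correctly:

```python
import numpy as np

# Hypothetical groupings; names and contents are illustrative only.
GROUPINGS = {
    "metadata": ("shape", "dtype", "ndim"),
    "indexing": ("__getitem__", "__len__"),
    "arithmetic": ("__add__", "__mul__", "__matmul__"),
}


def supports(obj, *groups):
    """Return True if obj has every attribute in each named grouping."""
    return all(hasattr(obj, attr) for g in groups for attr in GROUPINGS[g])


class StorageOnly:
    """Toy stand-in for an on-disk variable: metadata and indexing, no math."""

    def __init__(self, data):
        self._data = np.asarray(data)
        self.shape = self._data.shape
        self.dtype = self._data.dtype
        self.ndim = self._data.ndim

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        return self._data[key]


arr = np.zeros((2, 3))
store = StorageOnly([[1, 2], [3, 4]])

print(supports(arr, "metadata", "indexing", "arithmetic"))  # True
print(supports(store, "metadata", "indexing"))              # True
print(supports(store, "arithmetic"))                        # False
```

A consumer would then ask only for the groupings it needs in a given spot, e.g. supports(obj, "metadata", "indexing") before slicing lazily.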
> The name "duck arrays" that we sometimes use doesn't necessarily mean
> "quack like an ndarray" but might actually mean a number of different
> things in different contexts. Making a single class or predicate for duck
> arrays may not be as effective as we want. Instead, it might be that we
> need a number of different protocols like `__array_mat_vec__` or `__array_slice__`
> that downstream projects can check instead. I can imagine cases where I
> want to check only "can I use this thing to multiply against arrays" or
> "can I get numpy arrays out of this thing with numpy slicing" rather than
> "is this thing array-like" because I may genuinely not care about most of
> the functionality in a blessed definition of "array-like".
>
exactly.
but maybe we won't know until we try.
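One way to try the narrower "can I use this thing to multiply against arrays" check with what Python already offers is a runtime-checkable typing.Protocol. This is only a sketch: SupportsMatmul and Identity are made-up names, and the check uses the existing __matmul__ method rather than the `__array_mat_vec__`-style protocols floated above, which do not exist:

```python
from typing import Protocol, runtime_checkable

import numpy as np


@runtime_checkable
class SupportsMatmul(Protocol):
    """Hypothetical capability check: the object defines the @ operator."""

    def __matmul__(self, other): ...


class Identity:
    """Toy linear operator that is not an ndarray but supports @."""

    def __matmul__(self, other):
        return np.asarray(other)


print(isinstance(np.eye(2), SupportsMatmul))   # True
print(isinstance(Identity(), SupportsMatmul))  # True
print(isinstance([1, 2], SupportsMatmul))      # False
```

The isinstance check only verifies that the method is present, so it has the same limitation as the hasattr approach: it says nothing about whether the result is numpy-compatible.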
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov