[Numpy-discussion] Accepting NEP 42 — New and extensible DTypes
Sebastian Berg
sebastian at sipsolutions.net
Tue Oct 27 13:06:13 EDT 2020
Hi all,
On Thu, 2020-10-08 at 07:51 -0500, Sebastian Berg wrote:
> Hi all,
>
> after another thorough revision of NEP 42 (many thanks to Ben!), I
> propose accepting the NEP, with the note that details are expected
> to change.
>
> I am always happy to clarify and review the document based on feedback,
> but I feel the important technical points should be very clear and
> settled.
> Exposing all of the proposed API may need additional detailed API
> discussion. My focus is still mostly on the big-picture design choices
> that the NEP makes, which need to settle so that the implementation
> internal to NumPy can move forward, although I am happy to discuss the
> details!
>
> The title of the NEP is:
>
> NEP 42 — New and extensible DTypes
>
That was a while ago, and a draft for NEP 43 (UFunc redesign) is
now available at:
https://numpy.org/neps/nep-0043-extensible-ufuncs.html
I would appreciate any feedback and am happy to go into more details
where necessary. Do we have a consensus about the general big picture
API design or are there any concerns?
These documents outline (most importantly):
1. How DTypes should be created (NEP 42)
2. How Casting will be implemented (NEP 42)
3. How UFuncs will be redesigned (NEP 43):
   * This changes the calling convention
   * It also unifies casting largely with ufuncs
4. How ufunc promotion will be handled in the future (NEP 43):
   * This is what happens when you add mixed types, for
     example float64 + int32 casts int32 to float64 and
     uses the float64 + float64 implementation.
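
As a small illustration of the promotion in point 4, here is a sketch using today's public NumPy functions. It shows the current user-visible behaviour only, not the NEP 43 machinery itself:

```python
import numpy as np

# Promotion picks the common dtype for the mixed inputs.
common = np.result_type(np.float64, np.int32)

# The addition casts the int32 operand and then uses the
# float64 + float64 implementation.
a = np.arange(3, dtype=np.float64)
b = np.arange(3, dtype=np.int32)
result = a + b

print(common, result.dtype)
```
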
Point 1 is finished to the extent currently necessary.
Right now I am finishing up Casting (point 2), and I expect at least
part of it to move forward very soon.
This does have a big overlap with UFuncs (point 3), though. So if you
are interested in that, it is a good time to dive in, even if many
details can still be changed easily for a while!
Cheers,
Sebastian
> And available at:
>
> https://numpy.org/neps/nep-0042-new-dtypes.html
>
> While enabling new user-defined DTypes is the main goal, the main work
> is the internal restructure of NumPy's own DTypes necessary to allow
> that.
>
> I have pasted the "Abstract" and "Motivation and scope" sections below,
> which give a good overview of the issues we are trying to address.
> They are followed by the "Usage and impact" section, which gives a
> big-picture overview of the design.
> I will refer to the full NEP for more detailed technical decisions and
> explanations.
>
> Cheers,
>
> Sebastian
>
>
> PS: In some places NEP 42 references NEP 43, for which I hope to merge
> the draft soon. The current status is here:
>
> https://github.com/numpy/numpy/pull/16723
>
> However, this should mainly be of interest to those wishing to go into
> more technical details.
>
>
>
>
> ******************************************************************************
> Abstract
> ******************************************************************************
>
> NumPy's dtype architecture is monolithic -- each dtype is an instance of a
> single class. There's no principled way to expand it for new dtypes, and the
> code is difficult to read and maintain.
>
> As :ref:`NEP 41 <NEP41>` explains, we are proposing a new architecture that is
> modular and open to user additions. dtypes will derive from a new ``DType``
> class serving as the extension point for new types. ``np.dtype("float64")``
> will return an instance of a ``Float64`` class, a subclass of root class
> ``np.dtype``.
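
A minimal check of the relationship described above, as a sketch. It should hold both before and after the restructure, because every dtype remains an instance of ``np.dtype`` either way; the concrete class that gets printed depends on the NumPy version installed:

```python
import numpy as np

dt = np.dtype("float64")
dtype_class = type(dt)

# Under the proposed architecture this is a dedicated subclass of np.dtype;
# on older releases it is np.dtype itself. Both satisfy the checks below.
is_instance = isinstance(dt, np.dtype)
is_subclass = issubclass(dtype_class, np.dtype)

print(dtype_class, is_instance, is_subclass)
```
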
>
> This NEP is one of two that lay out the design and API of this new
> architecture. This NEP addresses dtype implementation; NEP 43 addresses
> universal functions.
>
> .. note::
>
>     Details of the private and external APIs may change to reflect user
>     comments and implementation constraints. The underlying principles and
>     choices should not change significantly.
>
>
> ******************************************************************************
> Motivation and scope
> ******************************************************************************
>
> Our goal is to allow user code to create fully featured dtypes for a broad
> variety of uses, from physical units (such as meters) to domain-specific
> representations of geometric objects. :ref:`NEP 41 <NEP41>` describes a number
> of these new dtypes and their benefits.
>
> Any design supporting dtypes must consider:
>
> - How shape and dtype are determined when an array is created
> - How array elements are stored and accessed
> - The rules for casting dtypes to other dtypes
>
> In addition:
>
> - We want dtypes to comprise a class hierarchy open to new types and to
>   subhierarchies, as motivated in :ref:`NEP 41 <NEP41>`.
>
> And to provide this,
>
> - We need to define a user API.
>
> All these are the subjects of this NEP.
>
> - The class hierarchy, its relation to the Python scalar types, and its
>   important attributes are described in `nep42_DType class`_.
>
> - The functionality that will support dtype casting is described in
>   `Casting`_.
>
> - The implementation of item access and storage, and the way shape and dtype
>   are determined when creating an array, are described in
>   :ref:`nep42_array_coercion`.
>
> - The functionality for users to define their own DTypes is described in
>   `Public C-API`_.
>
> The API here and in NEP 43 is entirely on the C side. A Python-side version
> will be proposed in a future NEP. A future Python API is expected to be
> similar, but provide a more convenient API to reuse the functionality of
> existing DTypes. It could also provide shorthands to create structured DTypes
> similar to Python's
> `dataclasses <https://docs.python.org/3.8/library/dataclasses.html>`_.
>
>
> ******************************************************************************
> Usage and impact
> ******************************************************************************
>
> We believe the few structures in this section are sufficient to consolidate
> NumPy's present functionality and also to support complex user-defined
> DTypes.
>
> The rest of the NEP fills in details and provides support for the claim.
>
> Again, though Python is used for illustration, the implementation is a
> C API only; a future NEP will tackle the Python API.
>
> After implementing this NEP, creating a DType will be possible by implementing
> the following outlined DType base class,
> that is further described in `nep42_DType class`_:
>
>     class DType(np.dtype):
>         type : type        # Python scalar type
>         parametric : bool  # (may be indicated by superclass)
>
>         @property
>         def canonical(self) -> bool:
>             raise NotImplementedError
>
>         def ensure_canonical(self : DType) -> DType:
>             raise NotImplementedError
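
To make ``canonical`` and ``ensure_canonical`` more concrete, here is a rough sketch. It assumes that, for NumPy's numeric dtypes, "canonical" corresponds to native byte order; the excerpt above does not spell this out, so treat it as my reading rather than the definition. The ``ensure_canonical`` helper below is hypothetical and is not a NumPy function:

```python
import numpy as np

native = np.dtype("float64")
# "S" swaps the byte order, so this instance is non-native on any machine.
swapped = native.newbyteorder("S")


def ensure_canonical(dtype: np.dtype) -> np.dtype:
    """Hypothetical helper: return `dtype` itself if it is already in
    native byte order, otherwise an equivalent native-order dtype."""
    return dtype if dtype.isnative else dtype.newbyteorder("=")


restored = ensure_canonical(swapped)
print(swapped.isnative, restored.isnative, restored == native)
```
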
>
> For casting, a large part of the functionality is provided by the "methods"
> stored in ``_castingimpl``:
>
>         @classmethod
>         def common_dtype(cls : DTypeMeta, other : DTypeMeta) -> DTypeMeta:
>             raise NotImplementedError
>
>         def common_instance(self : DType, other : DType) -> DType:
>             raise NotImplementedError
>
>         # A mapping of "methods" each detailing how to cast to another DType
>         # (further specified at the end of the section)
>         _castingimpl = {}
>
> For array-coercion, also part of casting:
>
>         def __dtype_setitem__(self, item_pointer, value):
>             raise NotImplementedError
>
>         def __dtype_getitem__(self, item_pointer, base_obj) -> object:
>             raise NotImplementedError
>
>         @classmethod
>         def __discover_descr_from_pyobject__(cls, obj : object) -> DType:
>             raise NotImplementedError
>
>         # initially private:
>         @classmethod
>         def _known_scalar_type(cls, obj : object) -> bool:
>             raise NotImplementedError
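
The user-visible effect of this kind of dtype discovery can be seen with the existing ``np.array`` call. This is only a sketch of current behaviour; how the proposed hooks would be invoked internally is not shown here:

```python
import numpy as np

# The dtype is discovered from the Python scalars found in the input.
ints_and_floats = np.array([1, 2.5])
strings = np.array(["a", "abc"])

# For a parametric dtype such as the unicode string, the discovered
# instance also carries a parameter: the length of the longest item.
print(ints_and_floats.dtype)
print(strings.dtype)
```
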
>
>
> The other element of the casting implementation is the ``CastingImpl``:
>
>     casting = Union["safe", "same_kind", "unsafe"]
>
>     class CastingImpl:
>         # Object describing and performing the cast
>         casting : casting
>
>         def resolve_descriptors(self, input : Tuple[DType]) -> (casting, Tuple[DType]):
>             raise NotImplementedError
>
>         # initially private:
>         def _get_loop(...) -> lowlevel_C_loop:
>             raise NotImplementedError
>
> which describes the casting from one DType to another. In
> NEP 43 this ``CastingImpl`` object is used unchanged to
> support universal functions.
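
The three casting levels named above already exist in the public ``np.can_cast`` function, so their meaning can be illustrated with current NumPy. This sketch only queries the existing rules and does not use ``CastingImpl``:

```python
import numpy as np

f8, f4, i4 = np.dtype("float64"), np.dtype("float32"), np.dtype("int32")
levels = ("safe", "same_kind", "unsafe")

# For each cast, record whether it is allowed at each casting level.
table = {
    "int32 -> float64": [np.can_cast(i4, f8, casting=c) for c in levels],
    "float64 -> float32": [np.can_cast(f8, f4, casting=c) for c in levels],
    "float64 -> int32": [np.can_cast(f8, i4, casting=c) for c in levels],
}

for name, row in table.items():
    print(name, dict(zip(levels, row)))
```

Widening is allowed at every level, narrowing within the float kind needs at least "same_kind", and float to integer is only allowed as "unsafe".
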
>