[Numpy-discussion] Accepting NEP 42 — New and extensible DTypes

Sebastian Berg sebastian at sipsolutions.net
Tue Oct 27 13:06:13 EDT 2020


Hi all,


On Thu, 2020-10-08 at 07:51 -0500, Sebastian Berg wrote:
> Hi all,
> 
> after another thorough revision of NEP 42 (much thanks to Ben!), I
> propose accepting the NEP, with the note that details are expected
> to change.
> 
> I am always happy to clarify and review the document based on
> feedback,
> but I feel the important technical points should be very clear and
> settled.
> Exposing all of the proposed API may need additional detailed API
> discussion. My focus is still on the big-picture design choices the
> NEP makes, which need to be settled to move forward with the
> implementation internal to NumPy, although I am happy to discuss the
> details!
> 
> The title of the NEP is:
> 
>      NEP 42 — New and extensible DTypes
> 

This was a while ago, and a draft for NEP 43 (UFunc redesign) is
now available at:

    https://numpy.org/neps/nep-0043-extensible-ufuncs.html

I would appreciate any feedback and am happy to go into more details
where necessary. Do we have a consensus about the general big picture
API design or are there any concerns?

These documents outline (most importantly):

1. How DTypes should be created (NEP 42)
2. How Casting will be implemented (NEP 42)
3. How UFuncs will be redesigned:  (NEP 43)
   * This changes the calling convention
   * It also unifies casting largely with ufuncs
4. How ufunc promotion will be handled in the future: (NEP 43)
   * This is what happens when you add mixed types, for
     example float64 + int32 casts int32 to float64 and
     uses the float64 + float64 implementation.
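
To make the promotion step in point 4 concrete, here is a minimal pure-Python sketch. The tables and the `add` helper are hypothetical and are not part of NumPy or of either NEP; they only mimic the lookup order described above (find a common dtype, cast, then use the matching implementation).

```python
# Toy model of ufunc promotion; all names here are illustrative only.

# Registered "loops": implementations keyed by the input dtype names.
LOOPS = {
    ("float64", "float64"): lambda a, b: float(a) + float(b),
    ("int32", "int32"): lambda a, b: int(a) + int(b),
}

# Hypothetical promotion table: the common dtype for a mixed pair.
COMMON = {
    frozenset({"float64", "int32"}): "float64",
}

# Hypothetical casts: (source dtype, target dtype) -> conversion function.
CASTS = {
    ("int32", "float64"): float,
}


def add(a, a_dtype, b, b_dtype):
    """Add two scalars, promoting mixed dtypes to a common one first."""
    if (a_dtype, b_dtype) not in LOOPS:
        common = COMMON[frozenset({a_dtype, b_dtype})]
        if a_dtype != common:
            a, a_dtype = CASTS[(a_dtype, common)](a), common
        if b_dtype != common:
            b, b_dtype = CASTS[(b_dtype, common)](b), common
    return LOOPS[(a_dtype, b_dtype)](a, b), a_dtype


# float64 + int32: the int32 operand is cast, then the
# float64 + float64 loop runs.
result, result_dtype = add(1.5, "float64", 2, "int32")
```

In this sketch promotion is only consulted when no exact loop is registered, which is one possible ordering and not necessarily the one NEP 43 settles on.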

Point 1 is finished to the extent currently necessary.

Right now I am finishing up Casting (point 2), and I expect it to move
forward very soon, at least in part.
This does have a big overlap with UFuncs (point 3), though. So if you
are interested in that, it is a good time to dive in, even if many
details can still be changed easily for a while!

Cheers,

Sebastian 


> And available at:
> 
>      https://numpy.org/neps/nep-0042-new-dtypes.html
> 
> While enabling new user-defined DTypes is the main goal, the main
> work
> is the internal restructure of NumPy's own DTypes necessary to allow
> that.
> 
> I have pasted the "Abstract" and "Motivation and scope" section
> below,
> which give a good overview of the issues we are trying to address.
> It is followed by the "Usage and impact" section which gives a big-
> picture overview of the design.
> I will refer to the full NEP for more detailed technical decisions
> and
> explanations.
> 
> Cheers,
> 
> Sebastian
> 
> 
> PS: In some places NEP 42 references NEP 43, for which I hope to
> merge
> the draft soon, the current status is here:
> 
>      https://github.com/numpy/numpy/pull/16723
> 
> However, this should mainly be of interest to those wishing to go
> into more technical details.
> 
> 
> 
> 
> ******************************************************************************
> Abstract
> ******************************************************************************
> 
> NumPy's dtype architecture is monolithic -- each dtype is an instance
> of  a
> single class. There's no principled way to expand it for new dtypes,
> and the
> code is difficult to read and maintain.
> 
> As :ref:`NEP 41 <NEP41>` explains, we are proposing a new
> architecture
> that is
> modular and open to user additions. dtypes will derive from a new
> ``DType``
> class serving as the extension point for new types.
> ``np.dtype("float64")``
> will return an instance of a ``Float64`` class, a subclass of root
> class
> ``np.dtype``.
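
As a rough illustration of the class layout the abstract describes, here is a pure-Python sketch. It assumes a metaclass named `DTypeMeta` (the name appears in the signatures further down); the `_REGISTRY` dict and the `dtype()` helper are hypothetical stand-ins for what `np.dtype("float64")` would do, not actual NumPy code.

```python
class DTypeMeta(type):
    """Assumed metaclass: every DType class is an instance of it."""


class DType(metaclass=DTypeMeta):
    """Stand-in for the root class (``np.dtype`` in the NEP)."""


class Float64(DType):
    """One concrete DType class; its instances describe float64 data."""

    itemsize = 8


# Hypothetical registry standing in for the string lookup.
_REGISTRY = {"float64": Float64}


def dtype(name):
    """Return an instance of the DType class registered under ``name``."""
    return _REGISTRY[name]()


descr = dtype("float64")
```

The point of the sketch is the two levels: `descr` is an instance of `Float64`, and `Float64` is itself an instance of `DTypeMeta`, so new DTypes are added by writing new classes.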
> 
> This NEP is one of two that lay out the design and API of this new
> architecture. This NEP addresses dtype implementation; NEP 43
> addresses
> universal functions.
> 
> .. note::
> 
>     Details of the private and external APIs may change to reflect user
>     comments and implementation constraints. The underlying principles
>     and choices should not change significantly.
> 
> 
> ******************************************************************************
> Motivation and scope
> ******************************************************************************
> 
> Our goal is to allow user code to create fully featured dtypes for a
> broad
> variety of uses, from physical units (such as meters) to domain-
> specific
> representations of geometric objects. :ref:`NEP 41 <NEP41>` describes
> a
> number
> of these new dtypes and their benefits.
> 
> Any design supporting dtypes must consider:
> 
> - How shape and dtype are determined when an array is created
> - How array elements are stored and accessed
> - The rules for casting dtypes to other dtypes
> 
> In addition:
> 
> - We want dtypes to comprise a class hierarchy open to new types and
> to
>   subhierarchies, as motivated in :ref:`NEP 41 <NEP41>`.
> 
> And to provide this,
> 
> - We need to define a user API.
> 
> All these are the subjects of this NEP.
> 
> - The class hierarchy, its relation to the Python scalar types, and its
>   important attributes are described in `nep42_DType class`_.
> 
> - The functionality that will support dtype casting is described in
>   `Casting`_.
> 
> - The implementation of item access and storage, and the way shape and
>   dtype are determined when creating an array, are described in
>   :ref:`nep42_array_coercion`.
> 
> - The functionality for users to define their own DTypes is described in
>   `Public C-API`_.
> 
> The API here and in NEP 43 is entirely on the C side. A Python-side
> version
> will be proposed in a future NEP. A future Python API is expected to
> be
> similar, but provide a more convenient API to reuse the functionality
> of
> existing DTypes. It could also provide shorthands to create structured
> DTypes similar to Python's
> `dataclasses <https://docs.python.org/3.8/library/dataclasses.html>`_.
> 
> 
> ******************************************************************************
> Usage and impact
> ******************************************************************************
> 
> We believe the few structures in this section are sufficient to
> consolidate
> NumPy's present functionality and also to support complex user-
> defined
> DTypes.
> 
> The rest of the NEP fills in details and provides support for the
> claim.
> 
> Again, though Python is used for illustration, the implementation is
> a
> C API only; a
> future NEP will tackle the Python API.
> 
> After implementing this NEP, creating a DType will be possible by
> implementing
> the following outlined DType base class,
> that is further described in `nep42_DType class`_:
> 
>     class DType(np.dtype):
>         type : type        # Python scalar type
>         parametric : bool  # (may be indicated by superclass)
> 
>         @property
>         def canonical(self) -> bool:
>             raise NotImplementedError
> 
>         def ensure_canonical(self : DType) -> DType:
>             raise NotImplementedError
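
A small sketch of how `canonical` and `ensure_canonical` might behave, in pure Python. It assumes that "canonical" means native byte order for this toy type; the `Int32` class and its `byteorder` parameter are invented for illustration and are not the NumPy implementation.

```python
import sys


class Int32:
    """Toy descriptor with a byte-order parameter (not the NumPy class)."""

    def __init__(self, byteorder=sys.byteorder):
        if byteorder not in ("little", "big"):
            raise ValueError("byteorder must be 'little' or 'big'")
        self.byteorder = byteorder

    @property
    def canonical(self):
        # Assumption: canonical means native byte order in this sketch.
        return self.byteorder == sys.byteorder

    def ensure_canonical(self):
        # Return self when already canonical, else a native-order instance.
        return self if self.canonical else Int32(sys.byteorder)


# Build a byte-swapped descriptor, then ask for its canonical form.
swapped = Int32("big" if sys.byteorder == "little" else "little")
native = swapped.ensure_canonical()
```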
> 
> For casting, a large part of the functionality is provided by the
> "methods" stored
> in ``_castingimpl``
> 
>         @classmethod
>         def common_dtype(cls : DTypeMeta, other : DTypeMeta) -> DTypeMeta:
>             raise NotImplementedError
> 
>         def common_instance(self : DType, other : DType) -> DType:
>             raise NotImplementedError
> 
>         # A mapping of "methods" each detailing how to cast to another DType
>         # (further specified at the end of the section)
>         _castingimpl = {}
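
To illustrate the split between `common_dtype` (works on classes) and `common_instance` (works on instances of one class), here is a hedged pure-Python sketch. The toy `String` DType, its `length` parameter, and the use of `NotImplemented` to signal "no common class" are assumptions made for this example.

```python
class DType:
    """Minimal stand-in for the DType base class outlined above."""

    @classmethod
    def common_dtype(cls, other):
        # In this sketch only identical classes share a common class.
        if cls is other:
            return cls
        return NotImplemented

    def common_instance(self, other):
        raise NotImplementedError


class String(DType):
    """Toy parametric DType: instances differ by their length."""

    def __init__(self, length):
        self.length = length

    def common_instance(self, other):
        # Assumption: the longer string can hold both inputs.
        return String(max(self.length, other.length))


common_cls = String.common_dtype(String)
common = String(5).common_instance(String(8))
```

The class-level step picks `String`; the instance-level step then decides the parameter, here a length of 8.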
> 
> For array-coercion, also part of casting:
> 
>         def __dtype_setitem__(self, item_pointer, value):
>             raise NotImplementedError
> 
>         def __dtype_getitem__(self, item_pointer, base_obj) -> object:
>             raise NotImplementedError
> 
>         @classmethod
>         def __discover_descr_from_pyobject__(cls, obj : object) -> DType:
>             raise NotImplementedError
> 
>         # initially private:
>         @classmethod
>         def _known_scalar_type(cls, obj : object) -> bool:
>             raise NotImplementedError
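
The item access hooks can be mimicked in pure Python with the standard `struct` module. In this sketch the `item_pointer` is modelled as a `(buffer, offset)` tuple, which is an assumption made for illustration; the proposal itself is a C API working on raw pointers.

```python
import struct


class Float64:
    """Toy DType storing Python floats as 8 native-order bytes."""

    type = float
    itemsize = 8

    def __dtype_setitem__(self, item_pointer, value):
        # item_pointer is modelled as (buffer, byte offset) here.
        buffer, offset = item_pointer
        struct.pack_into("=d", buffer, offset, float(value))

    def __dtype_getitem__(self, item_pointer, base_obj=None):
        buffer, offset = item_pointer
        return struct.unpack_from("=d", buffer, offset)[0]

    @classmethod
    def __discover_descr_from_pyobject__(cls, obj):
        # Only the matching Python scalar type is understood in this toy.
        if not isinstance(obj, cls.type):
            raise TypeError(f"cannot describe {type(obj).__name__}")
        return cls()


# Storage for two items; write to and read back the second one.
storage = bytearray(2 * Float64.itemsize)
descr = Float64.__discover_descr_from_pyobject__(1.25)
descr.__dtype_setitem__((storage, 8), 1.25)
value = descr.__dtype_getitem__((storage, 8))
```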
> 
> 
> Another element of the casting implementation is the ``CastingImpl``:
> 
>     casting = Union["safe", "same_kind", "unsafe"]
> 
>     class CastingImpl:
>         # Object describing and performing the cast
>         casting : casting
> 
>         def resolve_descriptors(self, input : Tuple[DType]) -> (casting, Tuple[DType]):
>             raise NotImplementedError
> 
>         # initially private:
>         def _get_loop(...) -> lowlevel_C_loop:
>             raise NotImplementedError
> 
> which describes the casting from one DType to another. In
> NEP 43 this ``CastingImpl`` object is used unchanged to
> support universal functions.
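
A hedged sketch of what `resolve_descriptors` could do for a cast between two instances of a toy parametric `String` DType. The classes, the convention that the output descriptor may be `None`, and the choice of casting levels are all assumptions for illustration, not the C implementation.

```python
class String:
    """Toy parametric descriptor with a length."""

    def __init__(self, length):
        self.length = length


class StringToStringCast:
    """Toy CastingImpl for String -> String casts."""

    casting = "same_kind"  # weakest guarantee this cast may need

    def resolve_descriptors(self, descrs):
        from_descr, to_descr = descrs
        if to_descr is None:
            # No output instance requested: keep the input length.
            to_descr = String(from_descr.length)
        # Assumption: truncation is "same_kind", everything else "safe".
        if to_descr.length >= from_descr.length:
            level = "safe"
        else:
            level = "same_kind"
        return level, (from_descr, to_descr)


cast = StringToStringCast()
widen, _ = cast.resolve_descriptors((String(4), String(8)))
shrink, _ = cast.resolve_descriptors((String(8), String(4)))
default, (_, out) = cast.resolve_descriptors((String(4), None))
```

The returned level depends on the actual instances, which is why the method returns it alongside the resolved descriptors instead of relying only on the class-level `casting` attribute.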
> 
