[Numpy-discussion] New NEP: merging multiarray and umath

Eric Wieser wieser.eric+numpy at gmail.com
Thu Mar 8 03:47:46 EST 2018


> This means that ndarray needs to know about ufuncs – so instead of a clean
> layering, we have a circular dependency.
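
To make the circular dependency concrete, here is a minimal pure-Python sketch of the bootstrap described in the NEP quoted below: the array type exists first with arithmetic broken, the ufunc is defined afterwards, and the operator is then patched in. Every name in it (`_numeric_ops`, `ndarray`, `add`, `set_numeric_ops`) is a stand-in written for illustration. The real versions are C code in the two extension modules, and this does not claim to mirror their implementation.

```python
# Pure-Python stand-in for the import-time bootstrap. All names are
# hypothetical and only mimic the roles of the two C extension modules.

_numeric_ops = {}  # plays the role of multiarray's internal operator table


class ndarray:
    """Stand-in for the array type defined in multiarray."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # Step 1: until umath registers an implementation, arithmetic is broken.
        op = _numeric_ops.get("add")
        if op is None:
            raise TypeError("arithmetic not initialised: no 'add' registered")
        return op(self, other)


def set_numeric_ops(**ops):
    """Install operator implementations and return the previous table."""
    old = dict(_numeric_ops)
    _numeric_ops.update(ops)
    return old


def add(a, b):
    """Stand-in for the add ufunc defined in umath (step 2)."""
    return ndarray(x + y for x, y in zip(a.data, b.data))


a, b = ndarray([1, 2]), ndarray([3, 4])
try:
    a + b
    broken_before_patch = False
except TypeError:
    broken_before_patch = True

set_numeric_ops(add=add)  # step 3: patch the operators in
result = (a + b).data
```

The point of the sketch is that `ndarray.__add__` cannot work until a later module reaches back in and fills the table, which is the layering problem in miniature.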

Perhaps we should split ndarray into a base_ndarray class with no
arithmetic support (__add__, sum, etc.), and then provide an ndarray subclass
from umath instead (either the separate extension, or just a different set
of files).
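
As a rough illustration of that split, again in pure Python with hypothetical names (the real classes would be C types, and nothing here is NumPy's actual hierarchy): `base_ndarray` carries storage only, and the arithmetic-capable `ndarray` is defined one layer up, next to the ufunc it calls.

```python
# Hypothetical sketch: base_ndarray has storage only; ndarray (defined in the
# "umath" layer) adds arithmetic on top. Not NumPy's actual class hierarchy.


class base_ndarray:
    """Would live in multiarray: storage and indexing, no arithmetic."""

    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)


def add(a, b):
    """Stand-in for the add ufunc; it only reads base_ndarray storage."""
    return ndarray(x + y for x, y in zip(a.data, b.data))


class ndarray(base_ndarray):
    """Would live in umath: the public array type, with operators attached."""

    def __add__(self, other):
        return add(self, other)

    def sum(self):
        total = 0
        for x in self.data:
            total = total + x
        return total


plain = base_ndarray([1, 2, 3])
full = ndarray([1, 2, 3])
has_add_on_base = hasattr(plain, "__add__")
summed = (full + full).sum()
```

In this arrangement the dependency runs one way only: the upper layer imports the lower one, and no operator table has to be patched after the fact.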

On Thu, 8 Mar 2018 at 08:25 Nathaniel Smith <njs at pobox.com> wrote:

> Hi all,
>
> Well, this is something that we've discussed for a while and I think
> generally has consensus already, but I figured I'd write it down
> anyway to make sure.
>
> There's a rendered version here:
>
> https://github.com/njsmith/numpy/blob/nep-0015-merge-multiarray-umath/doc/neps/nep-0015-merge-multiarray-umath.rst
>
> -----
>
> ============================
> Merging multiarray and umath
> ============================
>
> :Author: Nathaniel J. Smith <njs at pobox.com>
> :Status: Draft
> :Type: Standards Track
> :Created: 2018-02-22
>
>
> Abstract
> --------
>
> Let's merge ``numpy.core.multiarray`` and ``numpy.core.umath`` into a
> single extension module, and deprecate ``np.set_numeric_ops``.
>
>
> Background
> ----------
>
> Currently, numpy's core C code is split between two separate extension
> modules.
>
> ``numpy.core.multiarray`` is built from
> ``numpy/core/src/multiarray/*.c``, and contains the core array
> functionality (in particular, the ``ndarray`` object).
>
> ``numpy.core.umath`` is built from ``numpy/core/src/umath/*.c``, and
> contains the ufunc machinery.
>
> These two modules each expose their own separate C API, accessed via
> ``import_multiarray()`` and ``import_umath()`` respectively. The idea
> is that they're supposed to be independent modules, with
> ``multiarray`` as a lower-level layer with ``umath`` built on top. In
> practice this has turned out to be problematic.
>
> First, the layering isn't perfect: when you write ``ndarray +
> ndarray``, this invokes ``ndarray.__add__``, which then calls the
> ufunc ``np.add``. This means that ``ndarray`` needs to know about
> ufuncs – so instead of a clean layering, we have a circular
> dependency. To solve this, ``multiarray`` exports a somewhat
> terrifying function called ``set_numeric_ops``. The bootstrap
> procedure each time you ``import numpy`` is:
>
> 1. ``multiarray`` and its ``ndarray`` object are loaded, but
>    arithmetic operations on ndarrays are broken.
>
> 2. ``umath`` is loaded.
>
> 3. ``set_numeric_ops`` is used to monkeypatch all the methods like
>    ``ndarray.__add__`` with objects from ``umath``.
>
> In addition, ``set_numeric_ops`` is exposed as a public API,
> ``np.set_numeric_ops``.
>
> Furthermore, even when this layering does work, it ends up distorting
> the shape of our public ABI. In recent years, the most common reason
> for adding new functions to ``multiarray``\'s "public" ABI is not that
> they really need to be public or that we expect other projects to use
> them, but rather just that we need to call them from ``umath``. This
> is extremely unfortunate, because it makes our public ABI
> unnecessarily large, and since we can never remove things from it,
> this creates an ongoing maintenance burden. The way C works, you can
> have an internal API that's visible to everything inside the same
> extension module, or you can have a public API that everyone can use;
> you can't have an API that's visible to multiple extension modules
> inside numpy, but not to external users.
>
> We've also increasingly been putting utility code into
> ``numpy/core/src/private/``, which now contains a bunch of files which
> are ``#include``\d twice, once into ``multiarray`` and once into
> ``umath``. This is pretty gross, and is purely a workaround for these
> being separate C extensions.
>
>
> Proposed changes
> ----------------
>
> This NEP proposes three changes:
>
> 1. We should start building ``numpy/core/src/multiarray/*.c`` and
>    ``numpy/core/src/umath/*.c`` together into a single extension
>    module.
>
> 2. Instead of ``set_numeric_ops``, we should use some new, private API
>    to set up ``ndarray.__add__`` and friends.
>
> 3. We should deprecate, and eventually remove, ``np.set_numeric_ops``.
>
>
> Non-proposed changes
> --------------------
>
> We don't necessarily propose to throw away the distinction between
> multiarray/ and umath/ in terms of our source code organization:
> internal organization is useful! We just want to build them together
> into a single extension module. Of course, this does open the door for
> potential future refactorings, which we can then evaluate based on
> their merits as they come up.
>
> This NEP also doesn't propose that we break the public C ABI. We should
> continue to provide ``import_multiarray()`` and ``import_umath()``
> functions – it's just that now both ABIs will ultimately be loaded
> from the same C library. Due to how ``import_multiarray()`` and
> ``import_umath()`` are written, we'll also still need to have modules
> called ``numpy.core.multiarray`` and ``numpy.core.umath``, and they'll
> need to continue to export ``_ARRAY_API`` and ``_UFUNC_API`` objects –
> but we can make one or both of these modules be tiny shims that simply
> re-export the magic API object from wherever it's actually defined.
> (See ``numpy/core/code_generators/generate_{numpy,ufunc}_api.py`` for
> details of how these imports work.)
>
>
> Backward compatibility
> ----------------------
>
> The only compatibility break is the deprecation of ``np.set_numeric_ops``.
>
>
> Alternatives
> ------------
>
> n/a
>
>
> Discussion
> ----------
>
> TBD
>
>
> Copyright
> ---------
>
> This document has been placed in the public domain.
>
>
> --
> Nathaniel J. Smith -- https://vorpus.org