[Numpy-discussion] NEP mask code and the 1.7 release

Sun Apr 22 18:15:01 EDT 2012

We need to decide what to do with the NA masking code currently in
master, vis-a-vis the 1.7 release. While this code is great at what it
does, we don't actually have consensus yet that it's the best way to
give our users what they want/need -- or even an appropriate way. So
we need to figure out how to release 1.7 without committing ourselves
to supporting this design in the future.

Background: what does the code currently in master do?
------------------------------------------------------

It adds 3 pointers at the end of the PyArrayObject struct (which is
better known as the numpy.ndarray object). These new struct members,
and some accessors for them, are exposed as part of the public API.
There are also a few additions to the Python-level API (mask= argument
to np.array, skipna= argument to ufuncs, etc.)

What does this mean for compatibility?
--------------------------------------

The change in the ndarray struct is not as problematic as it might
seem, compatibility-wise, since Python objects are almost always
referred to by pointers. Since the initial part of the struct will
continue to have the same memory layout, existing source and binary
code that works with PyArrayObject *pointers* will continue to work
unchanged.
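
To make the pointer argument concrete, here is a toy sketch. The
struct names (OldArray, NewArray), their fields, and the helper
functions are simplified stand-ins invented for illustration; they are
not the real PyArrayObject definition. The point is only that
appending members leaves the offsets of the existing ones alone, so
code compiled against the old layout still reads the right bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the 1.6-era layout (NOT the real PyArrayObject). */
typedef struct {
    void *data;
    int   nd;
    long *dimensions;
} OldArray;

/* Same layout with three hypothetical mask pointers appended. */
typedef struct {
    void *data;
    int   nd;
    long *dimensions;
    void *mask_dtype;    /* hypothetical field name */
    char *mask_data;     /* hypothetical field name */
    long *mask_strides;  /* hypothetical field name */
} NewArray;

/* Stands in for code compiled against the old header: it only ever
 * touches the shared prefix, and only through a pointer. */
static int old_get_nd(const void *arr)
{
    assert(arr != NULL);
    return ((const OldArray *)arr)->nd;
}

/* Returns 1 if the shared prefix sits at identical offsets. */
static int prefix_layout_matches(void)
{
    return offsetof(OldArray, data) == offsetof(NewArray, data)
        && offsetof(OldArray, nd) == offsetof(NewArray, nd)
        && offsetof(OldArray, dimensions) == offsetof(NewArray, dimensions);
}
```

Calling old_get_nd() on a pointer to a NewArray returns the same nd
value that was stored through the new layout.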

One place where the actual struct size matters is for any C-level
ndarray subclasses, which will have their memory layout change, and
thus will need to be recompiled. (Python-level ndarray subclasses will
have their memory layout change as well -- e.g., they will have
different __dictoffset__ values -- but it's unlikely that any existing
Python code depends on such details.)
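
A rough sketch of why C-level subclasses are affected, assuming the
usual pattern of embedding the base struct by value as the first
member. All names here (BaseV1, BaseV2, SubclassV1, SubclassV2, extra)
are made up for illustration. Growing the base pushes every
subclass-specific field to a new offset, so a subclass compiled
against the old header looks for its own fields in the wrong place.

```c
#include <stddef.h>

/* Toy base struct, old layout (NOT the real PyArrayObject). */
typedef struct { void *data; int nd; } BaseV1;

/* Toy base struct with three hypothetical pointers appended. */
typedef struct { void *data; int nd; void *m0; void *m1; void *m2; } BaseV2;

/* A C-level subclass embeds the base by value, then adds its own fields. */
typedef struct { BaseV1 base; double extra; } SubclassV1;
typedef struct { BaseV2 base; double extra; } SubclassV2;

/* The subclass's own field moves when the base grows (C11 check). */
_Static_assert(offsetof(SubclassV2, extra) > offsetof(SubclassV1, extra),
               "subclass field offset shifts when the base struct grows");
```

A subclass built against BaseV1 but run against a library that
allocates BaseV2-sized bases would read "extra" from inside the new
base fields, which is why a recompile is needed.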

What if we want to change our minds later?
------------------------------------------

For the same reasons as given above, any new code that avoids both the
new mask-related struct fields and the new masking APIs will continue
to work even if the masking is later removed.

Any new code which *does* refer to the new masking APIs, or references
the fields directly, will break if masking is later removed.
Specifically, source will fail to compile, and existing binaries will
silently access memory that is past the end of the PyArrayObject
struct, which will have unpredictable consequences. (Most likely
segfaults, but no guarantees.) This applies even to code which simply
tries to check whether a mask is present.
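
As a sketch of why even a "does this array have a mask?" check is
unsafe in an old binary: once compiled, the check is just a read at a
fixed offset from the object pointer. The struct and field names below
are toy stand-ins, not the real definitions. If masking is removed and
the runtime allocates the smaller struct, that offset lies at or past
the end of the allocation.

```c
#include <stddef.h>

/* Layout the runtime would allocate if masking were removed (toy). */
typedef struct { void *data; int nd; } ArrayNoMask;

/* Layout the already-compiled binary was built against (toy). */
typedef struct { void *data; int nd; char *mask_data; } ArrayWithMask;

/* The compiled mask check reads at offsetof(ArrayWithMask, mask_data),
 * which is not inside an ArrayNoMask-sized allocation (C11 check). */
_Static_assert(offsetof(ArrayWithMask, mask_data) >= sizeof(ArrayNoMask),
               "mask field lies past the end of the smaller struct");
```

Whatever happens to live at that address gets interpreted as the mask
pointer, hence the unpredictable consequences.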

So I think the preconditions for leaving this code as-is for 1.7 are
that we must agree:
  * We are willing to require a recompile of any C-level ndarray
    subclasses (do any exist?)
  * We are willing to make absolutely no guarantees about future
    compatibility for code which uses APIs marked "experimental"
  * We are willing for this breakage to occur in the form of random
    segfaults
  * We are okay with the extra 3 pointers' worth of memory overhead on
    each ndarray

Personally I can live with all of these if everyone else can, but I'm
nervous about reducing our compatibility guarantees like that, and
we'd probably need, at a minimum, a flashier EXPERIMENTAL sign than we
currently have. (Maybe we should resurrect the weasels ;-) [1])

[1] http://mail.scipy.org/pipermail/numpy-discussion/2012-March/061204.html

Any other options?
------------------

Alternative 1: The obvious other option is to go through and move all
the strictly mask-related code out of master and into a branch.
Presumably this wouldn't include all the infrastructure that Mark
added, since a lot of it is e.g. shared with where=, and that would
stay. Even so, this would be a big and possibly time-consuming change.

Alternative 2: After auditing the code a bit, the cleanest third
option I can think of is:

1. Go through and make sure that all numpy-internal access to the new
maskna fields happens via the accessor functions. (This patch would
produce no functionality change.)
2. Move the accessors into some numpy-internal header file, so that
user code can't call them.
3. Remove the mask= argument to Python-level ndarray constructors,
remove the new maskna_ fields from PyArrayObject, and modify the
accessors so that they always return NULL, 0, etc., as if the array
does not have a mask.
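
Roughly what the end state of steps 1-3 might look like, as a sketch
only. The names DemoArray, demo_has_mask, demo_mask_data and demo_sum
are hypothetical and do not correspond to actual numpy functions, and
the "0 means masked" convention is just an assumption for the example.
The mask-aware branch stays in the source, but with the stubbed
accessors it can never be taken.

```c
#include <assert.h>
#include <stddef.h>

/* Toy array struct with the mask fields removed (step 3). */
typedef struct { void *data; int nd; } DemoArray;

/* Hypothetical internal-only accessors (step 2), stubbed so that every
 * array looks like it has no mask (step 3). */
static int demo_has_mask(const DemoArray *arr)
{
    (void)arr;
    return 0;
}

static char *demo_mask_data(const DemoArray *arr)
{
    (void)arr;
    return NULL;
}

/* Internal code reaches the mask only via the accessors (step 1). */
static double demo_sum(const DemoArray *arr, size_t n)
{
    assert(arr != NULL);
    const double *values = (const double *)arr->data;
    const char *mask = demo_has_mask(arr) ? demo_mask_data(arr) : NULL;
    double total = 0.0;

    for (size_t i = 0; i < n; i++) {
        /* Assumed convention: 0 means masked. Dead branch while stubbed. */
        if (mask != NULL && mask[i] == 0) {
            continue;
        }
        total += values[i];
    }
    return total;
}
```

Re-enabling masking would then mean restoring the struct fields and
real accessor bodies, without touching callers like demo_sum().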

This would make 1.7 completely compatible with 1.6, both API- and
ABI-wise. But it would also be a minimal code change, leaving the
mask-related code paths in place but inaccessible. If we decided to
re-enable them, it would just be a matter of reverting steps (3) and
(2).

The main downside I see with this approach is that leaving a bunch of
inaccessible code paths lying around might make it harder to maintain
1.7 as a "long term support" release.

I'm personally willing to implement either of these changes. Or
perhaps there's another option that I'm not thinking of!

-- Nathaniel