[Numpy-discussion] in the NA discussion, what can we agree on?

Benjamin Root ben.root at ou.edu
Thu Nov 3 10:47:53 EDT 2011

On Thu, Nov 3, 2011 at 9:28 AM, Lluís <xscript at gmx.net> wrote:

> Nathaniel Smith writes:
> > 4) There is consensus that whatever approach is taken, there should be
> > a quick and convenient way to identify values that are MISSING,
> > IGNORED, or both. (E.g., functions is_MISSING, is_IGNORED,
> > is_MISSING_or_IGNORED, or some equivalent.)
> Well, maybe it's too low level, but I'd rather decouple the two concepts into
> two orthogonal properties that can be composed:
> * Destructiveness: whether the previous data value is lost whenever you
>   assign a "special" value.
> * Propagation: whether any of these "special" values is propagated or just
>   skipped when performing computations.
> I think we can all agree on the definition of these two properties (where
> bit patterns are destructive and masks are non-destructive), so I'd say that
> the first discussion is establishing whether to expose them as separate
> properties or just expose specific combinations of them:
> * MISSING: destructive + propagating
> * IGNORED: non-destructive + non-propagating
> For example, it makes sense to me to have non-destructive + propagating.
This is sort of how it is currently implemented.  By default, NA
propagates, but it is possible to override this default on an
operation-by-operation basis using the skipna kwarg, and a subclassed array
could implement a __ufunc_wrap__() to default the skipna kwarg to True.
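
To make the two properties concrete, here is a rough analogy using features that already exist in released NumPy. It is not the NA implementation under discussion: NaN stands in for a destructive, propagating value, np.nansum stands in for a per-operation "skip" override, and numpy.ma stands in for a non-destructive mask that is skipped by default.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0])

# Destructive + propagating (analogy: NaN as a bit pattern).
# Overwriting the element loses the old value, and a plain sum propagates NaN.
a = data.copy()
a[1] = np.nan
nan_sum = a.sum()            # propagates: the result is NaN
nan_skipped = np.nansum(a)   # per-operation override: NaN is skipped

# Non-destructive + non-propagating (analogy: a masked array).
# Masking hides the element from reductions but keeps the stored value.
m = np.ma.array(data.copy(), mask=[False, True, False])
masked_sum = m.sum()         # the masked element is skipped
hidden_value = m.data[1]     # the original value is still in the data buffer

print(nan_sum, nan_skipped, masked_sum, hidden_value)
```

The combination Lluís mentions (non-destructive + propagating) has no direct counterpart in this analogy, since numpy.ma reductions skip masked elements.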

> If we take this road, then the next points to discuss should probably be how
> these combinations are expressed:
> * At the array level: all special values behave the same in a specific array,
>   given its properties (e.g., all of them are destructive+propagating).
> * At the value level: each special value conveys a specific combination of
>   the aforementioned properties (e.g., assigning A is destructive+propagating
>   and assigning B is non-destructive+non-propagating).
> * Hybrid: e.g., all special values are destructive, but propagation depends
>   on the specific special value.
> I think this last decision is crucial, as it will have a direct impact on
> performance, numpy code maintainability and 3rd party interface simplicity.
This is actually a very good point, and it bears directly on the types of
implementations that are possible.  Currently, Mark's implementation takes
the first approach (array level). The others are not possible with the
current design.
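
For comparison, the value-level option could look something like the toy sketch below. It is pure Python and purely illustrative: Special, ToyArray, assign, stored and total are hypothetical names made up for this example, not anything in NumPy or in Mark's branch. Each special value carries its own destructive/propagating flags, so one array can mix behaviours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Special:
    """Hypothetical special value carrying its own pair of properties."""
    name: str
    destructive: bool
    propagating: bool


MISSING = Special("MISSING", destructive=True, propagating=True)
IGNORED = Special("IGNORED", destructive=False, propagating=False)


class ToyArray:
    """Toy 1-D container where behaviour is decided per special value."""

    def __init__(self, values):
        self.values = list(values)
        self.flags = [None] * len(self.values)

    def assign(self, index, special):
        # A destructive assignment throws the stored value away;
        # a non-destructive one only records the flag.
        self.flags[index] = special
        if special.destructive:
            self.values[index] = None

    def stored(self, index):
        # Returns the underlying value, or None if it was destroyed.
        return self.values[index]

    def total(self):
        # Propagating specials poison the result; the others are skipped.
        result = 0
        for value, flag in zip(self.values, self.flags):
            if flag is None:
                result += value
            elif flag.propagating:
                return flag
        return result


arr = ToyArray([1, 2, 3])
arr.assign(1, IGNORED)
skipped_total = arr.total()      # IGNORED element is skipped
kept_value = arr.stored(1)       # value survives a non-destructive assignment

arr.assign(0, MISSING)
poisoned_total = arr.total()     # MISSING propagates to the result
lost_value = arr.stored(0)       # value is gone after a destructive assignment
```

An array-level design would instead store the two flags once on the array, which is simpler and cheaper per element but cannot mix MISSING and IGNORED in the same array.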

Ben Root