[Numpy-discussion] missing data discussion round 2

Tue Jun 28 19:39:31 EDT 2011

On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett <matthew.brett at gmail.com> wrote:

> Hi,
>
> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith <njs at pobox.com> wrote:
> ...
> > (You might think, what difference does it make if you *can* unmask an
> > item? Us missing data folks could just ignore this feature. But:
> > whatever we end up implementing is something that I will have to
> > explain over and over to different people, most of them not
> > particularly sophisticated programmers. And there's just no sensible
> > way to explain this idea that if you store some particular value, then
> > it replaces the old value, but if you store NA, then the old value is
> > still there.
>
> Ouch - yes.  No question, that is difficult to explain.   Well, I
> think the explanation might go like this:
>
> "Ah, yes, well, that's because in fact numpy records missing values by
> using a 'mask'.   So when you say `a[3] = np.NA`, what you mean is,
> `a._mask = np.ones(a.shape, np.dtype(bool)); a._mask[3] = False`"
>
> Is that fair?
>

My favorite way of explaining it would be to have a grid of numbers written
on paper, then have several cardboards with holes poked in them in different
configurations. Placing these cardboard masks in front of the grid would
show different sets of non-missing data, without affecting the values stored
on the paper behind them.
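
A minimal sketch of the cardboard picture, assuming the existing numpy.ma
module rather than the np.NA design under discussion. The names grid, card_a
and card_b are made up for illustration. Note that numpy.ma uses True to mean
"covered", which is the opposite sense from the `_mask` sketch quoted above.

```python
import numpy as np

# The "paper": a grid of numbers that is never modified below.
grid = np.arange(12.0).reshape(3, 4)
original = grid.copy()

# Two "cardboards". In numpy.ma, True means the cell is hidden.
card_a = np.zeros(grid.shape, dtype=bool)
card_a[0, 1] = True            # hide a single cell
card_b = np.zeros(grid.shape, dtype=bool)
card_b[2, :] = True            # hide the whole last row

# Placing a cardboard in front of the grid.
view_a = np.ma.masked_array(grid, mask=card_a)
view_b = np.ma.masked_array(grid, mask=card_b)

# Reductions only see the cells showing through the holes.
sum_a = view_a.sum()           # everything except grid[0, 1]
sum_b = view_b.sum()           # first two rows only

# The values on the paper are untouched by either cardboard.
unchanged = np.array_equal(grid, original)
```

The same grid gives different answers depending on which cardboard is in
front of it, and swapping cardboards never changes what is written on the
paper.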

-Mark

>
> See you,
>
> Matthew