[Numpy-discussion] missing data discussion round 2

eat e.antero.tammi at gmail.com
Tue Jun 28 19:00:43 EDT 2011


Hi,

On Wed, Jun 29, 2011 at 1:40 AM, Jason Grout <jason-sage at creativetrax.com> wrote:

> On 6/28/11 5:20 PM, Matthew Brett wrote:
> > Hi,
> >
> > On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith<njs at pobox.com>  wrote:
> > ...
> >> (You might think, what difference does it make if you *can* unmask an
> >> item? Us missing data folks could just ignore this feature. But:
> >> whatever we end up implementing is something that I will have to
> >> explain over and over to different people, most of them not
> >> particularly sophisticated programmers. And there's just no sensible
> >> way to explain this idea that if you store some particular value, then
> >> it replaces the old value, but if you store NA, then the old value is
> >> still there.
> >
> > Ouch - yes.  No question, that is difficult to explain.   Well, I
> > think the explanation might go like this:
> >
> > "Ah, yes, well, that's because in fact numpy records missing values by
> > using a 'mask'.   So when you say `a[3] = np.NA`, what you mean is,
> > `a._mask = np.ones(a.shape, np.dtype(bool)); a._mask[3] = False`"
> >
> > Is that fair?
>
> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys
> the idea that the entry is still there, but we're just ignoring it.  Of
> course, that goes against common convention, but it might be easier to
> explain.
>
This is quite similar to how I have always treated NaNs
(thus postponing all the real, slightly dirty work to the imputation
procedures).
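
A minimal sketch of that habit, assuming a small made-up array and the
NaN-aware reductions found in current NumPy (np.nanmean); the data and
the variable names are hypothetical, and mean imputation is only one
simple choice among many:

```python
import numpy as np

# Hypothetical measurements; NaN marks an entry to be ignored.
a = np.array([1.0, 2.0, np.nan, 4.0])

# NaN-aware reduction: the NaN entry is skipped, whatever caused it.
mean_ignoring_nan = np.nanmean(a)

# The "dirty" work is deferred to an imputation step; here the NaN is
# simply replaced by the mean of the remaining values.
imputed = np.where(np.isnan(a), mean_ignoring_nan, a)
```

Note that the original array is left untouched, so a different
imputation procedure can still be applied to it later.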

For me it has been sufficient to ignore the actual cause of the NaNs. But
I believe there are plenty of other, much more sophisticated situations where
this kind of simple treatment is not sufficient at all. Anyway, even in
the future it should still be possible to play nicely with these kinds of
simple scenarios.

- eat

>
> Thanks,
>
> Jason
>