<br><br><div class="gmail_quote">On Wed, Jun 29, 2011 at 1:32 PM, Matthew Brett <span dir="ltr"><<a href="mailto:matthew.brett@gmail.com">matthew.brett@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Hi,<br>

<br>

On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe <<a href="mailto:mwwiebe@gmail.com">mwwiebe@gmail.com</a>> wrote:<br>

> On Wed, Jun 29, 2011 at 8:20 AM, Lluís <<a href="mailto:xscript@gmx.net">xscript@gmx.net</a>> wrote:<br>

>><br>

>> Matthew Brett writes:<br>

>><br>

>> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys<br>

>> >> the idea that the entry is still there, but we're just ignoring it.  Of<br>

>> >> course, that goes against common convention, but it might be easier to<br>

>> >> explain.<br>

>><br>

>> > I think Nathaniel's point is that np.IGNORE is a different idea than<br>

>> > np.NA, and that is why joining the implementations can lead to<br>

>> > conceptual confusion.<br>

>><br>

>> This is how I see it:<br>

>><br>

>> >>> a = np.array([0, 1, 2], dtype=int)<br>

>> >>> a[0] = np.NA<br>

>> ValueError<br>

>> >>> e = np.array([np.NA, 1, 2], dtype=int)<br>

>> ValueError<br>

>> >>> b  = np.array([np.NA, 1, 2], dtype=np.maybe(int))<br>

>> >>> m  = np.array([np.NA, 1, 2], dtype=int, masked=True)<br>

>> >>> bm = np.array([np.NA, 1, 2], dtype=np.maybe(int), masked=True)<br>

>> >>> b[1] = np.NA<br>

>> >>> np.sum(b)<br>

>> np.NA<br>

>> >>> np.sum(b, skipna=True)<br>

>> 2<br>

>> >>> b.mask<br>

>> None<br>

>> >>> m[1] = np.NA<br>

>> >>> np.sum(m)<br>

>> 2<br>

>> >>> np.sum(m, skipna=True)<br>

>> 2<br>

>> >>> m.mask<br>

>> [False, False, True]<br>

>> >>> bm[1] = np.NA<br>

>> >>> np.sum(bm)<br>

>> 2<br>

>> >>> np.sum(bm, skipna=True)<br>

>> 2<br>

>> >>> bm.mask<br>

>> [False, False, True]<br>

>><br>

>> So:<br>

>><br>

>> * Mask takes precedence over bit pattern on element assignment. There's<br>

>>  still the question of how to assign a bit pattern NA when the mask is<br>

>>  active.<br>

>><br>

>> * When using mask, elements are automagically skipped.<br>

>><br>

>> * "m[1] = np.NA" is equivalent to "m.mask[1] = False"<br>

>><br>

>> * When using bit pattern + mask, it might make sense to have the initial<br>

>>  values as bit-pattern NAs, instead of masked (i.e., "bm.mask == [True,<br>

>>  False, True]" and "np.sum(bm) == np.NA")<br>

<div class="im">><br>

> There seems to be a general idea that masks and NA bit patterns imply<br>

> particular differing semantics, something which I think is simply false.<br>

<br>

</div>Well - first - it's helpful surely to separate the concepts and the<br>

implementation.<br>

<br>

Concepts / use patterns (as delineated by Nathaniel):<br>

A) missing values == 'np.NA' in my emails.  Can we call that CMV<br>

(concept missing values)?<br>

B) masks == np.IGNORE in my emails.  CMSK (concept masks)?<br>

<br>

Implementations<br>

1) bit-pattern == na-dtype - how about we call that IBP<br>

(implementation bit pattern)?<br>

2) array.mask.  IM (implementation mask)?<br>

<br></blockquote><div><br>Remember that the masks are invisible: you can't see them, because they are an implementation detail. A good reason to hide the implementation is so that it can be changed without impacting software that depends on the API.<br>
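<br>As a rough illustration of that point, here is a minimal pure-Python sketch, not NumPy code and not the proposed np.NA API. The class name HiddenNA and its use_mask flag are hypothetical, and NaN merely stands in for an NA bit pattern. The caller uses the same sum() call whichever storage strategy sits underneath:<br>

```python
import math


class HiddenNA:
    """Hypothetical container, not a NumPy API: callers cannot tell
    which storage strategy is in use."""

    def __init__(self, values, missing, use_mask=True):
        self._use_mask = use_mask
        if use_mask:
            # Mask implementation: payload kept intact, flags stored separately.
            self._data = [float(v) for v in values]
            self._missing = [bool(m) for m in missing]
        else:
            # Bit-pattern implementation: NaN stands in for the NA pattern,
            # so the original payload of a missing entry is overwritten.
            self._data = [math.nan if m else float(v)
                          for v, m in zip(values, missing)]
            self._missing = None

    def _is_missing(self, i):
        if self._use_mask:
            return self._missing[i]
        return math.isnan(self._data[i])

    def sum(self, skipna=False):
        present = [x for i, x in enumerate(self._data)
                   if not self._is_missing(i)]
        if not skipna and len(present) != len(self._data):
            # Propagate: one missing entry makes the whole sum missing.
            return math.nan
        return sum(present)


masked = HiddenNA([0, 1, 2], [True, False, False], use_mask=True)
pattern = HiddenNA([0, 1, 2], [True, False, False], use_mask=False)
print(masked.sum(skipna=True), pattern.sum(skipna=True))
print(masked.sum(), pattern.sum())
```

Both objects give the same answers through the public method, so under these assumptions the storage could be swapped without touching calling code.<br>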

<br><snip><br><br>Chuck</div></div><br>