[Numpy-discussion] Missing data again

Charles R Harris charlesr.harris at gmail.com
Wed Mar 7 13:48:05 EST 2012


On Wed, Mar 7, 2012 at 11:21 AM, Lluís <xscript at gmx.net> wrote:

> Charles R Harris writes:
> [...]
> > One inconvenience I have run into with the current API is that it
> > should be easier to clear the mask from an "ignored" value without
> > taking a new view or assigning known data.
>
> AFAIR, the inability to directly access a "mask" attribute was
> intentional to make bit-patterns and masks indistinguishable from the
> POV of the array user.
>
> What's the workflow that leads you to un-ignore specific elements?
>
>
>
Because they are not 'unknown', just (temporarily) 'ignored'. This might be
the case if you are experimenting with what happens if certain data is left
out of a fit. The current implementation tries to handle both of these
cases, and can do so; I would just like the 'ignored' use to be more
convenient than it is.
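
To illustrate the 'ignored' workflow, here is a rough sketch that uses the
existing numpy.ma module as a stand-in. That is an assumption made for the
example, not the np.NA API under discussion, and the data values are made
up. It ignores one point for a fit, then un-ignores it with the stored
value still intact.

```python
import numpy as np

# Made-up data: a straight line with one suspicious (but known) value.
x = np.arange(6, dtype=float)
y = 2.0 * x + 1.0
y[3] = 100.0

ym = np.ma.array(y)        # nothing ignored yet
ym[3] = np.ma.masked       # temporarily ignore element 3; data is untouched

# np.ma.polyfit drops the masked point before fitting.
slope, intercept = np.ma.polyfit(x, ym, 1)

# Un-ignore element 3 by writing back a modified copy of the mask.
new_mask = np.ma.getmaskarray(ym).copy()
new_mask[3] = False
ym.mask = new_mask

restored = ym[3]           # the original value is visible again
```

Note that clearing the mask here takes three lines and a temporary copy,
which is roughly the kind of inconvenience described above.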


> > So maybe two types of masks (different payloads), or an additional
> > flag could be helpful.
>
> Do you mean different NA values? If that's the case, I think it was
> taken into account when implementing the current mechanisms (and was
> also mentioned in the NEP), so that it could be supported by both
> bit-patterns and masks (as one of the main design points was to make
> them indistinguishable in the common case).
>
>
No, the mask as currently implemented is eight bits and can be extended to
handle different mask values, aka, payloads.


> I think the name was "parametrized dtypes".
>
>
They don't interest me in the least. But that is a whole different area of
discussion.


>
> > The process of assigning masks could also be made a bit easier than using
> > fancy indexing.
>
> I don't get what you mean here, sorry.
>
>
Suppose I receive a data set, say an hdf file, that also includes a mask.
I'd like to load the data and apply the mask directly without doing
something like

data[mask] = np.NA


> Do you mean here that this is too cumbersome to use?
>
>    >>> a[a < 5] = np.NA
>
> (obviously oversimplified example where everything looks sufficiently
> simple :))
>
>
Mostly speed and memory.
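
For comparison, a minimal sketch of attaching a preexisting mask at
construction time, again assuming numpy.ma as a stand-in rather than the
np.NA machinery. The `data` and `mask` arrays below are hypothetical
placeholders for what would be read from the file.

```python
import numpy as np

# Hypothetical stand-ins for a dataset and the boolean mask stored with it.
data = np.arange(6, dtype=float)
mask = np.array([False, True, False, False, True, False])

# Attach the mask directly; copy=False asks numpy.ma not to duplicate data.
masked = np.ma.masked_array(data, mask=mask, copy=False)

total = masked.sum()       # masked elements are skipped
n_valid = masked.count()   # number of unmasked elements
shares = np.shares_memory(masked.data, data)
```

No fancy-indexing assignment is involved, and the data buffer is shared
rather than copied.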

Chuck