[Numpy-discussion] Missing data wrap-up and request for comments
Charles R Harris
charlesr.harris at gmail.com
Wed May 9 13:08:26 EDT 2012
On Wed, May 9, 2012 at 10:46 AM, Travis Oliphant <travis at continuum.io> wrote:
> Hey all,
> Nathaniel and Mark have worked very hard on a joint document to try and
> explain the current status of the missing-data debate. I think they've
> done an amazing job at providing some context, articulating their views and
> suggesting ways forward in a mutually respectful manner. This is an
> exemplary collaboration and is at the core of why open source is valuable.
> The document is available here:
> After reading that document, it appears to me that there are some
> fundamentally different views on how things should move forward. I'm also
> reading the document through my own understanding of the history of
> NumPy and of the users I've met and interacted with, which means
> I have my own perspective that is not necessarily incorporated into that
> document but informs my recommendations. I'm not sure we can reach full
> consensus on this. We are also well past time for moving forward with a
> resolution on this (perhaps we can all agree on that).
> I would like one more discussion thread where the technical discussion can
> take place. I will make a plea that we keep this discussion as free from
> logical fallacies (http://en.wikipedia.org/wiki/Logical_fallacy) as we can.
> I can't guarantee that I personally will succeed at that, but I can tell
> you that I will try. That's all I'm asking of anyone else. I recognize
> that there are a lot of other issues at play here besides *just* the
> technical questions, but we are not going to resolve every community issue
> in this technical thread.
> We need concrete proposals and so I will start with three. Please feel
> free to comment on these proposals or add your own during the discussion.
> I will stop paying attention to this thread next Wednesday (May 16th) (or
> earlier if the thread dies) and hope that by that time we can agree on a
> way forward. If we don't have agreement, then I will move forward with
> what I think is the right approach. I will either write the code myself
> or convince someone else to write it.
> In all cases, we have agreement that bit-pattern dtypes should be added to
> NumPy. We should work on these (int32, float64, complex64, str, bool)
> to start. So, the three proposals are independent of this way forward.
> The proposals are all about the extra mask part:
> My three proposals:
> * do nothing and leave things as is
> * add a global flag that turns off masked array support by default but
> otherwise leaves things unchanged (I'm still unclear how this would work)
> * move Mark's "masked ndarray objects" into a new fundamental type
> (ndmasked), leaving the actual ndarray type unchanged. The array_interface
> keeps the masked array notions and the ufuncs keep the ability to handle
> arrays like ndmasked. Ideally, numpy.ma would be changed to use
> ndmasked objects as their core.
numpy.ma is unmaintained and I don't see that changing anytime soon. As
you know, I would prefer 1), but 2) is a good compromise, and the
infrastructure for such a flag could be useful for other things, although like
you I'm not sure how it would be implemented. I don't understand your
proposal for 3), but from the description I don't see that it buys anything.
> For the record, I'm currently in favor of the third proposal. Feel free
> to comment on these proposals (or provide your own).
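For anyone reading this thread without the background, the distinction between the bit-pattern approach and "the extra mask part" can be sketched roughly as follows. This is a library-free illustration only: the function names (sum_bitpattern, sum_masked) and the choice of NaN as the missing marker are assumptions made for the example, and nothing here is NumPy's or numpy.ma's actual API.

```python
import math

# Hypothetical illustration only; none of these names are NumPy APIs.

def sum_bitpattern(values):
    """Sum floats, treating NaN as the in-band 'missing' marker."""
    return sum(v for v in values if not math.isnan(v))

def sum_masked(values, mask):
    """Sum values whose mask entry is False (True means 'missing')."""
    return sum(v for v, m in zip(values, mask) if not m)

# Bit-pattern style: the missing marker replaces the stored value.
bitpattern_data = [1.0, math.nan, 3.0]

# Mask style: the stored value is kept alongside a separate flag.
masked_data = [1.0, 2.0, 3.0]
mask = [False, True, False]

bitpattern_total = sum_bitpattern(bitpattern_data)
masked_total = sum_masked(masked_data, mask)

# Clearing the flag exposes the hidden value again; with a bit
# pattern the original value is gone once it has been overwritten.
mask[1] = False
unmasked_total = sum_masked(masked_data, mask)
```

Both styles skip the missing entry when summing, but only the mask style keeps the underlying value around, which is roughly why the mask is the part under debate while the bit-pattern dtypes are already agreed on.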