[Numpy-discussion] Missing data wrap-up and request for comments

Travis Oliphant travis at continuum.io
Wed May 9 18:12:05 EDT 2012


On re-reading, I want to make a couple of things clear:   

	1) This "wrap-up" discussion is *only* about what to do for NumPy 1.7 in such a way that we don't tie our hands in the future. I do not believe we can figure out what to do for masked arrays in one short week. What happens beyond NumPy 1.7 should still be discussed and explored. My urgency is entirely about moving forward from where we are in master right now in a direction that we can all accept. The tight timeline is so that we do *something* and move forward.

	2) I missed another possible proposal for NumPy 1.7, which is in the write-up that Mark and Nathaniel made: remove the masked array additions entirely, possibly moving them to another module like numpy-dtypes.

Again, these are only for NumPy 1.7.   What happens in any future NumPy and beyond will depend on who comes to the table for both discussion and code-development. 

Best regards,

-Travis



On May 9, 2012, at 11:46 AM, Travis Oliphant wrote:

> Hey all, 
> 
> Nathaniel and Mark have worked very hard on a joint document to try and explain the current status of the missing-data debate.   I think they've done an amazing job at providing some context, articulating their views and suggesting ways forward in a mutually respectful manner.   This is an exemplary collaboration and is at the core of why open source is valuable. 
> 
> The document is available here: 
>    https://github.com/numpy/numpy.scipy.org/blob/master/NA-overview.rst
> 
> After reading that document, it appears to me that there are some fundamentally different views on how things should move forward. I'm also reading the document with my own understanding of the history of NumPy, as well as of all the users I've met and interacted with, which means I have my own perspective that is not necessarily incorporated into that document but informs my recommendations. I'm not sure we can reach full consensus on this. We are also well past time for moving forward with a resolution on this (perhaps we can all agree on that).
> 
> I would like one more discussion thread where the technical discussion can take place. I will make a plea that we keep this discussion as free from logical fallacies (http://en.wikipedia.org/wiki/Logical_fallacy) as we can. I can't guarantee that I personally will succeed at that, but I can tell you that I will try. That's all I'm asking of anyone else. I recognize that there are a lot of other issues at play here besides *just* the technical questions, but we are not going to resolve every community issue in this technical thread.
> 
> We need concrete proposals and so I will start with three.   Please feel free to comment on these proposals or add your own during the discussion.    I will stop paying attention to this thread next Wednesday (May 16th) (or earlier if the thread dies) and hope that by that time we can agree on a way forward.  If we don't have agreement, then I will move forward with what I think is the right approach.   I will either write the code myself or convince someone else to write it. 
> 
> In all cases, we have agreement that bit-pattern dtypes should be added to NumPy.      We should work on these (int32, float64, complex64, str, bool) to start.    So, the three proposals are independent of this way forward.   The proposals are all about the extra mask part:  
> 
> My three proposals: 
> 
> 	* do nothing and leave things as is 
> 
> 	* add a global flag that turns off masked array support by default but otherwise leaves things unchanged (I'm still unclear how this would work exactly)
> 
> 	* move Mark's "masked ndarray objects" into a new fundamental type (ndmasked), leaving the actual ndarray type unchanged. The array_interface keeps the masked array notions and the ufuncs keep the ability to handle arrays like ndmasked. Ideally, numpy.ma would be changed to use ndmasked objects as its core.
> 
> For the record, I'm currently in favor of the third proposal.   Feel free to comment on these proposals (or provide your own). 
> 
> Best regards,
> 
> -Travis
> 
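
The quoted message separates two ideas: bit-pattern dtypes, where a reserved value inside the data marks an element as missing, and the "extra mask part", where a separate boolean array marks it. A minimal sketch of that distinction follows. It uses plain Python lists rather than NumPy, and the helper names `mean_bitpattern` and `mean_masked` are hypothetical illustrations, not part of NumPy or of any proposal in this thread.

```python
import math


def mean_bitpattern(values):
    """Bit-pattern style: a reserved value in the data (here NaN) means 'missing'."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present)


def mean_masked(values, mask):
    """Mask style: a separate boolean sequence marks 'missing' (True = missing)."""
    present = [v for v, m in zip(values, mask) if not m]
    return sum(present) / len(present)


# Bit-pattern form: the missing slot holds the sentinel, so no payload survives.
data_bitpattern = [1.0, math.nan, 3.0]

# Mask form: the payload under the mask (999.0) is still stored.
data_masked = [1.0, 999.0, 3.0]
mask = [False, True, False]

bp = mean_bitpattern(data_bitpattern)
mk = mean_masked(data_masked, mask)

# Clearing the mask brings the hidden value back into the computation.
mask[1] = False
unmasked = mean_masked(data_masked, mask)
```

Both forms give the same mean while the element is treated as missing. The difference is that the mask form can be undone, because the underlying value is kept alongside the mask.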
