[Numpy-discussion] consensus (was: NA masks in the next numpy release?)

Matthew Brett matthew.brett at gmail.com
Sat Oct 29 17:36:08 EDT 2011


Hi,

On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
> <ralf.gommers at googlemail.com> wrote:
>>
>>
>> On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett <matthew.brett at gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
>>> <ralf.gommers at googlemail.com> wrote:
>>> >
>>> >
>>> > On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett <matthew.brett at gmail.com>
>>> > wrote:
>>> >>
>>> >> Hi,
>>> >>
>>> >> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
>>> >> <ralf.gommers at googlemail.com> wrote:
>>> >> >
>>> >> >
>>> >> > On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
>>> >> > <matthew.brett at gmail.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
>>> >> >> <charlesr.harris at gmail.com> wrote:
>>> >> >> >>
>>> >> >>
>>> >> >> No, that's not what Nathaniel and I are saying at all. Nathaniel was
>>> >> >> pointing to links for projects that care that everyone agrees before
>>> >> >> they go ahead.
>>> >> >
>>> >> > It looked to me like there was a serious intent to come to an
>>> >> > agreement, or at least closer together. The discussion in the summer
>>> >> > was going around in circles though, and was too abstract and complex
>>> >> > to follow. Therefore Mark's choice of implementing something and then
>>> >> > asking for feedback made sense to me.
>>> >>
>>> >> I should point out that the implementation hasn't - as far as I can
>>> >> see - changed the discussion.  The discussion was about the API.
>>> >>
>>> >> Implementations are useful for agreed APIs because they can point out
>>> >> where the API does not make sense or cannot be implemented.  In this
>>> >> case, Mark implemented the API he said he was going to implement, at
>>> >> least as far as I can see.  Again, I'm happy to be corrected.
>>> >
>>> > Implementations can also help the discussion along, by allowing people
>>> > to try out some of the proposed changes. They also allow one to
>>> > construct examples that show weaknesses, possibly to be solved by an
>>> > alternative API. Maybe you can hold the complete history of this topic
>>> > in your head and comprehend it, but for me it would be very helpful if
>>> > someone said:
>>> > - here's my dataset
>>> > - this is what I want to do with it
>>> > - this is the best I can do with the current implementation
>>> > - here's how API X would allow me to solve this better or simpler
>>> > This can be done much better with actual data and an actual
>>> > implementation than with a design proposal. You seem to disagree with
>>> > this statement. That's fine. I would hope though that you recognize
>>> > that concrete examples help people like me, and construct one or two
>>> > to help us out.
>>>
>>> That's what use-cases are for in designing APIs.  There are examples
>>> of use in the NEP:
>>>
>>> https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
>>>
>>> the alterNEP:
>>>
>>> https://gist.github.com/1056379
>>>
>>> and my longer email to Travis:
>>>
>>>
>>> http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
>>>
>>> Mark has done a nice job of documentation:
>>>
>>> http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
>>>
>>> If you want to understand what the alterNEP case is, I'd suggest the
>>> email, just because it's the most recent and I think the terminology
>>> is slightly clearer.
>>>
>>> Doing the same examples on a larger array won't make the point easier
>>> to understand.  The discussion is about what the right concepts are,
>>> and you can help by looking at the snippets of code in those
>>> documents, and deciding for yourself whether you think the current
>>> masking / NA implementation seems natural and easy to explain, or
>>> rather forced and difficult to explain, and then email back trying to
>>> explain your impression (which is not always easy).
>>
>> If you seriously believe that looking at a few snippets is as helpful and
>> instructive as being able to play around with them in IPython and modify
>> them, then I guess we won't make progress in this part of the discussion.
>> You're just telling me to go back and re-read things I'd already read.
>
> The snippets are in ipython or doctest format - aren't they?

Oops - 10 minute rule.  Now I see that you mean that you can't
experiment with the alternative implementation without working code.
That's true, but I am hoping that the difference between - say:

a[0:2] = np.NA

and

a.mask[0:2] = False

would be easy enough to imagine.  If it isn't, then let me know,
preferably with something like "I can't see exactly how the following
[code snippet] would work in your conception of the problem" - and
then I can either try to give fake examples, or write a mock-up.
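
As a rough illustration of the two styles above, here is a minimal sketch
that uses numpy.ma as a stand-in. It is not the NEP or alterNEP
implementation. The `visible` array is a hypothetical name, and it assumes,
as the `a.mask[0:2] = False` snippet suggests, that False marks a hidden
element. Note that numpy.ma uses the opposite convention (True means
masked), hence the inversion.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])

# Style 1: assign a "missing" marker into the array itself.
# np.ma.masked stands in for np.NA here (an assumption for illustration).
a = np.ma.masked_array(data.copy())
a[0:2] = np.ma.masked

# Style 2: keep the values untouched and edit a separate mask.
# `visible` is a hypothetical name; False means "hidden" in this sketch.
visible = np.ones(data.shape, dtype=bool)
visible[0:2] = False
b = np.ma.masked_array(data.copy(), mask=~visible)  # numpy.ma: True == masked

# Both styles skip the first two elements in a reduction.
a_sum = a.sum()
b_sum = b.sum()

# In style 2 the hidden values are still stored and can be revealed again.
visible[0:2] = True
b_unmasked = np.ma.masked_array(b.data, mask=~visible)
```

In this sketch both reductions ignore the first two elements; the
difference is whether the stored values are thought of as gone (style 1)
or merely hidden (style 2).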

Best,

Matthew


