[Numpy-discussion] Missing Values Discussion

Bruce Southey bsouthey at gmail.com
Sun Jul 10 22:52:29 EDT 2011


On Fri, Jul 8, 2011 at 4:35 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Fri, Jul 8, 2011 at 8:34 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>> On Fri, Jul 8, 2011 at 12:55 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>>> Hi,
>>>
>>> On Fri, Jul 8, 2011 at 6:38 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>>>> On 07/08/2011 08:58 AM, Matthew Brett wrote:
>>>>> Hi,
>>>>>
>>>>> Just checking - but is this:
>>>>>
>>>>> On Fri, Jul 8, 2011 at 2:22 PM, Bruce Southey<bsouthey at gmail.com>  wrote:
>>>>> ...
>>>>>> The one thing that we do need now is the code that implements the small
>>>>>> set of core ideas (array creation and simple numerical operations).
>>>>>> Hopefully that will provide a better grasp of the concepts and the
>>>>>> performance differences to determine the acceptability of the approach(es).
>>>>> in reference to this:
>>>>>
>>>>>> On 07/08/2011 07:15 AM, Matthew Brett wrote:
>>>>> ...
>>>>>>> Can I ask - what do you recommend that we do now, for the discussion?
>>>>>>> Should we be quiet and wait until there is code to test, or, as
>>>>>>> Nathaniel has tried to do, work at reaching some compromise that makes
>>>>>>> sense to some or all parties?
>>>>> ?
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Matthew
>>>> Simply, I think the time for discussion has passed and it is now time to
>>>> see the 'cards'. I do not know enough (or anything) about the
>>>> implementation so I need code to know the actual 'cost' of Mark's idea
>>>> with real situations.
>>>
>>> Yes, I thought that was what you were saying.
>>>
>>> I disagree and think that discussion of the type that Nathaniel has
>>> started is a useful way to think more clearly and specifically about
>>> the API and what can be agreed.
>>>
>>> Otherwise we will come to the same impasse when Mark's code arrives.
>>> If that happens, we'll either lose the code because the merge is
>>> refused, or be forced into something that may not be the best way
>>> forward.
>>>
>>> Best,
>>>
>>> Matthew
>>
>>
>> Unfortunately we need code from either side as an API etc. is not
>> sufficient to judge anything.
>
> If I understand correctly, we are not going to get code from either
> side, we are only going to get code from one side.

That would be very unfortunate indeed.

>
> I cannot now see how the code will inform the discussion about the
> API, unless it turns out that the proposed API cannot be implemented.
>  The substantial points are not about memory use or performance, but
> about how the API should work.  If you can see some way that the code
> will inform the discussion, please say, I would honestly be grateful.

APIs are not my area or even a concern.  I am an end user, so the code
has to work correctly with acceptable performance and memory usage. To
that end I have to know whether doing a+b is faster, and uses less
memory, than first creating new arrays c and d without missing values
and then doing c+d. My limited understanding of the masked approach is
that the former should be faster than the latter, with some acceptable
increase in memory usage. With the miniNEP approach, I do not see that
there will be benefits, because the function will have to find the
missing values and handle them appropriately, which may be a 'killer'
for integer arrays.
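
The comparison above can be sketched roughly as follows. This is only an
illustration that uses the existing numpy.ma masked arrays as a stand-in
for the masked approach; it is not the proposed implementation, and the
array values and masks are made up for the example.

```python
import numpy as np
import numpy.ma as ma

# Made-up integer data; True in a mask marks a missing value.
a_data = np.array([1, 2, 3, 4, 5])
b_data = np.array([10, 20, 30, 40, 50])
a_mask = np.array([False, True, False, False, False])
b_mask = np.array([False, False, False, True, False])

# Route 1: keep the missing values in place and add masked arrays (a+b).
a = ma.masked_array(a_data, mask=a_mask)
b = ma.masked_array(b_data, mask=b_mask)
masked_sum = a + b  # result is masked wherever either input is masked

# Route 2: first build new arrays c and d without the missing values,
# then add the plain arrays (c+d).
keep = ~(a_mask | b_mask)
c = a_data[keep]
d = b_data[keep]
plain_sum = c + d

# Both routes agree on the positions where neither input is missing.
assert np.array_equal(masked_sum.compressed(), plain_sum)
```

Wrapping each route in timeit on realistically sized arrays would give
the speed side of the comparison; memory use would need to be measured
separately.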

>
>> But I do not think we will be forced
>> into anything as in the extreme situation you can keep old versions or
>> fork the code in the really extreme case.
>
> That would be a terrible waste, and potentially damaging to the
> community, so of course we want to do all we can to avoid those
> outcomes.
>
> Best,
>
> Matthew

So I have to support anybody who wants to try a new change, especially
one that would remove my 'bane' of having functions automatically
handle masked arrays.

Bruce


