[Numpy-discussion] numpy error handling

Tim Hochberg tim.hochberg at cox.net
Sat Apr 1 14:01:04 EST 2006


Travis Oliphant wrote:

> Tim Hochberg wrote:
>
>>>
>>> You can get the numarray approach back simply by setting the error 
>>> mode in the builtin scope (instead of in the local scope, which is 
>>> done by default).
>>
>>
>> I saw that you could set it at different levels, but missed the 
>> implications. However, it's still missing one feature: thread-local 
>> storage. I would argue that the __builtin__ data should actually be 
>> stored in threading.local() instead of __builtin__. Then you could 
>> set up an equivalent stack system to numpy's.
>
> Yes, the per-thread storage escaped me.  But threading.local() only 
> exists in Python 2.4, and NumPy is supposed to be compatible with 
> Python 2.3.
>
> What about PyThreadState_GetDict() ? and then default to use the 
> builtin dictionary if this returns NULL?

That sounds reasonable. I've never used that, but the name sounds promising!

> I'm actually not particularly enthused about the three name-space 
> lookups.  Changing it to only one place to look may be better.  It 
> would require a set-and-restore operation.  A stack could be 
> used, but why not just use local variables?  E.g.:
>
> save = numpy.seterr(dividebyzero='warn')
>
> ...
>
> numpy.seterr(restore=save)

That would work as well, I think. It gets a little hairy if you want to 
nest error settings within a single function, but I've never done that, 
so I'm not too worried about it. Besides, what I really want to support 
is 'with', which I imagine we can build on top of the above.
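
Something like this sketch is what I have in mind, once 'with' exists 
(assuming seterr hands back the old settings and accepts restore= as 
above; the errstate class itself is made up):

    import numpy

    class errstate(object):
        """Hypothetical context manager over the proposed
        save/restore form of numpy.seterr."""
        def __init__(self, **kwargs):
            self.kwargs = kwargs
        def __enter__(self):
            # Stash whatever settings were in effect before.
            self.saved = numpy.seterr(**self.kwargs)
        def __exit__(self, *exc_info):
            # Put the old settings back, even on an exception.
            numpy.seterr(restore=self.saved)

    # Eventually:
    #     with errstate(dividebyzero='warn'):
    #         ...  # code that needs the special handling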

>> I've used the numarray error handling stuff for some time. My 
>> experience with it has led me to the following conclusions:
>>
>>   1. You don't use it that often. I have about 26 KLOC that's "active",
>>      and in that I use pushMode just 15 times. For comparison, I use
>>      asarray a tad over 100 times.
>>   2. pushMode and popMode, modulo spelling, are the way to set errors.
>>      Once the with statement is around, that will be even better.
>>   3. I, personally, would be very unlikely to use the local and global
>>      error handling. I'd just as soon see them go away, particularly if
>>      it helps performance, but I won't lobby for it.
>>
>
> This is good feedback.  I have almost zero experience with changing 
> the error handling, so I'm not sure what features are desirable.  
> Eliminating unnecessary name-lookups is usually a good thing.


I hope some of the other numarray users chime in. A sample of one is not 
very good data!
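
For anyone who hasn't used numarray's version, the idiom I'm counting 
above looks roughly like this (from memory, so the exact keyword 
spellings may be off):

    import numarray

    # Push a temporary error mode, do the work, then pop back to
    # whatever was in effect before.
    numarray.Error.pushMode(dividebyzero="warn", underflow="ignore")
    try:
        pass  # the numeric code that needs the special handling
    finally:
        numarray.Error.popMode()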

>> In numarray, the stack is in the numarray module itself (actually in 
>> the Error object). They base their thread-local behaviour off of 
>> thread.get_ident, not threading.local.  That's not clunky at all, 
>> although it's arguably wrong, since thread.get_ident can reuse ids 
>> from dead threads. In practice it's probably hard to get into trouble 
>> doing this, but I still wouldn't emulate it. I think that this was 
>> written before thread-local storage existed, so it was probably the 
>> best that could be done.
>
>
> Right, but thread-local storage is still Python 2.4-only...
>
> What about PyThreadState_GetDict() ?

That sounds reasonable. Essentially we would be rolling our own 
threading.local().
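
For the Python 2.3 case I'm picturing something like this pure-Python 
sketch (the C version would key off PyThreadState_GetDict() instead; 
all the names here are invented):

    import thread

    _error_modes = {}   # thread id -> that thread's error-mode settings
    _default_mode = {'dividebyzero': 'warn', 'overflow': 'warn',
                     'underflow': 'ignore', 'invalid': 'warn'}

    def _get_mode():
        # Fall back to the defaults if this thread never set anything.
        # (Same caveat as numarray: ids of dead threads can be reused.)
        return _error_modes.get(thread.get_ident(), _default_mode)

    def _set_mode(**settings):
        mode = _get_mode().copy()
        mode.update(settings)
        _error_modes[thread.get_ident()] = mode
        return mode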

>>
>> However, if you use threading.local, it will be clunky in a similar 
>> sense. You'll be storing data in a global namespace you don't 
>> control, and you've got to hope that no one stomps on your variable name. 
>
> The PyThreadState_GetDict() documentation states that extension module 
> writers should use a unique name based on their extension module.
>
>> When you have local and module-level secret storage names as well, 
>> you're just doing a lot more of that, and the chance of collision and 
>> confusion goes up from almost zero to very small.
>
> This is true.   Similar to the C-variable naming issues.
>
>>> So, we should at least frame the discussion in terms of what is 
>>> actually possible.
>>
>>
>> Yes, sorry for spreading misinformation.
>
>
> But you did point out the very important thread-local storage fact 
> that I had missed.   This alone makes me willing to revamp what we are 
> doing.
>
>>
>> In this case, overflow, underflow and dividebyzero seem pretty 
>> self-documenting to me, and 'invalid' is pretty cryptic in both 
>> implementations. This may be a matter of taste, but I tend to prefer 
>> short, pithy names for functions that I use a lot, or that get crammed 
>> several to a line. For functions like this, that are more rarely used 
>> and get a full line to themselves, I lean towards the more verbose.
>
>
> The rarely-used factor is a persuasive argument. 
>
>> Can you elaborate on this a bit? Reading between the lines, there 
>> seem to be two issues related to speed here.  One is the actual 
>> namespace lookup of the error mode -- there's a setting that says we 
>> are using the defaults, so don't bother to look. This saves the 
>> namespace lookup.  Changing the defaults shouldn't affect the timing 
>> of that. I'm not sure how this would interact with thread local 
>> storage though.
>>
>> The second issue is that running the core loop with no checks in 
>> place is faster.
>
> Basically, on the C level, the error mode is an integer with specific 
> bits allocated to the various error possibilities (2 bits per 
> possibility).  If this is 0, then the error checking is not even done 
> (thus no error handling at all).
> Yes, the name-lookup optimization could work with any defaults (but it 
> couldn't work with thread-specific storage anyway).
>
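
Just to check that I follow, something like this, with two bits per 
error class and 0 meaning "skip the checks entirely"?  (The layout and 
values below are invented for illustration.)

    # Hypothetical layout: 2 bits per error class.
    ERR_IGNORE, ERR_WARN, ERR_RAISE = 0, 1, 2

    SHIFT_DIVIDEBYZERO = 0
    SHIFT_OVERFLOW     = 2
    SHIFT_UNDERFLOW    = 4
    SHIFT_INVALID      = 6

    def pack_mode(dividebyzero=ERR_IGNORE, overflow=ERR_IGNORE,
                  underflow=ERR_IGNORE, invalid=ERR_IGNORE):
        return (dividebyzero << SHIFT_DIVIDEBYZERO |
                overflow     << SHIFT_OVERFLOW     |
                underflow    << SHIFT_UNDERFLOW    |
                invalid      << SHIFT_INVALID)

    # pack_mode() == 0 means the inner loops never look at the flags.
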
> One question I have about threads and error handling, though.  Right 
> now, the ufuncs release the Python lock during computation (and 
> re-acquire it to do error handling if needed).  If another ufunc was 
> started by another Python thread and ran with different error 
> handling, wouldn't the IEEE flags get confused about which ufunc was 
> setting what?  The flags are only checked after each 1-d loop.  If 
> another thread set the processor flag, the current thread could get 
> very confused.
>
> This seems like a problem that I'm not sure how to handle. 

Yeah, me neither. It seems that somehow we'll need to block until all 
current operations are done, but I don't know how to do that off the top 
of my head. Perhaps ufuncs need to lock the flags when they start and 
release them when they finish. This looks feasible, but I'm not sure of 
the proper incantation to get it right. The ufuncs would all need to 
be able to increment and decrement the lock, whatever it is, even 
though they are in different threads. Meanwhile, the setting code should 
only be able to run when the lock is unheld. It's some sort of 
poly-thread counting-lock thing. I'll think about it; perhaps there's an 
obvious way.
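
In Python terms, just to convince myself the idea is coherent (the real 
thing would have to live in C, and all these names are invented), I'm 
imagining something like:

    import threading

    class FlagGuard(object):
        """Ufuncs bump a counter while they run; settings changes
        wait until the counter drains back to zero."""
        def __init__(self):
            self.cond = threading.Condition()
            self.active = 0

        def enter_ufunc(self):
            self.cond.acquire()
            self.active += 1
            self.cond.release()

        def exit_ufunc(self):
            self.cond.acquire()
            self.active -= 1
            self.cond.notifyAll()
            self.cond.release()

        def change_settings(self, change):
            self.cond.acquire()
            try:
                while self.active:   # wait for running ufuncs to finish
                    self.cond.wait()
                change()             # safe: no ufunc is touching the flags
            finally:
                self.cond.release()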

>>
>> It's not entirely plucked out of the air. As I recall, the decision 
>> was arrived at something like this:
>>
>>   1. Errors should never pass silently (unless explicitly silenced).
>>   2. Let's have everything raise by default.
>>   3. In practice this was no good, because you often wanted to look at
>>      the results and see where the problem was.
>>   4. OK, let's have everything warn.
>>   5. This almost worked, but underflow was almost never a real error,
>>      so everyone always overrode underflow. A default that you always
>>      need to override is not a good default.
>>   6. So, warn for everything except underflow. Ignore that.
>>
>> And that's where numarray is today. I and others have been using that 
>> error system happily for quite some time now. At least I haven't 
>> heard any complaints for quite a while.
>
>
> I can appreciate this choice, but I don't agree that errors should 
> never pass silently. 

You'll notice that we ended up with a slightly more nuanced choice. 
Besides, the full quote is important: "errors should never pass silently 
unless explicitly silenced". That's quite a bit different from a blanket 
"errors should never pass silently".

> The fact that people disagree about this is the reason for the error 
> handling.    

Yes. While I like the above defaults, if we have a reasonable approach, I 
can just set them at startup and forget about them. Let's try not to 
penalize me too much for that, though.
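
In other words, in my startup code I'd have something along these lines 
(using the keyword spellings from earlier in this thread, which may not 
be what we end up with):

    import numpy

    # numarray-style defaults: warn on everything except underflow,
    # which is ignored.
    numpy.seterr(dividebyzero='warn', overflow='warn',
                 underflow='ignore', invalid='warn')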


> Note that overflow is not detected everywhere for integers --- we have 
> to simulate the floating-point errors for them.  Only on integer 
> multiply is it detected.  Checking for it would slow down all other 
> integer arithmetic --- one solution, of course, is to have two 
> different integer additions (one that checks for overflow and another 
> that doesn't).

Or just document it and don't worry about it. If I'm doing integer 
arithmetic and I need overflow detection, I can generally cast to 
doubles and do my math there, casting back at the end as needed. This 
doesn't seem worth too much extra complication.

Is my floating point bias showing?
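
Roughly the dance I have in mind, for the record (just a sketch; the 
function name and the int32 threshold are only illustrative):

    import numpy

    def checked_int_multiply(a, b):
        # Do the arithmetic in float64, where the ordinary floating-point
        # machinery can flag problems, then check the range and cast back.
        result = (numpy.asarray(a, dtype=numpy.float64) *
                  numpy.asarray(b, dtype=numpy.float64))
        if numpy.any(numpy.abs(result) > 2**31 - 1):
            raise OverflowError("integer multiply overflowed int32")
        return result.astype(numpy.int32)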

> There is really a bit of work left here to do.


Yep. Looks like it, but nothing insurmountable.

-tim




