[Numpy-discussion] min() of array containing NaN
Andrew Dalke
dalke at dalkescientific.com
Thu Aug 14 15:07:20 EDT 2008
Anne Archibald:
> Sadly, it's not possible without extra overhead. Specifically: the
> NaN-ignorant implementation does a single comparison between each
> array element and a placeholder, and decides based on the result
which
> to keep.
Did my example code go through? The test for NaN only needs to be
done when a new min value is found, which will occur something like
O(log(n)) in a randomly distributed array.
(Here's the hand-waving. The first requires a NaN check. The
second has a 1/2 chance of being the new minimum. The third has
a 1/3 chance, etc. The sum of the harmonic series goes as O(ln(n)).)
This depends on a double inverting so the test for a new min value and
a test for NaN occur at the same time. Here's pseudocode:
best = array[0]
if isnan(best):
return best
for item in array[1:]:
if !(best <= item):
best = item
if isnan(best):
return best
return item
> If you're willing to do two tests, sure, but that's overhead (and
> probably comparable to an isnan).
In Python the extra inversion costs an extra PVM instruction. In C
by comparison the resulting assembly code for "best > item" and
"!(best <= item)" have identical lengths, with no real performance
difference.
There's no extra cost for doing the extra inversion in the common
case, and for large arrays the ratio of (NaN check) / (no check) -> 1.0.
> What do compilers' min builtins do with NaNs? This might well be
> faster than an if statement even in the absence of NaNs...
This comes from a g++ implementation of min:
/**
* @brief This does what you think it does.
* @param a A thing of arbitrary type.
* @param b Another thing of arbitrary type.
* @return The lesser of the parameters.
*
* This is the simple classic generic implementation. It will
work on
* temporary expressions, since they are only evaluated once,
unlike a
* preprocessor macro.
*/
template<typename _Tp>
inline const _Tp&
min(const _Tp& __a, const _Tp& __b)
{
// concept requirements
__glibcxx_function_requires(_LessThanComparableConcept<_Tp>)
//return __b < __a ? __b : __a;
if (__b < __a)
return __b;
return __a;
}
The isnan function another version of gcc uses a bunch of
#defs, leading to
static __inline__ int __inline_isnanf( float __x ) { return
__x != __x; }
static __inline__ int __inline_isnand( double __x )
{ return __x != __x; }
static __inline__ int __inline_isnan( long double __x )
{ return __x != __x; }
Andrew
dalke at dalkescientific.com
More information about the NumPy-Discussion
mailing list