Hi all,

I was thinking a bit more about the changes to reduce() that Todd proposed, and have some questions:

The problem that the output may not be able to hold the result of an operation is not unique to the reduce() method. For instance adding two arrays of type UInt can also give you the wrong answer:

>>> array(255, UInt8) + array(255, UInt8)
254

So, if this is a general problem, why should only the reduce method be enhanced to avoid this? If you implement this, should this capability not be supported more broadly than only by reduce(), for instance by universal functions such as 'add'? Would it not be unexpected for users that only reduce() provides such added functionality?

However, as Paul Dubois pointed out earlier, the original design philosophy of Numeric/numarray was to let the user deal with such problems himself and keep the package small and fast. This actually seems a sound decision, so would it not be better to avoid complicating numarray with this type of change and also leave reduce as it is?

Personally I don't have a need for the proposed changes to the reduce function. My original complaint that started the whole discussion was that the mean() and sum() array methods did not give the correct result in some cases. I still think they should return a correct double precision value, even if the universal functions may not. That could be achieved by a separate implementation that does not use the universal functions. I would be prepared to provide that implementation either to replace the mean and sum methods, or as a separate add-on.

Cheers, Peter
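A minimal sketch of the failure mode Peter describes, using modern NumPy as a stand-in for numarray (the numarray spellings of the same calls differ): forcing a reduction to accumulate in the array's own small integer type wraps around, while an explicitly wider accumulator, roughly what the proposed type parameter would expose, gives the exact result.

    import numpy as np

    # Illustrative stand-in only; NumPy is used here in place of numarray.
    # Wraparound in the binary operation, as in the example above:
    a = np.array(255, dtype=np.uint8)
    print(a + a)                                  # 254, not 510 (uint8 arithmetic is modulo 256)

    # The same problem is far more likely in a reduction, where the
    # running total grows with the length of the array:
    data = np.arange(1, 101, dtype=np.uint8)      # 1 + 2 + ... + 100 = 5050
    print(np.add.reduce(data, dtype=np.uint8))    # 186, i.e. 5050 % 256
    print(np.add.reduce(data, dtype=np.float64))  # 5050.0, exact in a wide accumulator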
1. Add a type parameter to sum which defaults to widest type.
2. Add a type parameter to reductions (and fix output type handling). Default is same-type as it is now. No major changes to C-code.
3. Add a WidestType(array) function (a rough sketch follows this list):

   Bool                            --> Bool
   Int8, Int16, Int32, Int64       --> Int64
   UInt8, UInt16, UInt32, UInt64   --> UInt64 (Int64 on win32)
   Float32, Float64                --> Float64
   Complex32, Complex64            --> Complex64
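One possible shape for such a function, sketched with modern NumPy dtypes standing in for numarray's type objects; the name widest_type and the kind-based lookup are illustrative, not the proposal's actual code.

    import numpy as np

    # Map each type "kind" to the widest member listed in the table above.
    # (numarray's Complex64 is double-precision complex, NumPy's complex128.)
    _WIDEST_BY_KIND = {
        'b': np.bool_,       # Bool                           --> Bool
        'i': np.int64,       # Int8, Int16, Int32, Int64      --> Int64
        'u': np.uint64,      # UInt8, UInt16, UInt32, UInt64  --> UInt64
        'f': np.float64,     # Float32, Float64               --> Float64
        'c': np.complex128,  # Complex32, Complex64           --> Complex64
    }

    def widest_type(arr):
        """Return the widest dtype of the same kind as arr's dtype (sketch only)."""
        return np.dtype(_WIDEST_BY_KIND[np.asarray(arr).dtype.kind])

Item 1 could then fall out naturally: sum(a) would use widest_type(a) as the default accumulator type unless the caller overrides it.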
The only thing this really leaves out is a higher-performance implementation of sum/mean, which Peter referred to a few times. Peter, if you want to write a specialized module, we'd be happy to put it in the add-ons package.
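A pure-Python sketch of what such an add-on could look like follows, again with NumPy standing in for numarray; the names sum_d and mean_d are invented for illustration, and a genuinely higher-performance version would presumably be written in C. The idea is simply to accumulate in double precision regardless of the storage type.

    import numpy as np

    def sum_d(arr):
        # Accumulate in double precision (double-precision complex for
        # complex input), independent of the type chosen for storage.
        arr = np.asarray(arr)
        acc = np.complex128 if arr.dtype.kind == 'c' else np.float64
        return np.add.reduce(arr.ravel(), dtype=acc)

    def mean_d(arr):
        arr = np.asarray(arr)
        return sum_d(arr) / arr.size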
Peter Verveer writes:
Hi all,
I was thinking a bit more about the changes to reduce() that Todd proposed, and have some questions:
The problem that the output may not be able to hold the result of an operation is not unique to the reduce() method. For instance adding two arrays of type UInt can also give you the wrong answer:
>>> array(255, UInt8) + array(255, UInt8)
254
So, if this is a general problem, why should only the reduce method be enhanced to avoid this? If you implement this, should this capability not be supported more broadly than only by reduce(), for instance by universal functions such as 'add'? Would it not be unexpected for users that only reduce() provides such added functionality?
Certainly true (and much more likely a problem for integer multiplication than addition). On the other hand, it is likely to be only an occasional problem for binary operations. With reductions, the risk that overflows will happen is severe. For example, for addition it is the difference between a+a for the normal operation and len(a)*a for the reduction. Arguably, reductions on Int8 and Int16 arrays are more likely to run into a problem than not.
However, as Paul Dubois pointed out earlier, the original design philosophy of Numeric/numarray was to let the user deal with such problems himself and keep the package small and fast. This actually seems a sound decision, so would it not be better to avoid complicating numarray with this type of change and also leave reduce as it is?
No, I'm inclined to change reductions because of the high potential for problems, particularly with ints. I don't think ufunc type handling needs to change, though. Todd believes that changing reduction behavior would not be difficult (though we will try to finish other work first before doing that). Changing reduction behavior is probably the easiest way of implementing the improved sum and mean functions. The only thing we need to determine is what the default behavior should be (Todd proposes the defaults remain the same; I'm not so sure).
Personally I don't have a need for the proposed changes to the reduce function. My original complaint that started the whole discussion was that the mean() and sum() array methods did not give the correct result in some cases. I still think they should return a correct double precision value, even if the universal functions may not. That could be achieved by a separate implementation that does not use the universal functions. I would be prepared to provide that implementation either to replace the mean and sum methods, or as a separate add-on.
On Thursday 04 September 2003 15:33, Perry Greenfield wrote:
So, if this is a general problem, why should only the reduce method be enhanced to avoid this? If you implement this, should this capability not be supported more broadly than only by reduce(), for instance by universal functions such as 'add'? Would it not be unexpected for users that only reduce() provides such added functionality?
Certainly true (and much more likely a problem for integer multiplication than addition). On the other hand, it is likely to be only an occasional problem for binary operations. With reductions, the risk that overflows will happen is severe. For example, for addition it is the difference between a+a for the normal operation and len(a)*a for the reduction. Arguably, reductions on Int8 and Int16 arrays are more likely to run into a problem than not.
That's true, but this argument really only holds for the integer types. For 32-bit floating point or complex types it will usually not be necessary to convert to 64-bit to prevent overflow. In that case it may often not be desirable to change the array type. I am not saying that the convert option would not be useful for the case of floats, but it is maybe an argument to keep the default behaviour, at least for Float32 and Complex32 types. Generally I do agree that there is no need to change the ufuncs; I did not want to suggest that this actually be implemented...
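A small illustration of the distinction Peter is drawing, once more with modern NumPy standing in for numarray: a single-precision float reduction is nowhere near overflow, but accuracy is a separate question, since a float32 running total stops advancing once it reaches 2**24; that is why a double-precision sum()/mean() is still attractive even if reduce() keeps a same-type default for floats.

    import numpy as np

    # Overflow is not the issue for Float32: a million values of 1000.0
    # sum to about 1e9, far below the float32 maximum of roughly 3.4e38.
    x = np.full(1_000_000, 1000.0, dtype=np.float32)
    print(np.add.reduce(x, dtype=np.float32))    # ~1e+09, no wraparound

    # Accuracy is a separate matter: at 2**24 the spacing between
    # consecutive float32 values becomes 2, so adding 1.0 is lost.
    total = np.float32(2.0 ** 24)
    print(total + np.float32(1.0))               # still 16777216.0
    print(np.float64(total) + 1.0)               # 16777217.0 in double precision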
However, as Paul Dubois pointed out earlier, the original design philosophy of Numeric/numarray was to let the user deal with such problems himself and keep the package small and fast. This actually seems a sound decision, so would it not be better to avoid complicating numarray with this type of change and also leave reduce as it is?
No, I'm inclined to change reductions because of the high potential for problems, particularly with ints. I don't think ufunc type handling needs to change, though. Todd believes that changing reduction behavior would not be difficult (though we will try to finish other work first before doing that). Changing reduction behavior is probably the easiest way of implementing the improved sum and mean functions. The only thing we need to determine is what the default behavior should be (Todd proposes the defaults remain the same; I'm not so sure).
This would solve my problem with mean() and sum(). I think these should certainly return the result in the optimal precision. They may then not be optimal in terms of speed, but certainly 'good enough'. I would like to second Todd's preference to keep the default behaviour of reductions the same as it is now. For reductions, I mostly want the result to be in the same type, because I chose that type in the first place for storage reasons.

Cheers, Peter
participants (2)

- Perry Greenfield
- Peter Verveer