What about a parameter that lets the user select the behaviour they want? It would select between uint, upcasted_int, -MAX and +MAX. This way, at least the behaviour will be documented, and users who care will have the choice. Personally, when the option is available, I would prefer the safe version, uint, but I understand that this is not everyone's position.

Frédéric Bastien

On Sat, Oct 15, 2011 at 3:00 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
Hi,
On Wed, Oct 12, 2011 at 8:31 AM, David Cournapeau <cournape@gmail.com> wrote:
On 10/12/11, "V. Armando Solé" <sole@esrf.fr> wrote:
On 12/10/2011 10:46, David Cournapeau wrote:
On Wed, Oct 12, 2011 at 9:18 AM, "V. Armando Solé" wrote:
From a pure user perspective, I would not expect the abs function to return a negative number. Returning +127 plus a warning the first time that happens seems to me a good compromise.
I guess the question is what's the common context to use small integers in the first place. If it is to save memory, then upcasting may not be the best solution. I may be wrong, but if you decide to use those types in the first place, you need to know about overflows. Abs is just one of them (dividing by -1 is another, although this one actually raises an exception).
Detecting it may be costly, but this would need benchmarking.
That being said, without context, I don't find 127 a better solution than -128.
Well that choice is just based on getting the closest positive number to the true value (128). The context can be anything, for instance you could be using a look up table based on the result of an integer operation ...
In terms of cost, one would need to evaluate something like:
a = abs(x);
if (a < 0) { a = MAX_INT; }  /* abs(MIN_INT) overflowed; return the closest positive value */
return a;
Yes, this is costly: it adds a branch to a trivial operation. I did some preliminary benchmarks (would need confirmation when I have more than one minute to spend on this):
int8, 2**16 long array. Before check: 16 us. After check: 92 us. 5-6 times slower.
int8, 2**24 long array. Before check: 20 ms. After check: 30 ms. 30 % slower.
There is also the issue of signaling the error in the ufunc machinery. I forgot whether this is possible at that level.
I suppose that returning the equivalent uint type would be of zero cost though?
I don't think the problem should be relegated to 'people should know about this', because this is a problem for any signed integer type, and it can lead to nasty errors which people are unlikely to test for.
See you,
Matthew
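
For reference, the four behaviours named in the top reply (uint, upcasted_int, -MAX and +MAX) can be sketched for a single int8 value in plain Python. This is only an illustration under stated assumptions: the function name abs_int8, its mode parameter and the mode names are hypothetical and are not part of NumPy, and two's-complement storage is emulated with the standard ctypes module rather than with NumPy arrays.

```python
import ctypes

INT8_MIN, INT8_MAX = -128, 127


def abs_int8(x, mode="uint"):
    """Hypothetical abs for one int8 value; `mode` picks the overflow policy.

    "wrap"     : store the result back in int8, so abs(-128) stays -128 ("-MAX")
    "saturate" : clip to the largest int8, so abs(-128) becomes 127 ("+MAX")
    "uint"     : return the result as an unsigned 8-bit value (0..255)
    "upcast"   : return the result in a wider signed type (int16 here)
    """
    if not INT8_MIN <= x <= INT8_MAX:
        raise ValueError("x does not fit in int8")
    true_abs = abs(x)  # Python ints never overflow, so this is exact
    if mode == "wrap":
        # ctypes does no overflow checking: 128 wraps around to -128
        return ctypes.c_int8(true_abs).value
    if mode == "saturate":
        return min(true_abs, INT8_MAX)
    if mode == "uint":
        return ctypes.c_uint8(true_abs).value
    if mode == "upcast":
        return ctypes.c_int16(true_abs).value
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("wrap", "saturate", "uint", "upcast"):
    print(mode, abs_int8(-128, mode))
```

Only the value -128 distinguishes the modes; every other int8 input gives the same answer in all four. The "uint" and "upcast" modes both return the true value 128, and differ only in the type (and hence memory footprint) of the result.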