Hi, I just learn about the existence of round(). Is the following exposing a bug?
N.__version__ '1.0b4' N.array([65.0]) [ 65.] _.round(-1) [ 60.] round(65, -1) 70.0 N.array([65]) [65] _.round(-1) [60] N.array([66]) [66] _.round(-1) [60] N.array([66.2]) [ 66.2] _.round(-1) [ 70.]
65 should round *up* to 70. (when decimals is given as -1) In numpy even 66 rounds down, but 66.2 rounds up ... - Sebastian Haase
On 9/3/06, Sebastian Haase <haase@msg.ucsf.edu> wrote:
Hi, I just learn about the existence of round(). Is the following exposing a bug?
N.__version__ '1.0b4' N.array([65.0]) [ 65.] _.round(-1) [ 60.] round(65, -1) 70.0 N.array([65]) [65] _.round(-1) [60] N.array([66]) [66] _.round(-1) [60] N.array([66.2]) [ 66.2] _.round(-1) [ 70.]
65 should round *up* to 70. (when decimals is given as -1) In numpy even 66 rounds down, but 66.2 rounds up ...
Numpy rounds to even.
around(arange(10)*.5) array([ 0., 0., 1., 2., 2., 2., 3., 4., 4., 4.])
i.e., when at those x.5 boundaries, round to the nearest even number. Always rounding in the same direction biases the numbers when you have gazillions of them. This is most likely of import when the fpu is rounding intermediate results to register length, but numpy does it too. Floats can be weird anyway, .15, for instance, can't be represented exactly as an ieee float. It does tend to throw folks for a loop when they don't keep this in mind. - Sebastian Haase Chuck
Charles R Harris wrote:
On 9/3/06, *Sebastian Haase* <haase@msg.ucsf.edu <mailto:haase@msg.ucsf.edu>> wrote:
Hi, I just learn about the existence of round(). Is the following exposing a bug? >>> N.__version__ '1.0b4' >>> N.array([65.0]) [ 65.] >>> _.round(-1) [ 60.] >>> round(65, -1) 70.0 >>> N.array([65]) [65] >>> _.round(-1) [60] >>> N.array([66]) [66] >>> _.round(-1) [60] >>> N.array([66.2]) [ 66.2] >>> _.round(-1) [ 70.]
65 should round *up* to 70. (when decimals is given as -1) In numpy even 66 rounds down, but 66.2 rounds up ...
Numpy rounds to even.
around(arange(10)*.5) array([ 0., 0., 1., 2., 2., 2., 3., 4., 4., 4.])
i.e., when at those x.5 boundaries, round to the nearest even number. Always rounding in the same direction biases the numbers when you have gazillions of them. This is most likely of import when the fpu is rounding intermediate results to register length, but numpy does it too. Floats can be weird anyway, .15, for instance, can't be represented exactly as an ieee float. It does tend to throw folks for a loop when they don't keep this in mind.
- Sebastian Haase
Chuck
Thanks for the reply - but read what the doc says:
N.around.__doc__ 'Round 'a' to the given number of decimal places. Rounding behaviour is equivalent to Python.
Return 'a' if the array is not floating point. Round both the real and imaginary parts separately if the array is complex. ' it is *not* done this way in Python:
round(.5) 1.0 round(1.5) 2.0 round(2.5) 3.0
( the round obj. method is missing this doc string ) I really think we should stick to what the doc string say - everybody expects x.5 to round up !! look even at this:
repr(2.05) '2.0499999999999998' round(2.05, 1) 2.1
(don't ask me how Python does this, but it's "nice" ) Thanks, Sebastian
On 04/09/06, Sebastian Haase <haase@msg.ucsf.edu> wrote:
Thanks for the reply - but read what the doc says:
N.around.__doc__ 'Round 'a' to the given number of decimal places. Rounding behaviour is equivalent to Python.
Return 'a' if the array is not floating point. Round both the real and imaginary parts separately if the array is complex. '
it is *not* done this way in Python:
round(.5) 1.0 round(1.5) 2.0 round(2.5) 3.0
( the round obj. method is missing this doc string )
Um, well, the doc is wrong. Numpy should *not* always follow python's lead, and in partciular it's explicit that Numeric's floating point is closer to IEEE floating-point behaviour than python's - compare, for example 1./0. and array(1.)/0.
I really think we should stick to what the doc string say - everybody expects x.5 to round up !!
Not everybody. This (admittedly odd) behaviour wasn't decided on because it was more efficient to implement, it was decided on because a group of very smart numerical analysts agreed that it was the best way to avoid surprising naive users with biased results. (Non-naive users can normally pick from any of several other rounding modes if they want.) A better question to ask is, "Can I change numpy's rounding behaviour for my programs?" (And, more generally, "can I set all the various floating-point options that the IEEE standard and my processor both support?") I don't know the answer to that one, but it does seem to be a goal that numpy is trying for (hence, for example, the difference between numpy float scalars and python floats). A. M. Archibald
On Mon, 4 Sep 2006 01:53:40 -0400 "A. M. Archibald" <peridot.faceted@gmail.com> wrote:
A better question to ask is, "Can I change numpy's rounding behaviour for my programs?" (And, more generally, "can I set all the various floating-point options that the IEEE standard and my processor both support?") I don't know the answer to that one, but it does seem to be a goal that numpy is trying for (hence, for example, the difference between numpy float scalars and python floats).
(looks) I think we've missed rounding. And the inexact flag. You can control how overflow, underflow, divide-by-0, and invalid-operand are handled using geterr and seterr, though. Unfortunately, getting/setting the rounding mode is hard to do in a platform-independent way :( gcc has fegetround() and fesetround() (they're part of the C99 standard, I believe). I don't know what to use on other machines (esp. Windows with MSVC), although the Boost Interval library looks like a good place to start: http://boost.cvs.sourceforge.net/boost/boost/boost/numeric/interval/detail/ -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm@physics.mcmaster.ca
Interesting, I was just reading about the round rule in IEEE standard last Friday. What numpy's "around" function does is called "round to even" (round is take to make the next digit even), instead of "round up". According to "What every computer scientist should know about floating-point arithmetic" (do a Google search), the reason to prefer "round to even" is exemplified by the following result from Reiser and Knuth: Theorem Let x and y be floating-point numbers, and define x0 = x, x1 = (x0 - y) + y, ..., xn = (xn-1 - y) + y. If + and - are exactly rounded using round to even, then either xn = x for all n or xn = x1 for all n >= 1. If you use "round up" the sequence xn can start increasing slowly "for ever", and as the paper says: "...This example suggests that when using the round up rule, computations can gradually drift upward, whereas when using round to even the theorem says this cannot happen." Best, Paulo
On 9/4/06, Paulo J. S. Silva <pjssilva@ime.usp.br> wrote:
Interesting, I was just reading about the round rule in IEEE standard last Friday.
What numpy's "around" function does is called "round to even" (round is take to make the next digit even), instead of "round up". According to "What every computer scientist should know about floating-point arithmetic" (do a Google search), the reason to prefer "round to even" is exemplified by the following result from Reiser and Knuth:
Theorem
Let x and y be floating-point numbers, and define x0 = x, x1 = (x0 - y) + y, ..., xn = (xn-1 - y) + y. If + and - are exactly rounded using round to even, then either xn = x for all n or xn = x1 for all n >= 1.
If you use "round up" the sequence xn can start increasing slowly "for ever", and as the paper says:
"...This example suggests that when using the round up rule, computations can gradually drift upward, whereas when using round to even the theorem says this cannot happen."
Best,
Paulo
However, around does set the sign bit when rounding -.5 to zero:
around(-.5).tostring() '\x00\x00\x00\x00\x00\x00\x00\x80'
so that
around(-.5) -0.0
on the other hand
around(-.5) == 0.0 True
I didn't know -0.0 was part of the IEEE spec. Chuck
Once again, the information that singed zero is part of IEEE standard is in the paper I cited in my last message. It is very important to be able to compute the sign of an overflowed quantity in expressions like 1/x when x goes to zero. Best, Paulo
Paulo J. S. Silva wrote:
Once again, the information that singed zero is part of IEEE standard is in the paper I cited in my last message.
It is very important to be able to compute the sign of an overflowed quantity in expressions like 1/x when x goes to zero.
Best,
Paulo Hi,
This is all very interesting ( and confusing (to me, maybe others) at the same time ...) ... Question 0: Are you sure this is not a bug ?
N.array([66]).round(-1) [60] N.array([66.2]).round(-1) [ 70.]
Question 1: Was this way of round already in Numeric and /or in numarray ? Question 2: Does this need to be better documented (complete and corrected(!) docstrings - maybe a dedicated wiki page ) !? This is related to "How does Matlab or IDL or others do rounding ?" - This would at least determine how many people would be surprised by this. (I would say that *if* Matlab rounds the same way, we might need less documentation ...) Thanks, Sebastian Haase
On 9/4/06, Sebastian Haase <haase@msg.ucsf.edu> wrote:
Paulo J. S. Silva wrote:
Once again, the information that singed zero is part of IEEE standard is in the paper I cited in my last message.
It is very important to be able to compute the sign of an overflowed quantity in expressions like 1/x when x goes to zero.
Best,
Paulo Hi,
This is all very interesting ( and confusing (to me, maybe others) at the same time ...) ...
Question 0: Are you sure this is not a bug ?
N.array([66]).round(-1) [60]
That does look like a bug.
array([66]).round(-1) array([60]) array([66.]).round(-1) array([ 70.])
I suspect it is related to the integer data type of the first example. The code probably does something like this: round(66/10)*10 and 66/10 == 6 in integer arithmetic. Chuck
N.array([66.2]).round(-1) [ 70.]
Question 1: Was this way of round already in Numeric and /or in numarray ?
Don't recall. Question 2: Does this need to be better documented (complete and
corrected(!) docstrings - maybe a dedicated wiki page ) !? This is related to "How does Matlab or IDL or others do rounding ?" - This would at least determine how many people would be surprised by this. (I would say that *if* Matlab rounds the same way, we might need less documentation ...)
I have improved the document strings and the example at scipy.org. Chuck
On 9/4/06, Charles R Harris <charlesr.harris@gmail.com> wrote:
On 9/4/06, Sebastian Haase <haase@msg.ucsf.edu> wrote:
Paulo J. S. Silva wrote:
Once again, the information that singed zero is part of IEEE standard is in the paper I cited in my last message.
It is very important to be able to compute the sign of an overflowed quantity in expressions like 1/x when x goes to zero.
Best,
Paulo Hi,
This is all very interesting ( and confusing (to me, maybe others) at the same time ...) ...
Question 0: Are you sure this is not a bug ?
N.array([66]).round(-1) [60]
That does look like a bug.
array([66]).round(-1) array([60]) array([66.]).round(-1) array([ 70.])
I suspect it is related to the integer data type of the first example. The code probably does something like this:
round(66/10)*10
and 66/10 == 6 in integer arithmetic.
The problem is somewhere in this bit of code: // f will be a double f = PyFloat_FromDouble(power_of_ten(decimals)); if (f==NULL) return NULL; // op1 will be division ret = PyObject_CallFunction(op1, "OOO", a, f, out); if (ret==NULL) goto finish; // round in place. tmp = PyObject_CallFunction(n_ops.rint, "OO", ret, ret); if (tmp == NULL) {Py_DECREF(ret); ret=NULL; goto finish;} I suspect the problem is the division, which has integer 'a', double 'f', and integer 'out'. Integer out is probably the killer. Let's see: # use doubles for output
out = zeros(1) array([66]).round(decimals=-1, out=out) array([ 70.])
yep, that's the problem Chuck. <snip>
Ops, I believe you were caught by int versus float here: In [16]:around(66.0, decimals=-1) Out[16]:70.0 In [17]:around(66, decimals=-1) Out[17]:60 Note that in the int case, it seems like round is simply truncation. Paulo
participants (5)
-
A. M. Archibald
-
Charles R Harris
-
David M. Cooke
-
Paulo J. S. Silva
-
Sebastian Haase