Decimal <-> float comparisons in py3k.

Hello all,

Currently in Python 2.x, Decimal-to-float comparisons behave as follows:
>>> Decimal(1) < float(4)
False
>>> Decimal(4) < float(1)
False
That is, any float sorts before any Decimal (though this is arbitrary: it's possible that on some implementations any Decimal sorts before any float). This causes (a) confusion, and (b) bugs, especially when floats and Decimals are accidentally combined in a program. There probably aren't too many legitimate reasons for deliberately mixing floats and Decimals in the same calculation, so preventing accidents is the main concern here.

In Python 3.x, however, such comparisons raise a TypeError ("unorderable types: Decimal() < float()").

http://bugs.python.org/issue2531 ('float compared to decimal is silently incorrect') was opened for this a while ago.

I'm planning to commit a change to trunk that changes the behaviour so that the comparison result is based on the respective values of the arguments (so the results above would be True and False respectively).

Question for python-dev people (and the point of this email): should this change be forward-ported to py3k?

On the one hand there's something to be said for maintaining a clean separation between the float and Decimal types, allowing only explicit conversions from one to the other; mixed-type arithmetic between floats and Decimals was very deliberately not permitted in the original PEP, and that's unlikely to change in a hurry. On the other hand, there's value in keeping 2.x and 3.x aligned where possible for the sake of 2-to-3 porters, and the new behaviour may even be useful. Even with the TypeError above, there are still some py3k surprises arising from the ability to compare ints and Decimals, and ints and floats, but not floats and Decimals.

A quick tour of some of these surprises, in trunk:
>>> from decimal import Decimal
>>> Decimal(1) < 2 < float(3) < Decimal(1)   # < is non-transitive
True
>>> Decimal(1) == 1 == float(1)              # so is equality
True
>>> Decimal(1) == float(1)
False
>>> d1, f1, i1 = Decimal(1), float(1), 1
>>> set([d1, i1, f1]) == set([f1, i1, d1])   # sets with the same elements are different
False
>>> sorted([d1, i1, f1]) == sorted([f1, i1, d1])
False
and in py3k:
>>> from decimal import Decimal
>>> Decimal(1) < 2 < float(3) < Decimal(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: float() < Decimal()
>>> Decimal(1) == 1 == float(1)
True
>>> Decimal(1) == float(1)
False
>>> d1, f1, i1 = Decimal(1), float(1), 1
>>> set([d1, i1, f1]) == set([f1, i1, d1])
False
>>> sorted([Decimal(1), 2, float(3)])
[Decimal('1'), 2, 3.0]
>>> sorted([2, Decimal(1), float(3)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: float() < Decimal()
>>> sorted([float(3), 2, Decimal(1)])
[Decimal('1'), 2, 3.0]
By the way, even with the patch there are still problems with other numeric types: comparisons or set operations involving both Fraction and Decimal instances are going to cause similar difficulties to those above. In practice I think this is much less of an issue than the float/Decimal problem, since the chance of accidentally combining Fraction and Decimal types in a calculation seems significantly smaller than the chance of accidentally combining float and Decimal types. -- Mark

On Tue, Mar 16, 2010 at 9:41 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On the one hand there's something to be said for maintaining a clean separation between the float and Decimal types, allowing only explicit conversions from one to the other; mixed-type arithmetic between floats and Decimals was very deliberately not permitted in the original PEP, and that's unlikely to change in a hurry. On the other
But, to be fair, we didn't have "true value of the float" at that time.

I'm +0 on allowing these comparisons, with "Decimal(1) < .3" meaning the same as "Decimal(1) < Decimal.from_float(.3)".

--
. Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

On Tue, Mar 16, 2010 at 2:58 PM, Facundo Batista <facundobatista@gmail.com> wrote:
On Tue, Mar 16, 2010 at 9:41 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On the one hand there's something to be said for maintaining a clean separation between the float and Decimal types, allowing only explicit conversions from one to the other; mixed-type arithmetic between floats and Decimals was very deliberately not permitted in the original PEP, and that's unlikely to change in a hurry. On the other
But, to be fair, we didn't have "true value of the float" at that time.
That's true. I'd still be reluctant to start supporting operations like Decimal('1.2') + 0.71, though. At least comparisons have the nice feature that the return type is unambiguous.
I'm +0 on allowing these comparisons, with "Decimal(1) < .3" meaning the same as "Decimal(1) < Decimal.from_float(.3)"
Yes, I should have clarified that those are exactly the semantics I'm proposing. Thanks! Mark
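To make the agreed semantics concrete, here is a minimal sketch of what value-based comparison means -- an illustration only, assuming the Decimal.from_float classmethod that 2.7 and 3.1 provide, not the actual decimal-module patch:

# The float operand is converted exactly -- no rounding occurs -- and
# the two values are then compared as Decimals.
from decimal import Decimal

print(Decimal(1) < Decimal.from_float(.3))        # False: 1 is not below ~0.3
print(Decimal("0.3") == Decimal.from_float(.3))   # False: binary 0.3 != decimal 0.3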

I'd say if you're not going to forward-port this to Python 3, it shouldn't go into Python 2 -- in that case it would make more sense to me to back-port the exception-raising behavior. Also supporting comparisons but not other mixed operations is going to be confusing. If you are sticking to that behavior I think mixed comparisons should also be ruled out.
-- --Guido van Rossum (python.org/~guido)

On Tue, Mar 16, 2010 at 4:41 PM, Guido van Rossum <guido@python.org> wrote:
I'd say if you're not going to forward-port this to Python 3, it shouldn't go into Python 2 -- in that case it would make more sense to me to back-port the exception-raising behavior.
That's also a possible solution, and the one that I'd personally be happiest with. The main problem is that this has the potential to break code: lists containing both floats and Decimals are sortable in 2.6, but would no longer be sortable in 2.7. If such breakage is deemed acceptable then I'd happily backport the exception; I really don't have a good feeling for how much real-world code could break, if any.
Also supporting comparisons but not other mixed operations is going to be confusing. If you are sticking to that behavior I think mixed comparisons should also be ruled out.
Confusing, yes, but at least not bug-prone. The current 2.x behaviour has provoked complaints from a number of people in various fora (I recently saw this come up on StackOverflow), and after initially being skeptical I'm now convinced that it would be a good idea to change it if at all possible. Mark

On Tue, Mar 16, 2010 at 9:07 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Tue, Mar 16, 2010 at 4:41 PM, Guido van Rossum <guido@python.org> wrote:
I'd say if you're not going to forward-port this to Python 3, it shouldn't go into Python 2 -- in that case it would make more sense to me to back-port the exception-raising behavior.
That's also a possible solution, and the one that I'd personally be happiest with. The main problem is that this has the potential to break code: lists containing both floats and Decimals are sortable in 2.6, but would no longer be sortable in 2.7. If such breakage is deemed acceptable then I'd happily backport the exception; I really don't have a good feeling for how much real-world code could break, if any.
Definitely some. Stricter comparison rules are a frequent cause of problems when code is first ported to 3.x. While you'd think that code comparing a float and a Decimal is *already* broken, there's a surprising number of situations where that's not necessarily the case, e.g. when an arbitrary but stable ordering is needed.
Also supporting comparisons but not other mixed operations is going to be confusing. If you are sticking to that behavior I think mixed comparisons should also be ruled out.
Confusing, yes, but at least not bug-prone. The current 2.x behaviour has provoked complaints from a number of people in various fora (I recently saw this come up on StackOverflow), and after initially being skeptical I'm now convinced that it would be a good idea to change it if at all possible.
Yeah, it should have raised an exception all along. But it's too late for that now. I wonder if it should just become a py3k warning? -- --Guido van Rossum (python.org/~guido)

On Tue, Mar 16, 2010 at 5:15 PM, Guido van Rossum <guido@python.org> wrote:
Definitely some. Stricter comparison rules are a frequent cause of problems when code is first ported to 3.x. While you'd think that code comparing a float and a Decimal is *already* broken, there's a surprising number of situations where that's not necessarily the case, e.g. when an arbitrary but stable ordering is needed.
Hmm. Okay. It seems like backporting the exception isn't a real option, then. A nitpick: the current 2.x behaviour fails to give an arbitrary but stable ordering; it merely gives an arbitrary ordering:
>>> sorted([Decimal(1), 2, 3.0])
[Decimal('1'), 2, 3.0]
>>> sorted([2, Decimal(1), 3.0])
[3.0, Decimal('1'), 2]
So long as your list contains only floats and Decimals you're fine, but for a list containing floats, integers and Decimals the ordering is no longer stable: the problem, of course, being that int <-> float and int <-> Decimal comparisons use a rule (compare by numeric value) that's not compatible with the way that floats and Decimals are compared. This seems like yet another possible cause of subtle bugs, and again would be fixed by the proposed change in behaviour.

On the other hand, I've not seen any reports of anyone encountering this in real life.

Mark

On Wed, Mar 17, 2010 at 8:04 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Tue, Mar 16, 2010 at 5:15 PM, Guido van Rossum <guido@python.org> wrote:
Definitely some. Stricter comparison rules are a frequent cause of problems when code is first ported to 3.x. While you'd think that code comparing a float and a Decimal is *already* broken, there's a surprising number of situations where that's not necessarily the case, e.g. when an arbitrary but stable ordering is needed.
Hmm. Okay. It seems like backporting the exception isn't a real option, then.
A nitpick: the current 2.x behaviour fails to give an arbitrary but stable ordering; it merely gives an arbitrary ordering:
>>> sorted([Decimal(1), 2, 3.0])
[Decimal('1'), 2, 3.0]
>>> sorted([2, Decimal(1), 3.0])
[3.0, Decimal('1'), 2]
So long as your list contains only floats and Decimals you're fine, but for a list containing floats, integers and Decimals the ordering is no longer stable: the problem, of course, being that int <-> float and int <-> Decimal comparisons use a rule (compare by numeric value) that's not compatible with the way that floats and Decimals are compared. This seems like yet another possible cause of subtle bugs, and again would be fixed by the proposed change in behaviour. On the other hand, I've not seen any reports of anyone encountering this in real life.
Ok, I'll try to stay out of the discussion of which solution is best for our users, and if the outcome is that mixed operations in general are bad but mixed comparisons are good, I'll trust you. However I want to reiterate that you really shouldn't improve the situation for 2.7 unless you also forward-port the solution to 3.x. -- --Guido van Rossum (python.org/~guido)

On 3/17/2010 1:09 PM, Guido van Rossum wrote:
Ok, I'll try to stay out of the discussion of which solution is best for our users, and if the outcome is that mixed operations in general are bad but mixed comparisons are good, I'll trust you. However I want to reiterate that you really shouldn't improve the situation for 2.7 unless you also forward-port the solution to 3.x.
I agree. Improving 2.7 and not 3.2+ would give people a reason to not move to 3.x. tjr

On Mar 17, 2010, at 12:34 PM, Terry Reedy wrote:
I agree. Improving 2.7 and not 3.2+ would give people a reason to not move to 3.x.
FWIW, I think this is mischaracterizing the proposal. The spectrum of options from worst to best is: 1) compare but give the wrong answer, 2) compare but give the right answer, 3) refuse to compare. Py3.x is already in the best position: it refuses to compare. IOW, it already is as improved as it can get. Py2.6 is in the worst position. The proposal is to make it better, but not as good as 3.x. Raymond

On Thu, 18 Mar 2010 07:44:21 am Raymond Hettinger wrote:
The spectrum of options from worst to best is: 1) compare but give the wrong answer, 2) compare but give the right answer, 3) refuse to compare.
Why is 3 the best? If there is a right answer to give, surely giving the right answer is better than not? -- Steven D'Aprano

Steven D'Aprano wrote:
On Thu, 18 Mar 2010 07:44:21 am Raymond Hettinger wrote:
The spectrum of options from worst to best is: 1) compare but give the wrong answer, 2) compare but give the right answer, 3) refuse to compare.
Why is 3 the best? If there is a right answer to give, surely giving the right answer is better than not?
I agree with Steven here - for mixed *arithmetic* refusing to get involved is a reasonable choice, because the whole point of Decimals is to get the answers according to base 10 expectations. Allowing implicit conversion of base 2 floats puts that foundation at risk. In what way do comparisons carry the same risk? Decimal.from_float shows that a perfectly adequate mapping from float into the Decimal space can be made, and the comparisons have a clear well-defined answer. It may be slightly confusing to those not familiar with the subtleties of limited precision base 2 vs larger precision base 10, but the boolean result places them in a clearly different category to my mind. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
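A quick illustration of the mapping Nick refers to -- a sketch assuming Decimal.from_float (added in 2.7/3.1):

# from_float maps a binary float to the Decimal denoting exactly the
# same real number, so mixed comparisons have a well-defined answer.
from decimal import Decimal

print(Decimal.from_float(0.5) == Decimal("0.5"))   # True: 0.5 is exact in binary
print(Decimal.from_float(0.1) == Decimal("0.1"))   # False: binary 0.1 is slightly high
print(Decimal.from_float(0.1))
# Decimal('0.1000000000000000055511151231257827021181583404541015625')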

On Mar 17, 2010, at 1:59 PM, Steven D'Aprano wrote:
On Thu, 18 Mar 2010 07:44:21 am Raymond Hettinger wrote:
The spectrum of options from worst to best is: 1) compare but give the wrong answer, 2) compare but give the right answer, 3) refuse to compare.
Why is 3 the best? If there is a right answer to give, surely giving the right answer is better than not?
From the early days of the decimal module, we've thought that mixed float-decimal operations are 1) a bit perilous and 2) have few, if any, good use cases. Accordingly, any mixed operations should be explicit rather than implicit:

    Decimal('1.1') + Decimal.from_float(2.2)

is better than:

    Decimal('1.1') + 2.2

To help the user avoid confusion, we flag the latter with a TypeError: unsupported operand type(s) for +: 'Decimal' and 'float'. Unfortunately, in Py2.x, implicit mixed comparisons do not raise an exception, and instead will silently fail by giving an incorrect answer:

>>> Decimal('1.1') < 2.2
False

IMO, raising the exception is the right thing to do. Short of that though, if we're going to give a result, it should at least be a correct one.

Raymond

New zen:
* right answers are better than wrong
* but ill-formed questions are best not answered at all

On Thu, 18 Mar 2010 08:58:25 am Raymond Hettinger wrote:
On Mar 17, 2010, at 1:59 PM, Steven D'Aprano wrote:
On Thu, 18 Mar 2010 07:44:21 am Raymond Hettinger wrote:
The spectrum of options from worst to best is: 1) compare but give the wrong answer, 2) compare but give the right answer, 3) refuse to compare.
Why is 3 the best? If there is a right answer to give, surely giving the right answer is better than not?
From the early days of the decimal module, we've thought that mixed float-decimal operations are 1) a bit perilous and 2) have few, if any good use cases.
When it comes to *arithmetic* operations, I agree. Is there anyone on python-dev willing to argue the case for allowing implicit mixed float/Decimal arithmetic operations? The arguments in the PEP seem pretty convincing to me, and I'm not suggesting we change that.

But comparison operations are different. For starters, you don't need to worry about whether to return a float or a Decimal, because you always get a bool. In theory, both Decimals and floats are representations of the same underlying thing, namely real numbers, and it seems strange to me that I can't ask whether two such real numbers are equal just because their storage implementation is different.

I can see three reasonable reasons for avoiding mixed comparisons:

(1) To avoid confusing float-naive users (but they're confused by pure float comparisons too).

(2) To avoid mixed arithmetic operations (but comparisons aren't arithmetic).

(3) If Decimals and floats compare equal, they must hash equal, and currently they don't (but Mark Dickinson thinks he has a solution for that).
Accordingly, any mixed operations should be explicit rather than implicit:
Decimal('1.1') + Decimal.from_float(2.2)
is better than:
Decimal('1.1') + 2.2
Agreed. The user should explicitly choose whether they want a float answer or a Decimal answer.
To help the user avoid confusion, we flag the latter with a TypeError: unsupported operand type(s) for +: 'Decimal' and 'float'.
Unfortunately, in Py2.x, implicit mixed comparisons do not raise an exception, and instead will silently fail by giving
an incorrect answer:

>>> Decimal('1.1') < 2.2
False
That is clearly the wrong thing to do. Do you envisage any problems from allowing this instead?
>>> Decimal('1.1') < 2.2
True
IMO, raising the exception is the right thing to do. Short of that though, if we're going to give a result, it should at least be a correct one.
+1 -- Steven D'Aprano
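Steven's point (3) is the technical constraint underlying the whole question. A small sketch with a deliberately broken toy class (illustrative only; not related to the actual Decimal patch) shows why objects that compare equal must also hash equal:

# BadInt compares equal to an int but hashes differently, so set and
# dict lookups silently give the wrong answer.
class BadInt(object):
    def __init__(self, n):
        self.n = n
    def __eq__(self, other):
        return self.n == other        # equal to the plain int...
    def __hash__(self):
        return hash(self.n) + 1       # ...but hashed into a different bucket

print(BadInt(1) == 1)           # True
print(BadInt(1) in set([1]))    # False: the lookup probes the wrong bucket

This is exactly why making Decimal == float value-based also requires fixing the hash of one type or the other.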

On Mar 18, 2010, at 5:23 AM, Steven D'Aprano wrote:
On Thu, 18 Mar 2010 08:58:25 am Raymond Hettinger wrote:
On Mar 17, 2010, at 1:59 PM, Steven D'Aprano wrote:
On Thu, 18 Mar 2010 07:44:21 am Raymond Hettinger wrote:
The spectrum of options from worst to best is: 1) compare but give the wrong answer, 2) compare but give the right answer, 3) refuse to compare.
Why is 3 the best? If there is a right answer to give, surely giving the right answer is better than not?
From the early days of the decimal module, we've thought that mixed float-decimal operations are 1) a bit perilous and 2) have few, if any good use cases.
When it comes to *arithmetic* operations, I agree. Is there anyone on python-dev willing to argue the case for allowing implicit mixed float/Decimal arithmetic operations? The arguments in the PEP seem pretty convincing to me, and I'm not suggesting we change that.
But comparison operations are different. For starters, you don't need to worry about whether to return a float or a Decimal, because you always get a bool. In theory, both Decimals and floats are representations of the same underlying thing, namely real numbers, and it seems strange to me that I can't ask whether two such real numbers are equal just because their storage implementation is different.
I can see three reasonable reasons for avoiding mixed comparisons:
(1) To avoid confusing float-naive users (but they're confused by pure float comparisons too).
(2) To avoid mixed arithmetic operations (but comparisons aren't arithmetic).
(3) If Decimals and floats compare equal, they must hash equal, and currently they don't (but Mark Dickinson thinks he has a solution for that).
Thanks for the thoughtful post.

The question is which behavior is most useful, most of the time. If a person truly wants to do mixed comparisons, it's trivially easy to do so in a way that is explicit:

    somedecimal < Decimal.from_float(somefloat)

My thought is that intentional mixed compares of float and decimal are very rare relative to unintentional cases. IOW, most of the time that x<y makes a float/decimal comparison, it is actually an error (or the user simply doesn't understand what his or her code is actually doing). That user is best served by refusing the temptation to guess that they really wanted to go down this path.

The zen of this particular problem:

* explicit is already easy
* implicit is likely to let errors pass silently
* therefore, explicit is better than implicit

The special case of really wanting a float/decimal comparison is rare enough that it isn't worth making the operation implicit, especially when the explicit option is so easy.

Do you want protection from probable errors, or do you have compelling use cases for implicit conversion behavior where an explicit conversion would have been too much of a burden?

Raymond

On Thu, Mar 18, 2010 at 5:55 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
My thought is that intentional mixed compares of float and decimal are very rare relative to unintentional cases. IOW, most of the time that x<y makes a float/decimal comparison, it is actually an error (or the user simply doesn't understand what his or her code is actually doing). That user is best served by refusing the temptation to guess that they really wanted to go down this path.
In this case, could we consider again the idea of making these comparisons produce TypeErrors in 2.7? That's what's been requested or proposed multiple times on the tracker issue (by Imri Goldberg [1], Jeremy Dunck [2], Bert Hughes [3] and Stefan Krah [4]), and also by participants in this discussion (Greg Ewing).

I'm only seeing two arguments against this at the moment:

(1) It has the potential to break code that relies on being able to sort heterogeneous lists. But given that heterogeneous lists can't be sorted stably anyway (see my earlier post about lists containing ints, floats and Decimals), perhaps this is an acceptable risk.

(2) A few of the posters here (Steven, Nick, and me) feel that it's slightly more natural to allow these comparisons; but I think the argument's fairly evenly balanced at the moment between those who'd prefer an exception and those who'd prefer to allow the comparisons.

I'd really like to get this sorted for 2.7: as far as I'm concerned, either of the proposed behaviours (raise an exception, or allow comparisons) would be an improvement on the current 2.7 behaviour. Could everyone live with making float<->Decimal comparisons raise an exception in 2.7?

Mark

[1] http://bugs.python.org/issue2531#msg83691
[2] http://bugs.python.org/issue2531#msg83818
[3] http://bugs.python.org/issue2531#msg97891
[4] http://bugs.python.org/issue2531#msg98217

On Fri, 19 Mar 2010 05:41:08 am Mark Dickinson wrote:
I'd really like to get this sorted for 2.7: as far as I'm concerned, either of the proposed behaviours (raise an exception, or allow comparisons) would be an improvement on the current 2.7 behaviour.
Could everyone live with making float<->Decimal comparisons raise an exception in 2.7?
Yes. I would far prefer an exception than the current incorrect results in 2.6. -- Steven D'Aprano

Mark Dickinson wrote:
Could everyone live with making float<->Decimal comparisons raise an exception in 2.7?
I could, with the caveat that *if* this causes problems for real world code, then changing it to produce the correct answer (as per your patch) should be applied as a bug fix in both 2.7 and 3.2. Note that even in Py3k there are some fairly serious weirdnesses kicking around due to the intransitive nature of numeric equality though:
>>> from decimal import Decimal as dec
>>> set((1, 1.0, dec("1.0")))
{1}
>>> set((1.0, dec("1.0")))
{1.0, Decimal('1.0')}
>>> d = {}
>>> from decimal import Decimal as dec
>>> d[1] = d[1.0] = d[dec("1.0")] = 42
>>> d
{1: 42}
>>> d[1.0] = d[dec("1.0")] = 42
>>> d
{1: 42}
>>> del d[1]
>>> d[1.0] = d[dec("1.0")] = 42
>>> d
{1.0: 42, Decimal('1.0'): 42}
When there is a clear, correct way (based on Decimal.from_float) to make numeric comparison behave in accordance with the rules of mathematics, do we really want to preserve strange, unintuitive behaviour like the above? Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

On 3/18/2010 2:48 PM, Nick Coghlan wrote:
When there is a clear, correct way (based on Decimal.from_float) to make numeric comparison behave in accordance with the rules of mathematics, do we really want to preserve strange, unintuitive behaviour like the above?
Cheers, Nick.
I'm aware of nothing that prevents the lazy coder from having a class unifiedNumber in his toolbox that implements his favorite type of conversions from various numeric types to whatever he thinks is the "best" one for his application, and then using it in places where sources might be of various other numeric types. I'm aware of nothing that would prevent the lazy coder from implementing comparison operators and even arithmetic operators on such a class. I believe, but haven't proven (and haven't used Python long enough to "just know"), that such a class could even implement what would appear to be operators providing implicit conversion from the various other numeric classes, as long as one operand was of the class.

So I think it would be possible, with such a class, to have one's cake (nothing implicit) and eat it too (providing a way to do comparisons and numerical operations, mostly implicitly).

Glenn
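For concreteness, a hypothetical sketch of such a toolbox class -- the name UnifiedNumber and the policy of converting floats exactly via Decimal.from_float are illustrative assumptions, not a proposal for the standard library:

# All conversions happen explicitly in one place; comparisons between
# wrapped values and raw ints/floats/Decimals go through Decimal.
from decimal import Decimal

class UnifiedNumber(object):
    @staticmethod
    def _coerce(value):
        if isinstance(value, UnifiedNumber):
            return value._d
        if isinstance(value, float):
            return Decimal.from_float(value)   # exact, no rounding
        return Decimal(value)                  # ints, Decimals, numeric strings

    def __init__(self, value):
        self._d = self._coerce(value)

    def __eq__(self, other):
        return self._d == self._coerce(other)

    def __lt__(self, other):
        return self._d < self._coerce(other)

    def __hash__(self):
        return hash(self._d)

print(UnifiedNumber(Decimal(1)) < 2.2)        # True: compared by true value
print(UnifiedNumber(0.1) == Decimal("0.1"))   # False: binary 0.1 != decimal 0.1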

Glenn Linderman <v+python <at> g.nevcal.com> writes:
On 3/18/2010 2:48 PM, Nick Coghlan wrote:
When there is a clear, correct way (based on Decimal.from_float) to make numeric comparison behave in accordance with the rules of mathematics, do we really want to preserve strange, unintuitive behaviour like the above?
I'm aware of nothing that prevents the lazy coder from having a class unifiedNumber in his toolbox [snip]
Please stick to the topic. We are talking about Python's default behaviour here. Antoine.

On 3/18/2010 6:18 PM, Antoine Pitrou wrote:
Glenn Linderman <v+python <at> g.nevcal.com> writes:
On 3/18/2010 2:48 PM, Nick Coghlan wrote:
When there is a clear, correct way (based on Decimal.from_float) to make numeric comparison behave in accordance with the rules of mathematics, do we really want to preserve strange, unintuitive behaviour like the above?
I'm aware of nothing that prevents the lazy coder from having a class unifiedNumber in his toolbox [snip]
Please stick to the topic. We are talking about Python's default behaviour here.
Yes, I consider my comment relevant, and think that you should apologize for claiming it is off topic. There are two choices on the table -- doing comparisons implicitly between Decimal and float, and raising an exception. It seems the current behavior, sorting by type, is universally disliked, but doing nothing is a third choice. So if the default behavior is to raise an exception, my comment pointed out the way comparisons could be provided, for those that need them. This allows both behaviors to exist concurrently. Python developers could even consider including such a library in the standard library, although my suggestion didn't include that. On the other hand, if the default behavior is to do an implicit conversion, I don't know of any way that that could be turned into an exception for those coders that don't want or don't like the particular type of implicit conversion chosen. Glenn

Glenn Linderman <glenn <at> nevcal.com> writes:
On the other hand, if the default behavior is to do an implicit conversion, I don't know of any way that that could be turned into an exception for those coders that don't want or don't like the particular type of implicit conversion chosen.
You still haven't given a good reason why we should raise an exception rather than do a comparison, though. The fact that some particular coders don't like "the particular type of implicit conversion chosen" is a particularly weak argument. Python isn't a language construction framework; we try to choose useful defaults rather than simply give out a box of tools. If some people don't like the defaults (significant indentation, limitless precision integers, etc.), there are other choices out there. Antoine.

On Mar 19, 2010, at 4:50 AM, Antoine Pitrou wrote:
Glenn Linderman <glenn <at> nevcal.com> writes:
On the other hand, if the default behavior is to do an implicit conversion, I don't know of any way that that could be turned into an exception for those coders that don't want or don't like the particular type of implicit conversion chosen.
You still haven't given a good reason why we should raise an exception rather than do a comparison, though.
The fact that some particular coders don't like "the particular type of implicit conversion chosen" is a particularly weak argument. Python isn't a language construction framework; we try to choose useful defaults rather than simply give out a box of tools. If some people don't like the defaults (significant indentation, limitless precision integers, etc.), there are other choices out there.
The reason to prefer an exception is that decimal/float comparisons are more likely to be a programmer error than an intended behavior. Real use cases for decimal/float comparisons are rare and would be more clearly expressed with an explicit conversion using Decimal.from_float(). Of course there is a precedent: I can compare "120" < 140 in AWK and get an automatic implicit conversion ;-) Just because we can compare, doesn't mean we should. Raymond

Raymond Hettinger <raymond.hettinger <at> gmail.com> writes:
The reason to prefer an exception is that decimal/float comparisons are more likely to be a programmer error than an intended behavior.
Not more so than float/int or decimal/int or bool/int comparisons, which all work. We forbid comparisons when there is a real danger or ambiguity, such as unicode vs. bytes. There is no such danger or ambiguity when comparing a decimal with a float. I don't see the point of being so restrictive; Python is not Java, and typing is not supposed to be a synonym for bondage.
Of course there is a precedent, I can compare "120" < 140 in AWK and get an automatic implicit conversion
The proper precedent in this context, though, is this one (py3k):
>>> 1 < 2.0
True
>>> 1 < Decimal("2.0")
True
>>> 1 > Decimal("2.0")
False
>>> 1 > 2.0
False
>>> True > 0.5
True
>>> True > 1.5
False
>>> True > Decimal("0.5")
True
>>> True > Decimal("1.5")
False
Are you suggesting to change all the above comparisons to raise a TypeError? cheers Antoine.

On Mar 19, 2010, at 11:11 AM, Antoine Pitrou wrote:
Raymond Hettinger <raymond.hettinger <at> gmail.com> writes:
The reason to prefer an exception is that decimal/float comparisons are more likely to be a programmer error than an intended behavior.
Not more so than float/int or decimal/int or bool/int comparisons, which all work.
The float/int and decimal/int comparisons have valid and common use cases. In contrast, comparing binary floats and decimal floats is rare, and more likely to be an accident. When an int is converted to a float or decimal, the result usually isn't surprising. The conversion of a float to a decimal is not as straight-forward (most people don't even know that an exact conversion is possible). Also, the float and decimal constructors natively accept ints as inputs, but the decimal constructor intentionally excludes a float input. Raymond

Raymond Hettinger <raymond.hettinger <at> gmail.com> writes:
The conversion of a float to a decimal is not as straight-forward (most people don't even know that an exact conversion is possible).
I still don't follow you. You are saying that an exact conversion is possible, but don't want it to be done because it is "not as straight-forward"?

Antoine, I think your email client turns replies whose subject contains '&' into new subjects containing sequences like this:

&amp;lt;-&amp;gt;

There's a number of new threads with ever-more occurrences of "amp" in the subject, and you are the first poster in each thread (and the first post's subject already starts with "Re:"). -- --Guido van Rossum (python.org/~guido)

Guido van Rossum <guido <at> python.org> writes:
Antoine, I think your email client turns replies whose subject contains '&' into new subjects containing sequences like this:
&amp;lt;-&amp;gt;
There's a number of new threads with ever-more occurrences of "amp" in the subject, and you are the first poster in each thread (and the first post's subject already starts with "Re:").
Hmm, indeed. It's the gmane Web posting interface, actually. Regards Antoine.

On 3/19/2010 2:11 PM, Antoine Pitrou wrote:
Raymond Hettinger <raymond.hettinger <at> gmail.com> writes:
The reason to prefer an exception is that decimal/float comparisons are more likely to be a programmer error than an intended behavior.
If you really believe that, then equality comparisons should also be disabled by raising NotImplemented or whatever. Clearly, someone who writes 'if somefloat == somedecimal:' assumes (now wrongly) that the test might be true. This is just as buggy as an order comparison. Raising an exception would consistently isolate decimals from other numbers and eliminate the equality intransitivity mess and its nasty effect on sets.

It still strikes me as a bit crazy for Python to say that 0.0 == 0 and 0 == Decimal(0) but that 0.0 != Decimal(0). Who would expect such a thing?

Terry Jan Reedy

On 3/19/2010 11:43 AM, Terry Reedy wrote:
On 3/19/2010 2:11 PM, Antoine Pitrou wrote:
Raymond Hettinger <raymond.hettinger <at> gmail.com> writes:
The reason to prefer an exception is that decimal/float comparisons are more likely to be a programmer error than an intended behavior.
If you really believe that, then equality comparisons should also be disabled by raising NotImplemented or whatever.
Totally agree. While the example most used in this thread is a less than operator, the text of the thread has never (as far as I have read) distinguished between equality, inequality, or relative signed magnitude comparisons. Sort has also been mentioned explicitly as an example. I perceive that the whole thread is about _all_ comparison operators with one float and one decimal operand currently producing an exception (3.x) or a type-based ordering (2.x). The type-based ordering has been demonstrated to produce unstable sorts in the presence of other types also, so is more of a problem than first perceived, and should be changed.
Clearly, someone who writes 'if somefloat == somedecimal:' assumes (now wrongly) that the test might be true. This is just as buggy as an order comparison. Raising an exception would consistently isolate decimals from other numbers and eliminate the equality intransitivity mess and its nasty effect on sets.
Totally agree.
It still strikes me as a bit crazy for Python to say that 0.0 == 0 and 0 == Decimal(0) but that 0.0 != Decimal(0). Who would expect such a thing?
The same person that would expect both

0 == "0"
0.0 == "0.0"

to be False... i.e. anyone that hasn't coded in Perl for too many years.

Glenn

On Fri, 19 Mar 2010 12:16:35 -0700, Glenn Linderman <v+python@g.nevcal.com> wrote:
On 3/19/2010 11:43 AM, Terry Reedy wrote:
It still strikes me as a bit crazy for Python to say that 0.0 == 0 and 0 == Decimal(0) but that 0.0 != Decimal(0). Who would expect such a thing?
The same person that would expect both
0 == "0" 0.0 == "0.0"
to be False... i.e. anyone that hasn't coded in Perl for too many years.
No, those two situations are not comparable. "if a equals b and b equals c, then a equals c" is what most people will expect. In fact, only mathematicians are likely to know that any other situation could possibly be valid. Programmers are generally going to view it as a bug. -- R. David Murray www.bitdance.com

On 3/19/2010 3:16 PM, Glenn Linderman wrote:
I perceive that the whole thread is about _all_ comparison operators with one float and one decimal operand currently producing an exception (3.x)
Not true for equality comparison. That returns False regardless of value, just as order comparisons return what they do in 2.x regardless of value. I claim that the former is at least as bad as the latter for numbers.
or a type-based ordering (2.x).
3.x mixes type-based and value-based equality testing for Decimals and other numbers. Terry Jan Reedy

Glenn Linderman wrote:
The same person that would expect both
0 == "0" 0.0 == "0.0"
to be False... i.e. anyone that hasn't coded in Perl for too many years.
Completely different - that is comparing numbers to strings. What this discussion is about is comparison between the different kinds of real number in the interpreter core and standard library (ints, floats, decimal.Decimal, fractions.Fraction).

For arithmetic, decimal.Decimal deliberately fails to interoperate with binary floating point numbers (hence why it isn't registered as an example of the numbers.Real ABC). For comparison, it only sort of fails to interoperate - it gives either an exception or silently returns the wrong answer, depending on Python version and the exact operator involved. Fractions are nowhere near as purist about things - they will happily allow themselves to be implicitly converted to floating point values.

So currently (for arithmetic):

int op int -> int (or potentially float due to division)
int op float, float op int -> float
int op Fraction, Fraction op int -> Fraction
int op Decimal, Decimal op int -> Decimal
Fraction op Fraction -> Fraction
Fraction op float, float op Fraction -> float
Fraction op Decimal, Decimal op Fraction -> TypeError
Decimal op Decimal -> Decimal
Decimal op float, float op Decimal -> TypeError
float op float -> float

Nobody is arguing in this thread for any changes to that. I just want to contrast it with the situation for the comparison operators:

int op int -> bool
int op float, float op int -> bool
int op Fraction, Fraction op int -> bool
int op Decimal, Decimal op int -> bool
Fraction op Fraction -> bool
Fraction op float, float op Fraction -> bool
Fraction op Decimal, Decimal op Fraction -> TypeError
Decimal op Decimal -> bool
Decimal op float, float op Decimal -> TypeError
float op float -> bool

In the case of floats and Decimals, there's no ambiguity here that creates any temptation to guess - to determine a true/false result for a comparison, floats can be converted explicitly to Decimals without any loss of accuracy. For Fractions, the precedent has already been set by allowing implicit (potentially lossy) conversion to binary floats - a lossy conversion to Decimal wouldn't be any worse.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
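The arithmetic half of the chart can be reproduced mechanically. A sketch, run under the 2.x-era semantics described above (where Decimal/float and Decimal/Fraction arithmetic raise TypeError):

# Probes each mixed addition and reports the result type.
from decimal import Decimal
from fractions import Fraction

samples = [2, 2.0, Fraction(2), Decimal(2)]

for a in samples:
    for b in samples:
        try:
            outcome = type(a + b).__name__
        except TypeError:
            outcome = "TypeError"
        print("%-8s op %-8s -> %s" % (type(a).__name__,
                                      type(b).__name__, outcome))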

On 3/19/2010 9:20 PM, Nick Coghlan wrote:
Glenn Linderman wrote:
The same person that would expect both
0 == "0" 0.0 == "0.0"
to be False... i.e. anyone that hasn't coded in Perl for too many years.
Completely different - that is comparing numbers to strings.
One can argue that either way: that it is completely different, or completely the same. Is the map the territory, or is the territory the map? The human representation of the numbers in string form is meaningful numerically, and there are explicit conversions between them that prove that it is so.

But it is completely different, because Python doesn't do implicit coercion of strings to numbers, so they don't belong in the tree. But it is completely the same, because Python doesn't do implicit coercion of Decimal to numbers, so they don't belong in the tree. Completely different, because Python also doesn't do numeric operations on strings either... so the analogy only goes so far, admittedly. Even Perl, which will implicitly coerce strings to numbers, has very limited numeric operations that are performed on the string form.

Thanks, though, for the nice chart from an internals guy; you confirmed all that I thought I knew about how this presently works, from PEP and manual reading, a bit of experimentation, and the discussions here.

I have no problem with Guido's latest proposal in the rebooted thread, as long as there is a way to wall off the Decimal type from creeping errors due to implicit conversions and inaccurate types. It is presently an excellent feature of Python to have a way to be sure that conversions are explicit, to retain proper accuracy. Few languages have that without requiring the user to write the whole class himself (or beg, borrow, or buy it). As he implicitly points out, while mixed arithmetic might be appropriate, it either needs to be done correctly and completely, or not at all.

Glenn

Glenn Linderman wrote:
One can argue that either way, that it is completely different, or completely the same.
An important difference is that there is no intermediate type that can be compared with both ints and strings. Another relevant difference is that numbers are just one of many possible things that a string could be interpreted as representing. Decimals, on the other hand, are just numbers and nothing else. (Even strings which happen to look like numbers might not be -- e.g. telephone "numbers", which are better thought of as strings with a limited character set... despite some misguided souls occasionally storing them in floating point. :-) -- Greg

Glenn Linderman wrote:
Thanks, though for the nice chart from an internals guy, you confirmed all that I thought I knew about how this presently works, from PEP and manual reading, a bit of experimentation, and the discussions here.
I'll confess to a bit of interpreter prompt experimentation to confirm the behaviour, especially in the relation to Fractions - I didn't know all of that off the top of my head. The implicit coercion from Fraction -> float definitely surprised me a bit. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

On Fri, Mar 19, 2010 at 6:43 PM, Terry Reedy <tjreedy@udel.edu> wrote:
On 3/19/2010 2:11 PM, Antoine Pitrou wrote:
Raymond Hettinger <raymond.hettinger <at> gmail.com> writes:
The reason to prefer an exception is that decimal/float comparisons are more likely to be a programmer error than an intended behavior.
If you really believe that, then equality comparisons should also be disabled by raising NotImplemented or whatever.
Hah. This is a very good point, and one I'd somehow missed up until now. I don't think we *can* reasonably make equality comparisons raise NotImplemented (in either 2.x or 3.x), since that messes up containment tests: something like "1.0 in {2, Decimal(3)}" would raise a TypeError instead of correctly returning False. (So the patch I've put on the issue tracker is wrong, since it does raise TypeError for equality and inequality, as well as for <, >, <= and >=.)
Clearly, someone who writes 'if somefloat == somedecimal:'assumes (now wrongly) that the test might be true. This is just as buggy as an order comparison. Raising an exception would consistently isolate decimals from other numbers and eliminate the equality intransitivity mess and its nasty effect on sets.
It still strikes me as a bit crazy for Python to say that 0.0 == 0 and 0 == Decimal(0) but that 0.0 != Decimal(0). Who would expect such a thing?
Agreed. A solution to the original problem that still has 0.0 == Decimal(0) evaluating to False isn't much of a solution. This puts me firmly in the 'make numeric comparisons give the right answer' camp. Thanks, Mark
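The containment constraint Mark raises follows directly from how membership testing is defined. Roughly -- a sketch of the equivalent fallback loop for sequences, not CPython's actual C code:

# "x in container" asks element == x for each element (after an identity
# shortcut), so __eq__ must return False for mismatched types rather
# than raise, or a legitimate False turns into a TypeError.
def contains(container, x):
    for element in container:
        if element is x or element == x:
            return True
    return False

Sets add a hashing step before the equality check, but the same requirement applies.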

On Fri, Mar 19, 2010 at 7:50 PM, Mark Dickinson <dickinsm@gmail.com> wrote:
So the patch I've put on the issue tracker is wrong, since it does raise TypeError ...
s/I've put/I have yet to put/ I really shouldn't admit to errors in things that haven't even been made public yet. :) Mark

On 3/19/2010 12:50 PM, Mark Dickinson wrote:
Hah. This is a very good point, and one I'd somehow missed up until now. I don't think we *can* reasonably make equality comparisons raise NotImplemented (in either 2.x or 3.x), since that messes up containment tests: something like "1.0 in {2, Decimal(3)}" would raise a TypeError instead of correctly returning False. (So the patch I've put on the issue tracker is wrong, since it does raise TypeError for equality and inequality, as well as for <, >, <= and >=.)
Sounds to me like containment checking is wrong: if it gets an exception during the comparison, it should assume unequal, rather than aborting, and continue to the next entry. Wouldn't the same issue arise for containment tests when disparate, incomparable objects are included in the same set? Why is this different?

Glenn Linderman <v+python <at> g.nevcal.com> writes:
Sounds to me like containment checking is wrong: if it gets an exception during the comparison, it should assume unequal, rather than aborting, and continue to the next entry.
Well as the Zen says:

Errors should never pass silently.
Unless explicitly silenced.

If there's a bug in your __eq__ method, rather than getting wrong containment results, you get the proper exception.
Wouldn't the same issue arise for containment tests when disparate, incomparable objects are included in the same set? Why is this different?
That's why equality testing almost never fails between standard types. We do have an (IMO unfortunate) exception in the datetime module, though:
>>> from datetime import tzinfo, timedelta, datetime
>>> ZERO = timedelta(0)
>>> class UTC(tzinfo):   # UTC class ripped off from the official doc
...     """UTC"""
...     def utcoffset(self, dt):
...         return ZERO
...     def tzname(self, dt):
...         return "UTC"
...     def dst(self, dt):
...         return ZERO
...
>>> utc = UTC()
>>> a = datetime.now()
>>> b = datetime.now(utc)
>>> a == b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't compare offset-naive and offset-aware datetimes
Regards Antoine.

On 3/19/2010 3:02 PM, Antoine Pitrou wrote:
Glenn Linderman <v+python <at> g.nevcal.com> writes:
Sounds to me like containment checking is wrong: if it gets an exception during the comparison, it should assume unequal, rather than aborting, and continue to the next entry.
Well as the Zen says:
Errors should never pass silently. Unless explicitly silenced.
If there's a bug in your __eq__ method, rather than getting wrong containment results, you get the proper exception.
If there's a bug in your __eq__ method, it may or may not raise an exception, which may or may not get you wrong containment results. But it will probably get you buggy results, somehow or another. That's what design, code reviews, and testing are for. If your __eq__ method uses exceptions (the only available method of out-of-band signalling for binary operators; not all exceptions are errors) to declare that it can't perform the comparison and produce a boolean result, that is a case where producing an exception is not an error, so your quoted Zen doesn't apply. Glenn

Glenn Linderman <v+python <at> g.nevcal.com> writes:
If there's a bug in your __eq__ method, it may or may not raise an exception, which may or may not get you wrong containment results. But it will probably get you buggy results, somehow or another. That's what design, code reviews, and testing are for.
We'll have to "agree to disagree" then. If you want error silencing by default, Python is not the language you are looking for.

On 3/19/2010 4:58 PM, Antoine Pitrou wrote:
Glenn Linderman <v+python <at> g.nevcal.com> writes:
If there's a bug in your __eq__ method, it may or may not raise an exception, which may or may not get you wrong containment results. But it will probably get you buggy results, somehow or another. That's what design, code reviews, and testing are for.
We'll have to "agree to disagree" then. If you want error silencing by default, Python is not the language you are looking for.
We can agree to disagree, if you like. But taken to the limit, the Zen you quoted would prevent the try except clause from being used.

On 20/03/2010 00:15, Glenn Linderman wrote:
On 3/19/2010 4:58 PM, Antoine Pitrou wrote:
Glenn Linderman <v+python <at> g.nevcal.com> writes:
If there's a bug in your __eq__ method, it may or may not raise an exception, which may or may not get you wrong containment results. But it will probably get you buggy results, somehow or another. That's what design, code reviews, and testing are for.

We'll have to "agree to disagree" then. If you want error silencing by default, Python is not the language you are looking for.
We can agree to disagree, if you like. But taken to the limit, the Zen you quoted would prevent the try except clause from being used.
No, that is what "unless explicitly silenced" means - you are proposing to silence them *without* an explicit try except clause. Michael
--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

On 3/19/2010 5:18 PM, Michael Foord wrote:
will probably get you buggy results, somehow or another. That's what design, code reviews, and testing are for.

We'll have to "agree to disagree" then. If you want error silencing by default, Python is not the language you are looking for.
We can agree to disagree, if you like. But taken to the limit, the Zen you quoted would prevent the try except clause from being used.
No, that is what "unless explicitly silenced" means - you are proposing to silence them *without* an explicit try except clause.
Michael
Who, me? The containment checking code would contain the try/except, I was proposing. Glenn

On 20/03/2010 00:19, Glenn Linderman wrote:
On 3/19/2010 5:18 PM, Michael Foord wrote:
will probably get you buggy results, somehow or another. That's what design, code reviews, and testing are for.

We'll have to "agree to disagree" then. If you want error silencing by default, Python is not the language you are looking for.
We can agree to disagree, if you like. But taken to the limit, the Zen you quoted would prevent the try except clause from being used.
No, that is what "unless explicitly silenced" means - you are proposing to silence them *without* an explicit try except clause.
Michael
Who, me? The containment checking code would contain the try/except, I was proposing.
Explicit by the programmer. That is what explicit means... Caught and silenced for you by Python is implicit. Michael
Glenn
--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

On 3/19/2010 5:20 PM, Michael Foord wrote:
On 20/03/2010 00:19, Glenn Linderman wrote:
On 3/19/2010 5:18 PM, Michael Foord wrote:
will probably get you buggy results, somehow or another. That's what design, code reviews, and testing are for. We'll have to "agree to disagree" then. If you want error silencing by default, Python is not the language you are looking for.
We can agree to disagree, if you like. But taken to the limit, the Zen you quoted would prevent the try except clause from being used.
No, that is what "unless explicitly silenced" means - you are proposing to silence them *without* an explicit try except clause.
Michael
Who, me? The containment checking code would contain the try/except, I was proposing.
Explicit by the programmer. That is what explicit means... Caught and silenced for you by Python is implicit.
Should I really believe that there are no try/except clauses in the Python source code (nor their C equivalent, if (errno == blahblahblah) ... )? I mean, I haven't read very much of the source code... but that statement makes me want to download and grep...

Glenn Linderman wrote:
Sounds to me like containment checking is wrong: if it gets an exception during the comparison, it should assume unequal, rather than aborting, and continue to the next entry.
What exception would it catch, though? Catching something as generic as TypeError would be a very bad idea, I think -- there would be too much potential for it to mask bugs. It might be acceptable if there were a special subclass of TypeError for this, such as NotComparableError. What happens in Py3 here, BTW? It must have the same problem if it refuses to compare some things for equality. -- Greg

Antoine Pitrou wrote:
We forbid comparisons when there is a real danger or ambiguity, such as unicode vs. bytes. There is no such danger or ambiguity when comparing a decimal with a float.
So do you think that float("0.1") and Decimal("0.1") should be equal or not, and why? -- Greg

Greg Ewing wrote:
Antoine Pitrou wrote:
We forbid comparisons when there is a real danger or ambiguity, such as unicode vs. bytes. There is no such danger or ambiguity when comparing a decimal with a float.
So do you think that float("0.1") and Decimal("0.1") should be equal or not, and why?
Note that Antoine's point was that float("0.1") and Decimal.from_float(0.1) should compare equal. The latter exactly matches the underlying binary floating point value:
dec.from_float(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Nick Coghlan wrote:
Note that Antoine's point was that float("0.1") and Decimal.from_float(0.1) should compare equal.
That would mean that Decimal("0.1") != float("0.1"), which might be surprising to someone who didn't realise they were mixing floats and decimals. -- Greg

On Sat, Mar 20, 2010 at 22:48, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
Note that Antoine's point was that float("0.1") and Decimal.from_float(0.1) should compare equal.
That would mean that Decimal("0.1") != float("0.1"), which might be surprising to someone who didn't realise they were mixing floats and decimals.
Much like Fraction("0.1") != float("0.1") is surprising. The only way to get rid of surprises around float is to get rid of float, and that ain't happening. -- Adam Olsen, aka Rhamphoryncus

Greg Ewing wrote:
Nick Coghlan wrote:
Note that Antoine's point was that float("0.1") and Decimal.from_float(0.1) should compare equal.
That would mean that Decimal("0.1") != float("0.1"), which might be surprising to someone who didn't realise they were mixing floats and decimals.
That's fine - binary floats *are* surprising. That's why Decimal exists in the first place. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

On Sat, Mar 20, 2010 at 11:59 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Greg Ewing wrote:
Nick Coghlan wrote:
Note that Antoine's point was that float("0.1") and Decimal.from_float(0.1) should compare equal.
That would mean that Decimal("0.1") != float("0.1"), which might be surprising to someone who didn't realise they were mixing floats and decimals.
That's fine - binary floats *are* surprising. That's why Decimal exists in the first place.
Decimals can be just as surprising:
Decimal(1) / Decimal(3) * Decimal(3) == Decimal(1)
False
-- --Guido van Rossum (python.org/~guido)

Nick Coghlan wrote:
That's fine - binary floats *are* surprising. That's why Decimal exists in the first place.
This argument could equally well be used the other way -- someone using Decimal is doing so precisely because they *don't* want to be surprised, in which case they would probably prefer to get an exception. The fundamental problem here is that there are two possible reasons for a mixed float-decimal operation: 1) The user is thinking in terms of floats and has happened to get a Decimal mixed in somehow. 2) The user is thinking in terms of Decimals and has happened to get a float mixed in somehow. There is no way of distinguishing between these automatically. -- Greg

On Mon, 22 Mar 2010 08:47:53 am Greg Ewing wrote:
Nick Coghlan wrote:
That's fine - binary floats *are* surprising. That's why Decimal exists in the first place.
This argument could equally well be used the other way -- someone using Decimal is doing so precisely because they *don't* want to be surprised, in which case they would probably prefer to get an exception.
Then they're in for a terrible, terrible disappointment. Rounding issues don't go away because you're using Decimal instead of float, and I can't imagine anyone would like an exception in the following cases:
Decimal(1)/Decimal(3)*Decimal(3) == Decimal(1)
False
Decimal(2).sqrt()**Decimal(2) == Decimal(2)
False
Decimal(10**28)+Decimal(1)-Decimal(10**28) == Decimal(1)
False
Rounding isn't the only surprise:
x = Decimal("NAN"); x == x False
Decimals are floats, but using radix 10 instead of radix 2. -- Steven D'Aprano

Steven D'Aprano wrote:
Then they're in for a terrible, terrible disappointment. Rounding issues don't go away because you're using Decimal instead of float,
No, but what I mean is that they prefer to be surprised in unsurprising ways, so to speak. Everyone knows that floating point numbers have limited precision. What really surprises people is when *binary* floating point numbers behave differently from the decimal ones they're used to. -- Greg

On 3/19/2010 4:50 AM, Antoine Pitrou wrote:
Glenn Linderman<glenn<at> nevcal.com> writes:
On the other hand, if the default behavior is to do an implicit conversion, I don't know of any way that that could be turned into an exception for those coders that don't want or don't like the particular type of implicit conversion chosen.
You still haven't given a good reason why we should raise an exception rather than do a comparison, though.
The fact that some particular coders don't like "the particular type of implicit conversion chosen" is a particularly weak argument. Python isn't a language construction framework; we try to choose useful defaults rather than simply give out a box of tools. If some people don't like the defaults (significant indentation, limitless precision integers, etc.), there are other choices out there.
The whole point of providing Decimal is for applications for which float is inappropriate. I didn't think I needed to reproduce PEP 327 in my email. So when a coder chooses to use Decimal, it is because float is inappropriate. Because float is inappropriate, mixing Decimal and float is inappropriate. Having the language coerce implicitly is inappropriate. All this is in the PEP.

Note that the PEP actually says that the problem is not doing implicit arithmetic (as some have reported in this thread) but in doing implicit coercions. In order to do implicit comparisons, one must do an implicit coercion. Hence the PEP actually already prohibits implicit comparisons, as well as implicit arithmetic. Now the proposal is to start down the slippery slope by allowing comparisons.

To start with, neither decimal nor float comparisons are, in general, exact. (Although it is more common for decimal calculations to contain the code to do rounding and significant-digit calculations than float code, due to its use in monetary calculations. People that don't care about the precision tend to just use float, and no significant-digit concerns are coded.) Comparisons need to be done with full knowledge of the precision of the numbers. The additional information necessary to do so cannot be encoded in a binary operator. People that do care about precision would rather not have imprecise data introduced by accidentally forgetting to include a coercion constructor, and have it implicitly converted; hence, the exception is much better for them.

My original message pointed out that providing an exception solves the problem for people that care about precision, and doesn't hinder a solution (only a convenience) for people that somehow bothered to use Decimal but then suddenly lost interest in doing correct math. For those people, the class I proposed (sketched below) can work around the existence of the exception, so the language can better serve both types of people: those that care, and those that don't.

My personal opinion is that real applications that use Decimal are much more likely to care, and much more likely to appreciate the exception, whereas applications that don't care aren't likely to use Decimal in the first place, and so won't encounter the problem anyway. And if they do, for some reason, hop on the Decimal bandwagon, then it seems to be a simple matter to implement a class with implicit conversions using whatever parameters they choose, for their sloppy endeavors. Yes, if implicit coercions for comparisons exist, it is possible to write code that avoids using them... but it is much friendlier for people that wish to avoid them if the exception remains in place, unless explicitly sidestepped through coding another class.

Implementing an exception for Decimal/float comparison attempts, rather than comparing their types, is a giant step forward. Implementing an implicit coercion for Decimal/float comparisons is a step down a slippery slope, to reduce the benefits of Decimal for its users. Glenn
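A minimal sketch of the kind of opt-in class meant above; the name and the choice of Decimal.from_float as the conversion policy are illustrative only:

from decimal import Decimal

class CoercingDecimal(Decimal):
    # Hypothetical opt-in subclass: whoever wants implicit float
    # comparisons converts explicitly here, once, instead of the
    # language doing it silently everywhere.
    def _coerce(self, other):
        if isinstance(other, float):
            return Decimal.from_float(other)
        return other

    def __eq__(self, other):
        return Decimal.__eq__(self, self._coerce(other))

    def __lt__(self, other):
        return Decimal.__lt__(self, self._coerce(other))

    # __ne__, __le__, __gt__ and __ge__ would follow the same pattern

    __hash__ = Decimal.__hash__  # in py3k, defining __eq__ would otherwise discard it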

On Mar 19, 2010, at 11:42 AM, Glenn Linderman wrote:
The whole point of providing Decimal is for applications for which float is inappropriate. I didn't think I needed to reproduce PEP 327 in my email.
So when a coder chooses to use Decimal, it is because float is inappropriate. Because float is inappropriate, mixing Decimal and float is inappropriate. Having the language coerce implicitly is inappropriate. All this is in the PEP. Note that the PEP actually says that the problem is not doing implicit arithmetic (as some have reported in this thread) but in doing implicit coercions. In order to do implicit comparisons, one must do an implicit coercion. Hence the PEP actually already prohibits implicit comparisons, as well as implicit arithmetic.
Well said. Raymond

Glenn Linderman <v+python <at> g.nevcal.com> writes:
So when a coder chooses to use Decimal, it is because float is inappropriate. Because float is inappropriate, mixing Decimal and float is inappropriate. Having the language coerce implicitly is inappropriate.
I'm sorry but this is very dogmatic. What is the concrete argument against an accurate comparison between floats and decimals?
Comparisons need to be done with full knowledge of the precision of the numbers. The additional information necessary to do so cannot be encoded in a binary operator.
This doesn't have anything to do with the mixing of floats and decimals, though, since it also applies to unmixed comparisons. Again, is there an argument specific to mixed comparisons?

Glenn Linderman wrote:
In order to do implicit comparisons, one must do an implicit coercion. Hence the PEP actually already prohibits implicit comparisons, as well as implicit arithmetic.
Not necessarily -- you could compare them as though they had both been converted to the equivalent rational numbers, which can be done without loss of precision. -- Greg
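Concretely, a sketch using the fractions module purely for illustration; both conversions below are exact for finite values:

from decimal import Decimal
from fractions import Fraction

def dec_eq_float(d, f):
    # Every finite Decimal and every finite float denotes an exact
    # rational, so neither conversion loses information.
    return Fraction.from_decimal(d) == Fraction.from_float(f)

dec_eq_float(Decimal("0.5"), 0.5)  # True: both are exactly 1/2
dec_eq_float(Decimal("0.1"), 0.1)  # False: 0.1 has no exact binary representation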

On 3/18/2010 5:48 PM, Nick Coghlan wrote:
Mark Dickinson wrote:
Could everyone live with making float<->Decimal comparisons raise an exception in 2.7?
I could, with the caveat that *if* this causes problems for real world code, then changing it to produce the correct answer (as per your patch) should be applied as a bug fix in both 2.7 and 3.2.
Note that even in Py3k there are some fairly serious weirdnesses kicking around due to the intransitive nature of numeric equality though:
from decimal import Decimal as dec set((1, 1.0, dec("1.0"))) {1} set((1.0, dec("1.0"))) {1.0, Decimal('1.0')}
d = {}
from decimal import Decimal as dec
d[1] = d[1.0] = d[dec("1.0")] = 42
d
{1: 42}
d[1.0] = d[dec("1.0")] = 42
d
{1: 42}
del d[1]
d[1.0] = d[dec("1.0")] = 42
d
{1.0: 42, Decimal('1.0'): 42}
When there is a clear, correct way (based on Decimal.from_float) to make numeric comparison behave in accordance with the rules of mathematics, do we really want to preserve strange, unintuitive behaviour like the above?
I would still strongly prefer that equality intransitivity be fixed one way or another (I suggested two on the tracker). The many ways it makes sets misbehave was discussed in both of http://mail.python.org/pipermail/python-list/2008-September/508859.html http://bugs.python.org/issue4087 Terry Jan Reedy

On Thu, Mar 18, 2010 at 9:48 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Note that even in Py3k there are some fairly serious weirdnesses kicking around due to the intransitive nature of numeric equality though:
Yep. My personal favourite is:
from decimal import Decimal as dec
set((1, 1.0, dec(1))) == set((1.0, 1, dec(1)))
False
(The sets even have different sizes!)

Note that while the originally proposed change does fix problems with sets of Decimals, ints and floats, and with sorting of lists of those types, it's not a complete fix: as soon as you throw Fractions into the mix, all the same problems arise again. Making hashes of int, float, Decimal *and* Fraction all compatible with one another, efficient for ints and floats, and not grossly inefficient for Fractions and Decimals, is kinda hairy, though I have a couple of ideas of how it could be done.

Making Decimal-to-Fraction comparisons give correct results isn't easy either: the conversion in one direction (Fraction to Decimal) is lossy, while the conversion in the other direction (Decimal to Fraction) can't be performed efficiently if the Decimal has a large exponent (Decimal('1e123456')); so you can't just do the Decimal.from_float trick of whacking everything into a single domain and comparing there. Again, this is solvable, but not trivially so.

See, this is what happens when a language conflates numeric equality with the equivalence relation used for membership testing. ;-). (I'm not really complaining, but this is definitely one of the downsides of this particular design decision.)
When there is a clear, correct way (based on Decimal.from_float) to make numeric comparison behave in accordance with the rules of mathematics, do we really want to preserve strange, unintuitive behaviour like the above?
No. Not really, no. :) Mark

On Fri, Mar 19, 2010 at 9:37 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
Making hashes of int, float, Decimal *and* Fraction all compatible with one another, efficient for ints and floats, and not grossly inefficient for Fractions and Decimals, is kinda hairy, though I have a couple of ideas of how it could be done.
To elaborate, here's a cute scheme for the above, which is actually remarkably close to what Python already does for ints and floats, and which doesn't require any of the numeric types to figure out whether it's exactly equal to an instance of some other numeric type.

After throwing out infinities and nans (which can easily be dealt with separately), everything we care about is a rational number, so it's enough to come up with some mapping from the set of all rationals to the set of possible hashes, subject to the requirement that the mapping can be computed efficiently for the types we care about.

For any prime P there's a natural 'reduce modulo P' map

reduce : {rational numbers} --> {0, 1, ..., P-1, infinity}

defined in pseudocode by:

reduce(m/n) = ((m % P) * inv(n, P)) % P if n % P != 0 else infinity

where inv(n, P) represents the modular inverse to n modulo P.

Now let P be the handy Mersenne prime P = 2**31-1 (on a 64-bit machine, the almost equally handy prime 2**61-1 could be used instead), and define a hash function from the set of rationals to the set [-2**31, 2**31) by:

hash(0) = 0
hash(m/n) = 1 + reduce(m/n - 1) if m/n > 0   # i.e., reduce m/n modulo P, but to [1..P] rather than [0..P-1]
hash(m/n) = -hash(-m/n) if m/n < 0

and in the last two cases, map a result of infinity to the unused hash value -2**31.

For ints, this hash function is almost identical to what Python already has, except that the current int hash does a reduction modulo 2**32-1 or 2**64-1 rather than 2**31-1. For all small ints, hash(n) == n, as currently. Either way, the hash can be computed digit-by-digit in exactly the same manner. For floats, it's also easy to compute: express the float as m * 2**e for some integers m and e, compute hash(m), and rotate e bits in the appropriate direction. And it's straightforward to implement for the Decimal and Fraction types, too.

(One minor detail: as usual, some postprocessing would be required to replace a hash of -1 with something else, since a hash value of -1 is invalid.)

Mark
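Rendered as Python, a sketch of the scheme just described, for finite rationals only (infinities, nans and the C-level details are left out):

P = 2**31 - 1  # the Mersenne prime used as the modulus

def reduce_mod_P(m, n):
    # reduce m/n modulo P; None stands in for 'infinity' when P divides n
    if n % P == 0:
        return None
    return (m % P) * pow(n, P - 2, P) % P  # pow(n, P-2, P) is inv(n, P), since P is prime

def rational_hash(m, n):
    # hash of the rational m/n, with n > 0
    if m == 0:
        return 0
    negative = m < 0
    if negative:
        m = -m
    r = reduce_mod_P(m - n, n)           # reduce m/n - 1 modulo P
    h = -2**31 if r is None else 1 + r   # 'infinity' maps to the unused value -2**31
    if negative and h != -2**31:
        h = -h
    return -2 if h == -1 else h          # -1 is an invalid hash value in CPython

As described, small ints hash to themselves: rational_hash(5, 1) == 5 and rational_hash(-5, 1) == -5.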

Mark Dickinson wrote:
On Fri, Mar 19, 2010 at 9:37 AM, Mark Dickinson <dickinsm@gmail.com> wrote: For ints, this hash function is almost identical to what Python already has, except that the current int hash does a reduction modulo 2**32-1 or 2**64-1 rather than 2**31-1. For all small ints, hash(n) == n, as currently. Either way, the hash can be computed digit-by-digit in exactly the same manner. For floats, it's also easy to compute: express the float as m * 2**e for some integers m and e, compute hash(m), and rotate e bits in the appropriate direction. And it's straightforward to implement for the Decimal and Fraction types, too.
It seems to me that given the existing conflation of numeric equivalence and containment testing, going the whole hog and fixing the set membership problem for all of our rational types would be the right thing to do. Would it be worth providing the underlying implementation of the hash algorithm as a math.hash_rational function and then use it for Decimal and Fraction? That would have the virtue of making it easy for others to define numeric types that "played well" with numeric equivalence. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
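For example, a third-party numeric type could then opt in with something like this; math.hash_rational is hypothetical and does not exist today:

class MyRational(object):
    # illustration only: math.hash_rational is the suggested (hypothetical) helper
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def __hash__(self):
        import math
        return math.hash_rational(self.numerator, self.denominator)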

Nick Coghlan wrote:
Mark Dickinson wrote:
It seems to me that given the existing conflation of numeric equivalence and containment testing, going the whole hog and fixing the set membership problem for all of our rational types would be the right thing to do.
Isn't this only solving half the problem, though? Even if you find that the hashes are equal, you still have to decide whether the values themselves are equal. Is there some similarly clever way of comparing two rational numbers without having explicit access to the numerators and denominators? -- Greg

On Sat, Mar 20, 2010 at 12:10 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
Mark Dickinson wrote:
It seems to me that given the existing conflation of numeric equivalence and containment testing, going the whole hog and fixing the set membership problem for all of our rational types would be the right thing to do.
Isn't this only solving half the problem, though?
Yes.
Even if you find that the hashes are equal, you still have to decide whether the values themselves are equal.
True. The reason I was concentrating on the hashes is that it's not immediately obvious that it's even *possible* to find a decent hash function that's efficient to compute and gives equal results for numerically equal inputs (regardless of type); this is especially true if you don't want to significantly slow down the existing hashes for int and float. But once that problem is solved, it shouldn't be too hard to implement all the comparisons. It *is* kinda messy, because as far as I can see the oddities of the various types mean that you end up producing specialized code for comparing each pair of types (one block of code for float<->Fraction comparisons, another for float<->Decimal, yet another for Decimal<->Fraction, and so on), but it's doable. Mark

On Sat, Mar 20, 2010 at 4:16 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
True. The reason I was concentrating on the hashes is that it's not immediately obvious that it's even *possible* to find a decent hash function that's efficient to compute and gives equal results for numerically equal inputs (regardless of type); this is especially true if you don't want to significantly slow down the existing hashes for int and float. But once that problem is solved, it shouldn't be too hard to implement all the comparisons.
It *is* kinda messy, because as far as I can see the oddities of the various types mean that you end up producing specialized code for comparing each pair of types (one block of code for float<->Fraction comparisons, another for float<->Decimal, yet another for Decimal<->Fraction, and so on), but it's doable.
I propose to reduce all hashes to the hash of a normalized fraction, which we can define as a combination of the hashes for the numerator and the denominator. Then all we have to do is figure fairly efficient ways to convert floats and decimals to normalized fractions (not necessarily Fractions). I may be naive but this seems doable: for a float, the denominator is always a power of 2 and removing factors of 2 from the denominator is easy (just right-shift until the last bit is zero). For Decimal, the unnormalized denominator is always a power of 10, and the normalization is a bit messier, but doesn't seem excessively so. The resulting numerator and denominator may be large numbers, but for typical use of Decimal and float they will rarely be excessively large, and I'm not too worried about slowing things down when they are (everything slows down when you're using really large integers anyway). -- --Guido van Rossum (python.org/~guido)
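A rough sketch of the two conversions being proposed; the helper names are illustrative, and only finite values are handled:

from decimal import Decimal

def float_to_ratio(x):
    # float.as_integer_ratio already returns the fraction in lowest
    # terms, with a power-of-two denominator
    return x.as_integer_ratio()

def decimal_to_ratio(d):
    # the unnormalized value is digits * 10**exp
    sign, digits, exp = d.as_tuple()
    num = int(''.join(map(str, digits)))
    if sign:
        num = -num
    if exp >= 0:
        return num * 10**exp, 1
    den = 10**-exp
    # 2 and 5 are the only prime factors of the denominator, so
    # stripping them is the whole normalization step
    for p in (2, 5):
        while num % p == 0 and den % p == 0:
            num //= p
            den //= p
    return num, den

decimal_to_ratio(Decimal("0.5"))  # (1, 2), the same ratio as float_to_ratio(0.5)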

On Sat, Mar 20, 2010 at 7:56 PM, Guido van Rossum <guido@python.org> wrote:
I propose to reduce all hashes to the hash of a normalized fraction, which we can define as a combination of the hashes for the numerator and the denominator. Then all we have to do is figure fairly efficient ways to convert floats and decimals to normalized fractions (not necessarily Fractions). I may be naive but this seems doable: for a float, the denominator is always a power of 2 and removing factors of 2 from the denominator is easy (just right-shift until the last bit is zero). For Decimal, the unnormalized denominator is always a power of 10, and the normalization is a bit messier, but doesn't seem excessively so. The resulting numerator and denominator may be large numbers, but for typical use of Decimal and float they will rarely be excessively large, and I'm not too worried about slowing things down when they are (everything slows down when you're using really large integers anyway).
I *am* worried about slowing things down for large Decimals: if you can't put Decimal('1e1234567') into a dict or set without waiting for an hour for the hash computation to complete (because it's busy computing 10**1234567), I consider that a problem. But it's solvable! I've just put a patch on the bug tracker: http://bugs.python.org/issue8188 It demonstrates how hashes can be implemented efficiently and compatibly for all numeric types, even large Decimals like the above. It needs a little tidying up, but it works. Mark
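The heart of the trick, roughly, is that only the residue of the power of ten modulo the hash modulus matters, so the million-digit integer never has to be built; a one-line sketch of the idea (not the patch itself), reusing P = 2**31 - 1:

# hash contribution of Decimal('1e1234567'): 10**1234567 reduced modulo P,
# computed by fast modular exponentiation rather than by forming 10**1234567
pow(10, 1234567, 2**31 - 1)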

On Sat, Mar 20, 2010 at 4:38 PM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Sat, Mar 20, 2010 at 7:56 PM, Guido van Rossum <guido@python.org> wrote:
I propose to reduce all hashes to the hash of a normalized fraction, which we can define as a combination of the hashes for the numerator and the denominator. Then all we have to do is figure fairly efficient ways to convert floats and decimals to normalized fractions (not necessarily Fractions). I may be naive but this seems doable: for a float, the denominator is always a power of 2 and removing factors of 2 from the denominator is easy (just right-shift until the last bit is zero). For Decimal, the unnormalized denominator is always a power of 10, and the normalization is a bit messier, but doesn't seem excessively so. The resulting numerator and denominator may be large numbers, but for typical use of Decimal and float they will rarely be excessively large, and I'm not too worried about slowing things down when they are (everything slows down when you're using really large integers anyway).
I *am* worried about slowing things down for large Decimals: if you can't put Decimal('1e1234567') into a dict or set without waiting for an hour for the hash computation to complete (because it's busy computing 10**1234567), I consider that a problem.
But it's solvable! I've just put a patch on the bug tracker:
http://bugs.python.org/issue8188
It demonstrates how hashes can be implemented efficiently and compatibly for all numeric types, even large Decimals like the above. It needs a little tidying up, but it works.
I was interested in how the implementation worked yesterday, especially given the lack of explanation in the margins of numeric_hash3.patch. numeric_hash4.patch has much better comments, but I didn't see this patch until after I had sufficiently deciphered the previous patch and wrote most of this: http://bob.pythonmac.org/archives/2010/03/23/py3k-unified-numeric-hash/ I'm not really qualified to review the patch, what little formal math training I had has atrophied quite a bit over the years, but as far as I can tell it seems to work. The results also seem to match the Python implementations that I created. -bob

On Fri, Mar 19, 2010 at 3:07 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Fri, Mar 19, 2010 at 9:37 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
Making hashes of int, float, Decimal *and* Fraction all compatible with one another, efficient for ints and floats, and not grossly inefficient for Fractions and Decimals, is kinda hairy, though I have a couple of ideas of how it could be done.
To elaborate, here's a cute scheme for the above, which is actually remarkably close to what Python already does for ints and floats, and which doesn't require any of the numeric types to figure out whether it's exactly equal to an instance of some other numeric type.
After throwing out infinities and nans (which can easily be dealt with separately), everything we care about is a rational number, so it's enough to come up with some mapping from the set of all rationals to the set of possible hashes, subject to the requirement that the mapping can be computed efficiently for the types we care about.
For any prime P there's a natural 'reduce modulo P' map
reduce : {rational numbers} --> {0, 1, ..., P-1, infinity}
defined in pseudocode by:
reduce(m/n) = ((m % P) * inv(n, P)) % P if n % P != 0 else infinity
where inv(n, P) represents the modular inverse to n modulo P.
Now let P be the handy Mersenne prime P = 2**31-1 (on a 64-bit machine, the almost equally handy prime 2**61-1 could be used instead), and define a hash function from the set of rationals to the set [-2**31, 2**31) by:
hash(0) = 0
hash(m/n) = 1 + reduce(m/n - 1) if m/n > 0   # i.e., reduce m/n modulo P, but to [1..P] rather than [0..P-1]
hash(m/n) = -hash(-m/n) if m/n < 0
and in the last two cases, map a result of infinity to the unused hash value -2**31.
For ints, this hash function is almost identical to what Python already has, except that the current int hash does a reduction modulo 2**32-1 or 2**64-1 rather than 2**31-1. For all small ints, hash(n) == n, as currently. Either way, the hash can be computed digit-by-digit in exactly the same manner. For floats, it's also easy to compute: express the float as m * 2**e for some integers m and e, compute hash(m), and rotate e bits in the appropriate direction. And it's straightforward to implement for the Decimal and Fraction types, too.
Will this change the result of hashing a long? I know that both gmpy and SAGE use their own hash implementations for performance reasons. I understand that CPython's hash function is an implementation detail, but there are external modules that rely on the existing hash behavior.
FWIW, I'd prefer 2.7 and 3.2 have the same behavior. I don't mind the existing 3.1 behavior and I'd rather not have a difference between 3.1 and 3.2. casevh
(One minor detail: as usual, some postprocessing would be required to replace a hash of -1 with something else, since a hash value of -1 is invalid.)
Mark

On Fri, Mar 19, 2010 at 1:17 PM, Case Vanhorsen <casevh@gmail.com> wrote:
On Fri, Mar 19, 2010 at 3:07 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Fri, Mar 19, 2010 at 9:37 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
Making hashes of int, float, Decimal *and* Fraction all compatible with one another, efficient for ints and floats, and not grossly inefficient for Fractions and Decimals, is kinda hairy, though I have a couple of ideas of how it could be done.
To elaborate, here's a cute scheme for the above, which is actually remarkably close to what Python already does for ints and floats, and which doesn't require any of the numeric types to figure out whether it's exactly equal to an instance of some other numeric type.
Will this change the result of hashing a long? I know that both gmpy and SAGE use their own hash implementations for performance reasons. I understand that CPython's hash function is an implementation detail, but there are external modules that rely on the existing hash behavior.
Yes, it would change the hash of a long. What external modules are there that rely on existing hash behaviour? And exactly what behaviour do they rely on? Mark

On Sat, Mar 20, 2010 at 4:06 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Fri, Mar 19, 2010 at 1:17 PM, Case Vanhorsen <casevh@gmail.com> wrote:
On Fri, Mar 19, 2010 at 3:07 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Fri, Mar 19, 2010 at 9:37 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
Making hashes of int, float, Decimal *and* Fraction all compatible with one another, efficient for ints and floats, and not grossly inefficient for Fractions and Decimals, is kinda hairy, though I have a couple of ideas of how it could be done.
To elaborate, here's a cute scheme for the above, which is actually remarkably close to what Python already does for ints and floats, and which doesn't require any of the numeric types to figure out whether it's exactly equal to an instance of some other numeric type.
Will this change the result of hashing a long? I know that both gmpy and SAGE use their own hash implementations for performance reasons. I understand that CPython's hash function is an implementation detail, but there are external modules that rely on the existing hash behavior.
Yes, it would change the hash of a long.
What external modules are there that rely on existing hash behaviour?
I'm only aware of gmpy and SAGE.
And exactly what behaviour do they rely on?
Instead of calculating hash(long(mpz)), they calculate hash(mpz) directly. It avoids creation of a temporary object that could be quite large and is faster than the two-step process. I would need to modify the code so that it continues to produce the same result. casevh
Mark

On Sat, Mar 20, 2010 at 3:17 PM, Case Vanhorsen <casevh@gmail.com> wrote:
On Sat, Mar 20, 2010 at 4:06 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
What external modules are there that rely on existing hash behaviour?
I'm only aware of gmpy and SAGE.
And exactly what behaviour do they rely on?
Instead of calculating hash(long(mpz)), they calculate hash(mpz) directly. It avoids creation of a temporary object that could be quite large and is faster than the two-step process. I would need to modify the code so that it continues to produce the same result.
Does gmpy only do this for Python 2.6? Or does it use different algorithms for 2.4/2.5 and 2.6? As far as I can tell, there was no reasonable way to compute long_hash directly at all before the algorithm was changed for 2.6, unless you imitate exactly what Python was doing (break up into 15-bit pieces, and do all the rotation and addition exactly the same way), in which case you might as well be calling long_hash directly. Mark

On Sat, Mar 20, 2010 at 10:05 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Sat, Mar 20, 2010 at 3:17 PM, Case Vanhorsen <casevh@gmail.com> wrote:
On Sat, Mar 20, 2010 at 4:06 AM, Mark Dickinson <dickinsm@gmail.com> wrote:
What external modules are there that rely on existing hash behaviour?
I'm only aware of gmpy and SAGE.
And exactly what behaviour do they rely on?
Instead of calculating hash(long(mpz)), they calculate hash(mpz) directly. It avoids creation of a temporary object that could be quite large and is faster than the two-step process. I would need to modify the code so that it continues to produce the same result.
Does gmpy only do this for Python 2.6? Or does it use different algorithms for 2.4/2.5 and 2.6? As far as I can tell, there was no reasonable way to compute long_hash directly at all before the algorithm was changed for 2.6, unless you imitate exactly what Python was doing (break up into 15-bit pieces, and do all the rotation and addition exactly the same way), in which case you might as well be calling long_hash directly.
Mark
It does the latter: it converts from GMP's internal format to CPython's long format and calculates the hash along the way. I may (should :) ) revert to converting to long and then calling long_hash. The majority of the speed increase came from the conversion improvements, not the hash calculation. I am in favor of any change that makes 2.7 and 3.2 behave the same. casevh

On 3/20/2010 7:06 AM, Mark Dickinson wrote:
Will this change the result of hashing a long? I know that both gmpy and SAGE use their own hash implementations for performance reasons. I understand that CPython's hash function is an implementation detail, but there are external modules that rely on the existing hash behavior.
Yes, it would change the hash of a long.
What external modules are there that rely on existing hash behaviour? And exactly what behaviour do they rely on?
Depending on specifics of CPython's hash is a 'do at your own risk' enterprise, like CPython bytecode hacks.

On Thu, Mar 18, 2010 at 12:41, Mark Dickinson <dickinsm@gmail.com> wrote:
I'm only seeing two arguments against this at the moment: (1) it has the potential to break code that relies on being able to sort heterogeneous lists. But given that heterogeneous lists can't be sorted stably anyway (see my earlier post about lists containing ints, floats and Decimals), perhaps this is an acceptable risk. (2) A few of the posters here (Steven, Nick, and me) feel that it's slightly more natural to allow these comparisons; but I think the argument's fairly evenly balanced at the moment between those who'd prefer an exception and those who'd prefer to allow the comparisons.
Conceptually I like the idea of them all being comparable, but are there any real use cases involving heterogeneous lists? All the examples I've seen have focused on how they're broken, not on how they'd be correct (including possible math after the comparison!) if the language compared properly. Without such use cases allowing comparison seems like a lot of work for nothing. -- Adam Olsen, aka Rhamphoryncus

On 3/18/2010 5:23 AM, Steven D'Aprano wrote:
On Thu, 18 Mar 2010 08:58:25 am Raymond Hettinger wrote:
On Mar 17, 2010, at 1:59 PM, Steven D'Aprano wrote:
On Thu, 18 Mar 2010 07:44:21 am Raymond Hettinger wrote:
The spectrum of options from worst to best is 1) compare but give the wrong answer 2) compare but give the right answer 3) refuse to compare.
Why is 3 the best? If there is a right answer to give, surely giving the right answer it is better than not?
From the early days of the decimal module, we've thought that mixed float-decimal operations are 1) a bit perilous and 2) have few, if any good use cases.
When it comes to *arithmetic* operations, I agree. Is there anyone on python-dev willing to argue the case for allowing implicit mixed float/Decimal arithmetic operations? The arguments in the PEP seem pretty convincing to me, and I'm not suggesting we change that.
But comparison operations are different. For starters, you don't need to worry about whether to return a float or a Decimal, because you always get a bool. In theory, both Decimals and floats are representations of the same underlying thing, namely real numbers, and it seems strange to me that I can't ask whether two such real numbers are equal just because their storage implementation is different.
I can see three reasonable reasons for avoiding mixed comparisons:
(1) To avoid confusing float-naive users (but they're confused by pure float comparisons too).
(2) To avoid mixed arithmetic operations (but comparisons aren't arithmetic).
(3) If Decimals and floats compare equal, they must hash equal, and currently they don't (but Mark Dickinson thinks he has a solution for that).
Accordingly, any mixed operations should be explicit rather than implicit:
Decimal('1.1') + Decimal.from_float(2.2)
is better than:
Decimal('1.1') + 2.2
Agreed. The user should explicitly choose whether they want a float answer or a Decimal answer.
To help the user avoid confusion, we flag the latter with a TypeError: unsupported operand type(s) for +: 'Decimal' and 'float'.
Unfortunately, in Py2.x, implicit mixed comparisons do not raise an exception, and instead will silently fail by giving an incorrect answer:
>>> Decimal('1.1') < 2.2
False
That is clearly the wrong thing to do.
Do you envisage any problems from allowing this instead?
Decimal('1.1') < 2.2
True
Yes. As any non-naïve float user is aware, the proper form of float comparisons is not to use < or > or == or !=; rather, instead of using < (to follow along with your example), one should use:

Decimal('1.1') - 2.2 < epsilon

However, while even this is only useful in certain circumstances as a gross simplification [1], it immediately shows the need to do mixed arithmetic to produce (sometimes) correct results. More correct comparisons require much more code (even the 20-line C code in [1], which understands the float format to some extent, admits to being deficient in some circumstances). For all the reasons that mixed decimal and float arithmetic is bad, mixed decimal and float comparisons are also bad. To do proper comparisons, you need to know the number of significant digits of both numbers, and the precision and numeric ranges being dealt with by the application.

For the single purpose of sorting, one could argue that, without knowing the significant digits, precision, and numeric ranges, the sort would probably cluster floats and decimals that should compare equal similarly to how it would if the significant digits, precision, and numeric ranges were known; and that would probably be close to the truth, but only if the decimal-vs-float key were the last in the composite sort key. I don't think Python informs its comparison operations that they are being used as part of a sort, nor would there be a way for user-written sorts to inform the comparison operations of that fact.

Seems like it would be better to raise an exception, and in the documentation for the exception point out that turning off the exception (if it should be decided that that should be possible, which could be good for compatibility) would regress to the current behavior, which doesn't sort numerically, but by type.

[1] http://www.cprogramming.com/tutorial/floating_point/understanding_floating_p...

Glenn

Glenn Linderman <v+python <at> g.nevcal.com> writes:
For all the reasons that mixed decimal and float arithmetic is bad, mixed decimal and float comparisons are also bad. To do proper comparisons, you need to know the number of significant digits of both numbers, and the precision and numeric ranges being dealt with by the application.
What is a "proper comparison"? A comparison is a lossy operation by construction (it returns only 1 bit of output, or 2 bits for rich comparison), so the loss in precision that results from mixing float and decimal operands should not be a concern here. Unless there are situations where the comparison algorithm might return wrong results (are there?), I don't see why we should forbid such comparisons.
Seems like it would be better to raise an exception, and in the documentation for the exception point out that turning off the exception (if it should be decided that that should be possible, which could be good for compatibility), would regress to the current behavior, which doesn't sort numerically, but by type.
Regressing to a useless behaviour sounds like a deliberate annoyance. I don't think this proposal should be implemented. Regards Antoine.

On 18/03/2010 18:44, Antoine Pitrou wrote:
[snip...]
Seems like it would be better to raise an exception, and in the documentation for the exception point out that turning off the exception (if it should be decided that that should be possible, which could be good for compatibility), would regress to the current behavior, which doesn't sort numerically, but by type.
Regressing to a useless behaviour sounds like a deliberate annoyance. I don't think this proposal should be implemented.
I agree, comparisons here have completely defined semantics - it sounds crazy not to allow it. (The argument 'because some programmers might do it without realising' doesn't hold much water with me.) Michael
Regards
Antoine.

On Fri, 19 Mar 2010 05:27:06 am Glenn Linderman wrote:
Do you envisage any problems from allowing this instead?
Decimal('1.1') < 2.2
True
Yes.
As any non-naïve float user is aware, the proper form of float comparisons is not to use < or > or == or !=, but rather, instead of using < (to follow along with your example), one should use:
Decimal('1.1') - 2.2 < epsilon
And yet we allow 1.1 < 2.2 instead of forcing users to do the "proper form". One can only wonder why the various standards (actual and de-facto) for floating point allows comparisons at all. But they do, and so does Python, and frankly even if the only reason is to satisfy lazy coders who don't have a requirement for high accuracy, then that's a good reason in my book, and one equally applicable to Decimal and float. -- Steven D'Aprano

On 3/18/2010 12:34 PM, Steven D'Aprano wrote:
On Fri, 19 Mar 2010 05:27:06 am Glenn Linderman wrote:
Do you envisage any problems from allowing this instead?
Decimal('1.1') < 2.2
True
Yes.
As any non-naïve float user is aware, the proper form of float comparisons is not to use < or > or == or !=, but rather, instead of using < (to follow along with your example), one should use:
Decimal('1.1') - 2.2 < epsilon
And yet we allow
1.1 < 2.2
instead of forcing users to do the "proper form". One can only wonder why the various standards (actual and de-facto) for floating point allows comparisons at all.
Hard to tell

1.1 < 2.2

from the second line of

diff = 1.1 - 2.2
diff < epsilon

Hard to enforce the "proper form" without being omniscient, and the "proper form" isn't always as simple as my example; to be truly proper often requires a lot more code (and a lot more omniscience). I'm +1 on adding an exception for Decimal/float comparisons, -0 on allowing it to be turned off to achieve compatible behavior, since Raymond pointed out that the compatible sorting behavior isn't stable in the presence of int, Decimal, and float.
But they do, and so does Python, and frankly even if the only reason is to satisfy lazy coders who don't have a requirement for high accuracy, then that's a good reason in my book, and one equally applicable to Decimal and float.
A point well understood, but I personally would rather force Decimal/float comparisons to be explicitly converted before being lazily compared, if the arithmetic operations are forced to be explicit.

On 2010-03-18 13:27 PM, Glenn Linderman wrote:
As any non-naïve float user is aware, the proper form of float comparisons is not to use < or > or == or !=, but rather, instead of using < (to follow along with your example), one should use:
Decimal('1.1') - 2.2 < epsilon
Not at all. This is quite incorrect for most use cases. Fuzzy comparisons are sometimes useful for equality testing, but almost never for ordering. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On 3/18/2010 12:45 PM, Robert Kern wrote:
On 2010-03-18 13:27 PM, Glenn Linderman wrote:
As any non-naïve float user is aware, the proper form of float comparisons is not to use < or > or == or !=, but rather, instead of using < (to follow along with your example), one should use:
Decimal('1.1') - 2.2 < epsilon
Not at all. This is quite incorrect for most use cases. Fuzzy comparisons are sometimes useful for equality testing, but almost never for ordering.
I wondered if anyone would catch me on that! I did it mostly to keep the same example going, but partly to see if there were any real floating point experts in the discussion! You'll have noted my reference was to using the technique for equality.

On Wed, Mar 17, 2010 at 1:44 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Mar 17, 2010, at 12:34 PM, Terry Reedy wrote:
I agree. Improving 2.7 and not 3.2+ would give people a reason to not move to 3.x.
FWIW, I think this is mischaracterizing the proposal.
The spectrum of options from worst to best is 1) compare but give the wrong answer 2) compare but give the right answer 3) refuse to compare.
Py3.x is already in the best position, it refuses to compare. IOW, is already is as improved as it can get.
Py2.6 is in the worst position. The proposal is to make it better, but not as good as 3.x.
Some people have argued that (2) is better than (3). At the very least, there is a marked discontinuity in this spectrum between (2) and (3). With (1), users are less likely to rely on this feature in 2.x and then have no recourse when converting to 3.x. -- --Guido van Rossum (python.org/~guido)

On Mar 16, 2010, at 9:41 AM, Guido van Rossum wrote:
I'd say if you're not going to forward-port this to Python 3, it shouldn't go into Python 2 -- in that case it would make more sense to me to back-port the exception-raising behavior.
Python 3 doesn't need it because it is possible to not give a result at all. Python 2 does need it because we have to give *some* result.
Also supporting comparisons but not other mixed operations is going to be confusing. If you are sticking to that behavior I think mixed comparisons should also be ruled out.
The difference is that mixed comparisons currently do give a result, but one that is non-sensical. The proposal is to make it give a meaningful result, not as an extra feature, but in an effort to not be wrong. Since 2.x has to give a result, we should make it useful. Since 3.x does not make the comparison, it is okay to punt. Raymond

On Tue, Mar 16, 2010 at 5:36 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Mar 16, 2010, at 9:41 AM, Guido van Rossum wrote:
I'd say if you're not going to forward-port this to Python 3, it shouldn't go into Python 2 -- in that case it would make more sense to me to back-port the exception-raising behavior.
Python 3 doesn't need it because it is possible to not give a result at all. Python 2 does need it because we have to give *some* result.
Also supporting comparisons but not other mixed operations is going to be confusing. If you are sticking to that behavior I think mixed comparisons should also be ruled out.
The difference is that mixed comparisons currently do give a result, but one that is non-sensical. The proposal is to make it give a meaningful result, not as an extra feature, but in an effort to not be wrong.
Since 2.x has to give a result, we should make it useful. Since 3.x does not make the comparison, it is okay to punt.
No. You are talking like there is no path from 2.7 to 3.x. Since the result in 2.x was never useful, you should not add a dead-end feature. And there is no reason (other than backwards compatibility) why the comparison couldn't raise an exception in 2.x as well -- e.g. str<->unicode comparisons do this already. (It used to be required that comparisons had to always return a result, but that was done away with long before Decimal was introduced.) -- --Guido van Rossum (python.org/~guido)

On Wed, Mar 17, 2010 at 12:36 AM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Mar 16, 2010, at 9:41 AM, Guido van Rossum wrote:
Also supporting comparisons but not other mixed operations is going to be confusing. If you are sticking to that behavior I think mixed comparisons should also be ruled out.
The difference is that mixed comparisons currently do give a result, but one that is non-sensical. The proposal is to make it give a meaningful result, not as an extra feature, but in an effort to not be wrong.
Yes, exactly. A number of people (including me, early in the issue tracker discussion) have suggested that it would be confusing/inconsistent/strange to allow mixed-type comparisons but not mixed-type arithmetic. But it's not clear to me exactly why this would be a bad thing.

It seems likely that for serious applications there's little benefit in allowing float<->Decimal comparisons if mixed-type arithmetic isn't allowed: if you're checking that b > a, it's probably because you're about to subtract b from a, or divide b by a, or do some other piece of arithmetic that depends on that condition being true. But for simply playing around with numbers on the command line, and for helping free people from their floating-point misconceptions, I think having these float<->Decimal comparisons available is potentially useful.

I understand the argument about not adding this feature to 2.x if it's not going to go into 3.x; but I'm convinced enough that the 2.x behaviour should change that I'd prefer to add it to both 2.x and 3.x than to neither. The other option would be to leave 3.x alone and just add a py3k warning to 2.x, so that at least there's some indication of where strange results are coming from. -- Mark

Raymond Hettinger wrote:
Python 3 doesn't need it because it is possible to not give a result at all. Python 2 does need it because we have to give *some* result.
That's not true -- it's possible for comparisons to raise an exception in 2.x, and they sometimes do already:

Python 2.5.4 (r254:67916, May 15 2009, 15:21:20)
[GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 1+2j < 3+4j
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
-- Greg

On Mar 17, 2010, at 5:02 PM, Greg Ewing wrote:
Raymond Hettinger wrote:
Python 3 doesn't need it because it is possible to not give a result at all. Python 2 does need it because we have to give *some* result.
That's not true -- it's possible for comparisons to raise an exception in 2.x, and they sometimes do already:
Complex objects do not support __float__. Decimal objects do. If an object supports __float__, then a float comparison coerces its other argument via __float__ and the other argument never gets a chance to raise an exception.
class D:
    def __float__(self):
        return 3.14

float(D())
3.1400000000000001
float(complex(3.14))
Traceback (most recent call last):
  File "<pyshell#14>", line 1, in <module>
    float(complex(3.14))
TypeError: can't convert complex to float
D() < 10.0
True
complex(3.14) < 10.0
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    complex(3.14) < 10.0
TypeError: no ordering relation is defined for complex numbers

Raymond

Mark Dickinson wrote:
On the one hand there's something to be said for maintaining a clean separation between the float and Decimal types, allowing only explicit conversions from one to the other; mixed-type arithmetic between floats and Decimals was very deliberately not permitted in the original PEP, and that's unlikely to change in a hurry.
I think that as long as you're disallowing arithmetic between float and decimal, comparison should be disallowed as well. So if you're going to change anything, it would be better to make such comparisons raise a TypeError in 2.7 to match the current 3.x behaviour. -- Greg
participants (17)
- Adam Olsen
- Antoine Pitrou
- Bob Ippolito
- Case Vanhorsen
- Facundo Batista
- Glenn Linderman
- Glenn Linderman
- Greg Ewing
- Guido van Rossum
- Mark Dickinson
- Michael Foord
- Nick Coghlan
- R. David Murray
- Raymond Hettinger
- Robert Kern
- Steven D'Aprano
- Terry Reedy