Re: [Python-Dev] Decimal <-> float comparisons in py3k.
At 09:58 AM 3/16/2010 -0500, Facundo Batista wrote:
I'm +0 on allowing these comparisons, making "Decimal(1) < .3" the same as "Decimal(1) < Decimal.from_float(.3)"
Does Decimal.from_float() use the "shortest decimal representation" approach? If not, it might be confusing if a number that prints as '.1' compares unequal to Decimal('.1').
On Tue, Mar 16, 2010 at 3:58 PM, P.J. Eby wrote:
At 09:58 AM 3/16/2010 -0500, Facundo Batista wrote:
I'm +0 on allowing these comparisons, making "Decimal(1) < .3" the same as "Decimal(1) < Decimal.from_float(.3)"
Does Decimal.from_float() use the "shortest decimal representation" approach?
No. It does exact conversions:
>>> Decimal.from_float(1.1)
Decimal('1.100000000000000088817841970012523233890533447265625')
>>> Decimal.from_float(1.1) == 1.1
False
>>> Decimal('1.1') == float('1.1')  # returns False both pre- and post-patch
False
If not, it might be confusing if a number that prints as '.1' compares unequal to Decimal('.1').
Agreed, but this is just your everyday floating-point confusion, to be dealt with by social means (e.g., educating the programmer). Any technical solution that made "Decimal('1.1') == float('1.1')" evaluate to True would, I suspect, be a cure worse than the original disease.

Mark
On Wed, 17 Mar 2010 03:23:30 am Mark Dickinson wrote:
On Tue, Mar 16, 2010 at 4:11 PM, Mark Dickinson wrote:
[...]
>>> Decimal.from_float(1.1) == 1.1
False
Whoops. To clarify, this is the pre-patch behaviour; post-patch, this gives True.
Whew! You had me worried there for a second. Just to clarify, you are proposing:

Decimal.from_float(1.1) == 1.1
Decimal('1.1') != float('1.1')

+1 on this behaviour, even in the absence of support for mixed Decimal and float arithmetic operations.

Both Decimals and floats are representations of real numbers, and not being able to compare two numbers is just weird: refusing to compare (say) Decimal(1) with float(1) makes as little sense to me as refusing to compare int(1) with float(1).

But mixed arithmetic runs into the problem of what you want the result type to be. Given (say) decimal + float, returning either a Decimal or a float will be the wrong thing to do some of the time, so better to prohibit mixed arithmetic and let folks handle their own conversions. So +1 on continuing to prohibit mixed arithmetic.

But no such problems arise with comparisons, which will always return a bool, and will avoid the current ... interesting ... behaviour. In 3.1:
>>> Decimal(1) == 1 == 1.0
True
>>> Decimal(1) == 1.0
False
>>> Decimal.from_float(1.0) == 1 == 1.0
True
>>> Decimal.from_float(1.0) == 1.0
False
Replacing False with an exception doesn't make it any less bizarre.

--
Steven D'Aprano
Steven D'Aprano wrote:
But no such problems arise with comparisons, which will always return a bool, and will avoid the current ... interesting ... behaviour. In 3.1:
>>> Decimal(1) == 1 == 1.0
True
>>> Decimal(1) == 1.0
False
>>> Decimal.from_float(1.0) == 1 == 1.0
True
>>> Decimal.from_float(1.0) == 1.0
False
Replacing False with an exception doesn't make it any less bizarre.
Allowing the comparisons also doesn't introduce the potential for large cumulative errors, which are possible when actual implicit arithmetic conversions are allowed.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Tue, Mar 16, 2010 at 2:32 PM, Steven D'Aprano wrote:
But mixed arithmetic runs into the problem of what you want the result type to be. Given (say) decimal + float, returning either a Decimal or a float will be the wrong thing to do some of the time, so better to prohibit mixed arithmetic and let folks handle their own conversions. So +1 on continuing to prohibit mixed arithmetic.
I'm not disagreeing, but I really wonder, what is the value of supporting mixed comparisons then? Just because you *can* assign a meaning to it doesn't mean you should.

OTOH I'm sure a lot of people would like to see mixed arithmetic supported, the PEP be damned, and they would probably be happy with any simple rule about the return type even if it's not always ideal. I note that there are cases where converting a long to a float also is the wrong thing to do, and yet mixed long/float operations always return floats.

If you are amenable to this argument, I would propose to make the result of mixed operations return a Decimal, since in some "intuitive complexity" sense an int is a simpler type than a float and a float is a simpler type than a Decimal -- so results return the more complex type. But my intuition on this isn't super strong and I could live with always returning a float as well -- there are always casts to force the issue.
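To make the long/float point concrete (an illustration; any integer just above 2**53 will do):

>>> float(10**17 + 1) == float(10**17)   # distinct longs, same float
True
>>> 10**17 + 1 == 10**17                 # exact long comparison
False

--
--Guido van Rossum (python.org/~guido)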
On Wed, 17 Mar 2010 10:01:12 am Guido van Rossum wrote:
On Tue, Mar 16, 2010 at 2:32 PM, Steven D'Aprano wrote:
But mixed arithmetic runs into the problem of what you want the result type to be. Given (say) decimal + float, returning either a Decimal or a float will be the wrong thing to do some of the time, so better to prohibit mixed arithmetic and let folks handle their own conversions. So +1 on continuing to prohibit mixed arithmetic.
I'm not disagreeing, but I really wonder, what is the value of supporting mixed comparisons then? Just because you *can* assign a meaning to it doesn't mean you should.
Any mixed comparison with floats is already risky:
>>> 1e20 + 1e4 == 10**20 + 10**4
False
but we allow them anyway because they're too useful and too obvious not to. The problems with (say) floating-point equality are far too well known to need me to give examples, but I will anyway *wink*:
>>> 1.3 == 3.7 - 2.4
False
but we allow it anyway, partly from tradition ("floats have always been like that") and partly because, frankly, laziness is a virtue. It would be painful not to be able to say (e.g.):
>>> 1.25 == 3.75 - 2.5
True
>>> 10.0 > 1
True
My defence of mixed comparisons is the same. If I'm doing serious maths work, then I'm going to be comparing numbers carefully, and not just tossing a comparison operator between them. But for simple calculations, or using the interactive interpreter as a calculator, or if I need to sort a list of mixed numbers without caring whether they are ints or floats or Decimals, laziness is a virtue. If your needs for accuracy aren't high, you can go far by comparing floats with ==. Why shouldn't the same apply to mixed floats and Decimals?

Having said all that, I've just re-read the PEP, and spotted a fly in the ointment... hash. If we allow Decimals to compare equal to floats, that presumably implies that they need to hash equal. That may be simple enough for integer-valued floats, but what about non-integer values?
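(The constraint, for anyone following along, is the language-wide invariant that x == y must imply hash(x) == hash(y), or dicts and sets silently misbehave. Assuming the patched semantics where Decimal('0.5') == 0.5, we'd need this to work:)

>>> from decimal import Decimal
>>> d = {Decimal('0.5'): 'found me'}
>>> d[0.5]  # succeeds only if the two equal values also hash equal
'found me'

--
Steven D'Aprano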
On Wed, Mar 17, 2010 at 11:43 AM, Steven D'Aprano wrote:
Having said all that, I've just re-read the PEP, and spotted a fly in the ointment... hash.
If we allow Decimals to compare equal to floats, that presumably implies that they need to hash equal. That may be simple enough for integer-valued floats, but what about non-integer values?
That is indeed an issue. It can be taken care of by checking whether the Decimal instance is exactly representable as a float, and returning the hash of the corresponding float if so. From the patch (http://bugs.python.org/file16544/issue2531.patch):

+        self_as_float = float(self)
+        if Decimal.from_float(self_as_float) == self:
+            return hash(self_as_float)

Strictly speaking this only works if the Decimal -> float conversion (which relies on the platform strtod) is accurate to within <1ulp, but that's going to be true in 2.7/3.x on all platforms using David Gay's str<->float conversion code, and is *probably* true for the system strtod on any sane platform, so I don't think it's a real issue.

This also slows down the (already slow) hash computation for Decimal a touch. Until people start complaining, or show evidence of applications that require making large dictionaries of Decimal instances, I'm not going to worry about that either. It would be easy to cache the hash value of a Decimal instance if it became necessary.

A less messy (but more intrusive) alternative would be to come up with a single sane hash function defined for all rational numbers (plus infinities and nans), and use this function for int, long, float, Decimal and Fraction types. I have a candidate function in mind...
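For a flavour of what such a unified function might look like -- a sketch only, with an arbitrary modulus and sentinel, not the actual candidate -- one can hash the reduced fraction m/n modulo a fixed prime, inverting the denominator via Fermat's little theorem:

from fractions import Fraction

P = 2**31 - 1  # illustrative prime modulus, chosen arbitrarily here

def rational_hash(q):
    # Fraction stores q in lowest terms with q.denominator > 0.
    m, n = q.numerator, q.denominator
    if n % P == 0:
        return 314159  # arbitrary sentinel: n has no inverse mod P
    # pow(n, P - 2, P) is n**-1 mod P, by Fermat's little theorem
    h = abs(m) % P * pow(n, P - 2, P) % P
    return -h if m < 0 else h

# Equal values hash equal regardless of their original type:
assert rational_hash(Fraction(1, 4)) == rational_hash(Fraction.from_float(0.25))

Mark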
Guido van Rossum wrote:
in some "intuitive complexity" sense an int is a simpler type than a float and a float is a simpler type than a Decimal
I don't think this analogy holds. In a mathematical sense, ints are a subset of reals, but binary and decimal floats are just alternative approximate representations of reals, neither one being inherently preferable over the other.

One could argue that since all binary floats are exactly representable in decimal but not vice versa, decimal should be regarded as the wider type. But even this doesn't hold when you have a limited number of decimal digits available, which you always do at any given moment with the Decimal type. And even if there are enough digits, an exact conversion mightn't be what you really want.

This problem doesn't arise with int->float conversion -- there is only one obvious way of chopping it to fit.

--
Greg
Guido van Rossum wrote:
On Tue, Mar 16, 2010 at 2:32 PM, Steven D'Aprano wrote:
But mixed arithmetic runs into the problem of what you want the result type to be. Given (say) decimal + float, returning either a Decimal or a float will be the wrong thing to do some of the time, so better to prohibit mixed arithmetic and let folks handle their own conversions. So +1 on continuing to prohibit mixed arithmetic.
I'm not disagreeing, but I really wonder, what is the value of supporting mixed comparisons then? Just because you *can* assign a meaning to it doesn't mean you should.
OTOH I'm sure a lot of people would like to see mixed arithmetic supported, the PEP be damned, and they would probably be happy with any simple rule about the return type even if it's not always ideal. I note that there are cases where converting a long to a float also is the wrong thing to do, and yet mixed long/float operations always return floats.
I suspect this latter behavior is a throwback to the days when conversion of an integer to a float was guaranteed not to cause overflow. With the extension of the integer type to longs that theory went out of the window, and with it the previously manageable "widening" that took place.
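The overflow is easy to demonstrate today (py3k drops the word "long" from the message):

>>> float(10**400)
Traceback (most recent call last):
  ...
OverflowError: long int too large to convert to float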
If you are amenable to this argument, I would propose to make the result of mixed operations return a Decimal, since in some "intuitive complexity" sense an int is a simpler type than a float and a float is a simpler type than a Decimal -- so results return the more complex type. But my intuition on this isn't super strong and I could live with always returning a float as well -- there are always casts to force the issue.
Alas, Python now supports so many number systems that there is no longer a rational (in the non-numerical sense) ordering of the systems which allows a linear progression from one to the next. It therefore behooves us to consider the implications of such a decision *very* carefully. Intuition alone (even yours, which I would back against most people's) may not suffice.

regards
Steve

--
Steve Holden           +1 571 484 6266   +1 800 494 3119
See PyCon Talks from Atlanta 2010  http://pycon.blip.tv/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
On Tue, Mar 16, 2010 at 10:32 PM, Steven D'Aprano wrote:
On Wed, 17 Mar 2010 03:23:30 am Mark Dickinson wrote:
On Tue, Mar 16, 2010 at 4:11 PM, Mark Dickinson wrote:
[...]
>>> Decimal.from_float(1.1) == 1.1
False
Whoops. To clarify, this is the pre-patch behaviour; post-patch, this gives True.
Whew! You had me worried there for a second. Just to clarify, you are proposing:
Decimal.from_float(1.1) == 1.1
Decimal('1.1') != float('1.1')
Exactly, yes.

--
Mark
Mark Dickinson wrote:
On Tue, Mar 16, 2010 at 3:58 PM, P.J. Eby wrote:
If not, it might be confusing if a number that prints as '.1' compares unequal to Decimal('.1').
Agreed, but this is just your everyday floating-point confusion, to be dealt with by social means (e.g., educating the programmer).
Seems to me that this education would mostly consist of saying "don't compare floats and decimals", which is why I think that disallowing them in the first place would be better. Then if a programmer truly needs to compare them for some reason, he has to be explicit about how to do it.

--
Greg
On Wed, 17 Mar 2010 09:16:11 am Greg Ewing wrote:
Mark Dickinson wrote:
On Tue, Mar 16, 2010 at 3:58 PM, P.J. Eby wrote:
If not, it might be confusing if a number that prints as '.1' compares unequal to Decimal('.1').
Agreed, but this is just your everyday floating-point confusion, to be dealt with by social means (e.g., educating the programmer).
Seems to me that this education would mostly consist of saying "don't compare floats and decimals", which is why I think that disallowing them in the first place would be better.
I'm sure you don't mean to suggest that the only (or even the main) source of floating point confusion comes from mixed Decimal/float operations.
Then if a programmer truly needs to compare them for some reason, he has to be explicit about how to do it.
More explicit than someDecimal == someFloat? Seems pretty explicit to me.

--
Steven D'Aprano
Steven D'Aprano wrote:
More explicit than someDecimal == someFloat? Seems pretty explicit to me.
Yes. I mean at least specifying either

float(someDecimal) == someFloat

or

someDecimal == Decimal(someFloat)

Preferably also whether the conversion is to be as exact as possible or on a minimum-digits basis.
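The two spellings can legitimately disagree, which is rather the point -- the programmer chooses which conversion to trust. For instance (using the exact Decimal.from_float for the second spelling):

>>> from decimal import Decimal
>>> d, f = Decimal('1.1'), float('1.1')
>>> float(d) == f               # compare on float's terms
True
>>> d == Decimal.from_float(f)  # compare exactly
False

--
Greg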
On Mar 16, 2010, at 3:16 PM, Greg Ewing wrote:
Seems to me that this education would mostly consist of saying "don't compare floats and decimals", which is why I think that disallowing them in the first place would be better.
That makes sense.

I do worry that 2.x currently does make the comparison and gives the wrong answer. We have the ability to make it a correct answer. But, it seems like the mood here is that wrong-is-better-than-right for an action that someone shouldn't be doing in the first place.

Raymond
On Wed, 17 Mar 2010 12:27:01 pm Raymond Hettinger wrote:
On Mar 16, 2010, at 3:16 PM, Greg Ewing wrote:
Seems to me that this education would mostly consist of saying "don't compare floats and decimals", which is why I think that disallowing them in the first place would be better.
That makes sense.
I do worry that 2.x currently does make the comparison and gives the wrong answer. We have the ability to make it a correct answer. But, it seems like the mood here is that wrong-is-better-than-right for an action that someone shouldn't be doing in the first place.
I don't get this. Why is it "wrong" to compare Decimals to floats, and why shouldn't I do so? What harm is there? If the argument is that naive users who don't understand floats may be confused by the results, then the problem lies with floats, and if you really want to avoid confusing the float-naive, then we should prohibit all comparisons on floats:
>>> 1e20 + 1e4 < 1e20 + 2e4
False
I don't mean that as a serious suggestion -- it would be absurd to cripple floats for the sake of avoiding confusion of those who don't understand floats. Why are Decimals different? I can't see why comparing Decimal(1) to float(1) is wrong in any sense. I can see that comparing Decimal("1.1") to float("1.1") may confuse the float-naive, but the float-naive will be confused by this too:
>>> x = 1.0/3
>>> x + 1.0 - 1.0 == x
False
There's an awful lot about floats that is confusing to naive users; I don't see that the behaviour of Decimals will make it worse.

--
Steven D'Aprano
On Tue, Mar 16, 2010 at 10:16 PM, Greg Ewing wrote:
Mark Dickinson wrote:
On Tue, Mar 16, 2010 at 3:58 PM, P.J. Eby wrote:
If not, it might be confusing if a number that prints as '.1' compares unequal to Decimal('.1').
Agreed, but this is just your everyday floating-point confusion, to be dealt with by social means (e.g., educating the programmer).
Seems to me that this education would mostly consist of saying "don't compare floats and decimals", which is why I think that disallowing them in the first place would be better.
I was thinking of something more along the lines of: "Sure, go ahead and compare floats and decimals, but be aware that float('1.1') is not exactly 1.1, so don't complain when 1.1 == Decimal('1.1') returns False."

For me, this is actually a plus of allowing these comparisons: it makes the education easier. "Look, the binary float stored for 1.1 is actually larger than 1.1, and here's the proof: 1.1 > Decimal('1.1') returns True."

--
Mark
On Tuesday 16 March 2010 16:58:22, P.J. Eby wrote:
At 09:58 AM 3/16/2010 -0500, Facundo Batista wrote:
I'm +0 on allowing these comparisons, making "Decimal(1) < .3" the same as "Decimal(1) < Decimal.from_float(.3)"
Does Decimal.from_float() use the "shortest decimal representation" approach?
If not, it might be confusing if a number that prints as '.1' compares unequal to Decimal('.1').
In py3k, comparing bytes and str raises a TypeError ("unorderable types: bytes() < str()"). I like this behaviour :-) If comparison of Decimal and float can have an "unpredictable" result, I would suggest the same behaviour (raise an error).

--
Victor Stinner
http://www.haypocalc.com/
On Tue, Mar 16, 2010 at 4:18 PM, Victor Stinner wrote:
If comparison of Decimal and float can have an "unpredictable" result, I would suggest the same behaviour (raise an error).
Well, it's not really "unpredictable": the new behaviour is perfectly predictable and sane, provided only that you remember the basic fact that float("0.1") is not exactly 0.1.

Mark
participants (9)
- Greg Ewing
- Guido van Rossum
- Mark Dickinson
- Nick Coghlan
- P.J. Eby
- Raymond Hettinger
- Steve Holden
- Steven D'Aprano
- Victor Stinner