Re: [Python-ideas] Python Float Update

On Wed, Jun 3, 2015 at 3:35 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
You can't represent sqrt(2) exactly with sign and mantissa either. When Decimal detects a non-terminating decimal expansion, it should round it and assign it a numerator and denominator, something like 14142135623730951 / 10000000000000000, simplified. That's better than the errors you get from sign and mantissa. Or an alternative could be a hybrid of the sign-and-mantissa and fraction representations... I don't think that's a good idea, though. -- -Surya Subbarao
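
For concreteness, the stdlib fractions module can already express this proposal. A minimal sketch, assuming the rounding happens at the float's 17-significant-digit repr (the proposal itself doesn't say how many digits to keep):

    from fractions import Fraction
    from math import sqrt

    # Round sqrt(2) at float precision, then store it as a numerator /
    # denominator pair, exactly as suggested above.
    approx = Fraction(str(sqrt(2)))
    print(approx)                           # 14142135623730951/10000000000000000
    # One possible "simplified" rounding rule: cap the denominator.
    print(approx.limit_denominator(10**6))  # a nearby fraction with a smaller denominator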

On Jun 3, 2015, at 15:46, u8y7541 The Awesome Person <surya.subbarao1@gmail.com> wrote:
That's exactly the point: Decimal never _pretends_ to be exact, and therefore there's no problem when it can't be. By the way, it's not just "sign and mantissa" (that just gives you an integer, or maybe a fixed-point number), it's sign, mantissa, _and exponent_.
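
A quick illustration of that triple, using Decimal's real API: as_tuple() exposes exactly those three fields.

    from decimal import Decimal

    # sign, coefficient digits, and exponent are stored separately:
    print(Decimal("1.4142135623730951").as_tuple())
    # DecimalTuple(sign=0, digits=(1, 4, 1, 4, 2, 1, 3, 5, 6, 2, 3, 7, 3, 0, 9, 5, 1), exponent=-16)
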
No, that's exactly the same value as mantissa 1.4142135623730951 and exponent 0, and therefore it has exactly the same error. You haven't gained anything over using Decimal.

And meanwhile, you've lost some efficiency (it takes twice as much memory, because you have to store all those zeroes where in Decimal they're implied by the exponent), and you've lost the benefit of a well-designed standard to follow (how many digits should you keep? what rounding rule should you use? should there be some way to optionally signal the user that rounding has occurred? and so on...). And, again, you've made things more surprising, not less, because now you have a type that's always exact, except when it isn't.

Meanwhile, when you asked about the problems, I gave you a whole list of them. Have you thought about the others, or only the third one on the list? For example, do you really want adding up a long string of simple numbers to give you a value that takes 500x as much memory to store and 500x as long to calculate with, if you don't need the exactness? Or is there going to be another rounding rule, where you truncate the fraction to a smaller approximation once it gets "too big"? And meanwhile, if you do need the exactness, why don't you need to be able to carry around exact rational multiples of pi or an exact representation of 2 ** 0.5 (both of which SymPy can do for you, by representing numbers symbolically, the way humans do when they need to)?
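
Both claims are easy to check with the stdlib. A small sketch (the running sum of 1/i is just one way to make denominators grow; any inputs whose denominators aren't all powers of ten will do):

    from fractions import Fraction
    from decimal import Decimal

    # Identical value, therefore identical error:
    print(Fraction(14142135623730951, 10**16) ==
          Fraction(Decimal("1.4142135623730951")))   # True

    # Denominator growth: a running sum of simple fractions soon carries
    # far more digits than anyone would ever display.
    total = Fraction(0)
    for i in range(1, 500):
        total += Fraction(1, i)
    print(len(str(total.denominator)))               # well over 100 digits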

> (it takes twice as much memory, because you have to store all those zeroes where in Decimal they're implied by the exponent), and you've lost the benefit of a well-designed standard to follow (how many digits should you keep? what rounding rule should you use? should there be some way to optionally signal the user that rounding has occurred? and so on...)
You are right about memory... LOL, I just thought of representing it as float / float for the numerator and denominator! But that would be slower... There's got to be a workaround for those zeros, especially if I'm dealing with stuff like 57 / 10^100 (57 is prime!). -- -Surya Subbarao
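
The zeros really are the whole cost in that example. A quick comparison, stdlib only:

    from fractions import Fraction
    from decimal import Decimal

    frac = Fraction(57, 10**100)
    dec = Decimal("57E-100")

    # The Fraction must materialize the hundred-digit denominator in full...
    print(frac.denominator.bit_length())   # 333 bits spent on the zeros
    # ...while Decimal keeps two coefficient digits plus a small exponent.
    print(dec.as_tuple())                  # DecimalTuple(sign=0, digits=(5, 7), exponent=-100)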

At this point I feel compelled to explain why I'm against using fractions/rationals to represent numbers given as decimals.
The design using arbitrary precision fractions was intended to avoid newbie issues with decimal numbers (these threads have elaborated plenty on those newbie issues). For reasons that should also be obvious by now, we converted these fractions back to decimal before printing them.

But there was a big issue that we didn't anticipate. During the course of a simple program it was quite common for calculations to slow down dramatically, because numbers with ever-larger numerators and denominators were being computed (and rational arithmetic quickly slows down as those get bigger). So e.g. you might be computing your taxes with a precision of a million digits -- only to round them down to dollars for display.

These issues were quite difficult to debug, because the normal approach to debugging ("just use print statements") didn't work -- unless you came up with the idea of printing the numbers as a fraction.

For this reason I think that it's better not to use rational arithmetic by default. FWIW the same reasoning does *not* apply to using Decimal or something like decimal128. But then again those don't really address most issues with floating point -- the rounding issue exists for decimal as well as for binary. Anyway, that's a separate discussion to have. -- --Guido van Rossum (python.org/~guido)
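
A toy version of the tax scenario, assuming exact rational arithmetic by default (the 7% rate and the 500 compounding steps are made up for illustration):

    from fractions import Fraction

    balance = Fraction(1000)
    for _ in range(500):
        balance *= Fraction(107, 100)     # compound at an exact 7%

    # Every step was exact, so the denominator now has nearly a thousand
    # digits -- all of which are thrown away at display time:
    print(len(str(balance.denominator)))  # 998
    print(f"{float(balance):,.2f}")       # rounded to cents for display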

On Wed, Jun 3, 2015 at 9:01 PM, Guido van Rossum <guido@python.org> wrote:
The problem of unlimited growth can be solved by rounding, but the result is in many ways worse than floating point numbers. One obvious problem is that unlike binary floating point, where all bit patterns represent different numbers, only about 60% of fractions with limited numerators and denominators represent unique values; the rest are reducible by dividing the numerator and denominator by their GCD.

Furthermore, fractions with limited denominators are distributed very unevenly on the number line. This problem is present in binary floats as well -- floats between 1 and 2 are twice as dense as floats between 2 and 4 -- but with fractions it is much worse. Since a/b - c/d = (ad - bc)/(bd), the fraction nearest to a/b is at a distance of at least 1/(bd) from it. So if the denominators are limited by D (|b| < D and |d| < D), then for small b the nearest fraction to a/b is at a distance of ~ 1/D, but if b ~ D, it is at a distance of ~ 1/D^2. For example, if we limit denominators to 10 decimal digits, the gaps between adjacent fractions can vary from ~ 10^(-10) to ~ 10^(-20), even for fractions of similar magnitude -- say, between 1 and 2.

These two problems rule out the use of fractions as a general purpose number.
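
Both observations can be checked numerically. An empirical sketch, assuming numerators and denominators drawn uniformly from 1..D (D = 1000 is an arbitrary choice):

    from math import gcd, pi

    D = 1000
    coprime = sum(1 for a in range(1, D + 1)
                    for b in range(1, D + 1) if gcd(a, b) == 1)
    # Share of numerator/denominator pairs already in lowest terms; this
    # converges to 6/pi**2, which is the "about 60%" quoted above.
    print(coprime / D**2, 6 / pi**2)   # both ~0.608

    # Gap sizes: the nearest distinct fraction to a/b (denominators <= D)
    # is at least 1/(b*D) away, so gaps range from ~1/D down to ~1/D**2.
    print(1 / D, 1 / D**2)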

Participants (7):
- Alexander Belopolsky
- Andrew Barnert
- Chris Angelico
- Greg Ewing
- Guido van Rossum
- MRAB
- u8y7541 The Awesome Person