Consider generalizing Decimal to support arbitrary radix

Arbitrary radix comes up every now and then and Decimal already has a radix() method. It would be nice when initializing a Decimal object to be able to specify an arbitrary radix>=2.

On Thu, Feb 8, 2018 at 8:49 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
The radix method always returns 10, because decimal.Decimal always operates in base 10. Are you looking for a way to change the way arithmetic is done, or are you looking for a way to construct a Decimal from a string of digits and an arbitrary base (the way int("...", x) does)? ChrisA

On Wed, Feb 07, 2018 at 10:08:50PM +0000, Neil Girdhar wrote:
Oh, and to answer your specific question, I want to change the way arithmetic is done. I want it to be done in a different radix.
Why? There are clear advantages to floating point arithmetic done in base 2 (speed, minimum possible rounding error, least amount of wobble), and a different advantage to floating point done in base 10 (matches exactly the standard decimal notation used by humans with no conversion error), Outside of those two bases, arithmetic done in any other base is going to combine the worst of both: - slower; - larger errors when converting from decimal numbers (in general); - larger rounding errors; - larger wobble; with no corresponding advantages unless your data is coming to you in arbitrary bases. Doing floating point arithmetic in decimal is already slower and less accurate than doing it in binary. I'd like to hear more about your use-case for doing it in base 19 or base 7, say, but I would have to guess that it is likely to be such a niche use-case that this functionality doesn't belong in the standard library. -- Steve

On Wed, Feb 7, 2018 at 5:52 PM Steven D'Aprano <steve@pearwood.info> wrote:
I don't see why it would have any of those problems. Base 10 isn't special in any way.
with no corresponding advantages unless your data is coming to you in arbitrary bases.
Right, I was playing with this problem ( https://brilliant.org/weekly-problems/2017-10-02/advanced/?problem=no-comput...) and wanted to work in base 2. I realize it's niche, but it's not exactly a significant change to the interface even if it's a big change to the implementation.

On Thu, Feb 8, 2018 at 10:08 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
Base 10 *is* special, because it corresponds to what humans use. In binary floating-point, you get weird results (by human standards) like 0.1+0.2 not being 0.3; that doesn't happen in decimal. There is no error when converting from a string of decimal digits to a decimal.Decimal, so presumably to avoid error, you'd have to work with digits in the same base. The rounding errors and wobble are by comparison with binary; you get the same problems in any other base, without the benefit of human-friendly behaviour.
You should be able to use the native float type for binary floating-point. But the whole point of that challenge is that you shouldn't need a computer. ChrisA

On Wed, Feb 7, 2018 at 6:36 PM Chris Angelico <rosuav@gmail.com> wrote:
I see your list was about converting to and from base 10. That wasn't really intended in my proposal. I meant wholly working in another base. In that sense, 10 isn't particularly "fast, error-free, better at rounding, etc."
Yeah, I know, but I wanted to play with it. Anyway, native floats don't help.

On Thu, Feb 8, 2018 at 10:49 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
Sounds like performance isn't going to be a big problem, then. You can manage with a non-optimized and naive implementation. So here's a couple of things to try: 1) Check out PyPI and see if something like what you want exists. 2) Poke around in the source code for the Decimal class (ignore the C module and use the pure Python one) and see if you can hack on it. It'd then be off-topic for python-ideas, but it'd be an awesome topic to discuss on python-list. Exploration is great fun, and Python's a great language to explore with. ChrisA

On Wed, Feb 7, 2018 at 3:49 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
I maintain gmpy2 and it might do what you want (arbitrary precision radix-2 arithmetic and easy access to the bits).
Historical memory - I once wrote a radix-6 fixed point library to explore an extension of the 3n+1 problem to rational numbers. It was written in Turbo Pascal and ran for days on a 286/287 PC. casevh

On Wed, Feb 07, 2018 at 11:49:27PM +0000, Neil Girdhar wrote:
I never said it was. Base 2 floats is the one that is faster and better at rounding than any other base. No finite precision floating point arithmetic can be error free, but all else being equal, base 2 minimises the errors you get. The advantage of base 10 is that it matches the standard base 10 numbers we write. Within the boundaries of the available precision, if you can write it in decimal, you can represent it in a decimal float. That isn't necessarily true of decimal -> binary floats.
Why not? If you can write the float in binary exactly, you can write it in hex, and use float.fromhex() to convert it exactly (provided it fits into the 64-bit floats Python uses). -- Steve

On 2/7/2018 6:08 PM, Neil Girdhar wrote:
This is the specialness of base 10
I don't see why it would have any of those problems.
Any base other than 2 has decreased speed (on a binary computer) and increased computational rounding errors and wobble.
Base 10 isn't special in any way.
Except as noted above and the fact that computation with binary coded decimal goes back to the early days of electronic computation.
with no corresponding advantages unless your data is coming to you in arbitrary bases.
In cpython, decimal uses _cdecimal for speed. I suspect that 10 is not only explicitly hard-coded as the base but implicitly hard-coded by using algorithm tricks that depend on and only work when the base is 10. -- Terry Jan Reedy

On Wed, Feb 7, 2018 at 10:50 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Well, there are a couple of other bases that are potentially interesting. There are some compelling mathematical advantages to using ternary (or even better, balanced ternary). Knuth calls balanced ternary "Perhaps the prettiest number system of all" in TAOCP, and ternary is in some sense the most efficient base to use; the article "Third Base" by Brian Hayes (Sci. Am., 2001) gives a nice overview. It's completely inappropriate for binary hardware, of course. And base-16 floating-point is still used in current IBM hardware, but I don't know whether that's purely for historical/backwards-compatibility reasons, or because it's faster for the FPU. As to making decimal support arbitrary radix, though: I don't see that happening any time soon. The amount of work would be enormous, for questionable gain. -- Mark

On Thu, Feb 8, 2018 at 8:49 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
The radix method always returns 10, because decimal.Decimal always operates in base 10. Are you looking for a way to change the way arithmetic is done, or are you looking for a way to construct a Decimal from a string of digits and an arbitrary base (the way int("...", x) does)? ChrisA

On Wed, Feb 07, 2018 at 10:08:50PM +0000, Neil Girdhar wrote:
Oh, and to answer your specific question, I want to change the way arithmetic is done. I want it to be done in a different radix.
Why? There are clear advantages to floating point arithmetic done in base 2 (speed, minimum possible rounding error, least amount of wobble), and a different advantage to floating point done in base 10 (matches exactly the standard decimal notation used by humans with no conversion error), Outside of those two bases, arithmetic done in any other base is going to combine the worst of both: - slower; - larger errors when converting from decimal numbers (in general); - larger rounding errors; - larger wobble; with no corresponding advantages unless your data is coming to you in arbitrary bases. Doing floating point arithmetic in decimal is already slower and less accurate than doing it in binary. I'd like to hear more about your use-case for doing it in base 19 or base 7, say, but I would have to guess that it is likely to be such a niche use-case that this functionality doesn't belong in the standard library. -- Steve

On Wed, Feb 7, 2018 at 5:52 PM Steven D'Aprano <steve@pearwood.info> wrote:
I don't see why it would have any of those problems. Base 10 isn't special in any way.
with no corresponding advantages unless your data is coming to you in arbitrary bases.
Right, I was playing with this problem ( https://brilliant.org/weekly-problems/2017-10-02/advanced/?problem=no-comput...) and wanted to work in base 2. I realize it's niche, but it's not exactly a significant change to the interface even if it's a big change to the implementation.

On Thu, Feb 8, 2018 at 10:08 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
Base 10 *is* special, because it corresponds to what humans use. In binary floating-point, you get weird results (by human standards) like 0.1+0.2 not being 0.3; that doesn't happen in decimal. There is no error when converting from a string of decimal digits to a decimal.Decimal, so presumably to avoid error, you'd have to work with digits in the same base. The rounding errors and wobble are by comparison with binary; you get the same problems in any other base, without the benefit of human-friendly behaviour.
You should be able to use the native float type for binary floating-point. But the whole point of that challenge is that you shouldn't need a computer. ChrisA

On Wed, Feb 7, 2018 at 6:36 PM Chris Angelico <rosuav@gmail.com> wrote:
I see your list was about converting to and from base 10. That wasn't really intended in my proposal. I meant wholly working in another base. In that sense, 10 isn't particularly "fast, error-free, better at rounding, etc."
Yeah, I know, but I wanted to play with it. Anyway, native floats don't help.

On Thu, Feb 8, 2018 at 10:49 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
Sounds like performance isn't going to be a big problem, then. You can manage with a non-optimized and naive implementation. So here's a couple of things to try: 1) Check out PyPI and see if something like what you want exists. 2) Poke around in the source code for the Decimal class (ignore the C module and use the pure Python one) and see if you can hack on it. It'd then be off-topic for python-ideas, but it'd be an awesome topic to discuss on python-list. Exploration is great fun, and Python's a great language to explore with. ChrisA

On Wed, Feb 7, 2018 at 3:49 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
I maintain gmpy2 and it might do what you want (arbitrary precision radix-2 arithmetic and easy access to the bits).
Historical memory - I once wrote a radix-6 fixed point library to explore an extension of the 3n+1 problem to rational numbers. It was written in Turbo Pascal and ran for days on a 286/287 PC. casevh

On Wed, Feb 07, 2018 at 11:49:27PM +0000, Neil Girdhar wrote:
I never said it was. Base 2 floats is the one that is faster and better at rounding than any other base. No finite precision floating point arithmetic can be error free, but all else being equal, base 2 minimises the errors you get. The advantage of base 10 is that it matches the standard base 10 numbers we write. Within the boundaries of the available precision, if you can write it in decimal, you can represent it in a decimal float. That isn't necessarily true of decimal -> binary floats.
Why not? If you can write the float in binary exactly, you can write it in hex, and use float.fromhex() to convert it exactly (provided it fits into the 64-bit floats Python uses). -- Steve

On 2/7/2018 6:08 PM, Neil Girdhar wrote:
This is the specialness of base 10
I don't see why it would have any of those problems.
Any base other than 2 has decreased speed (on a binary computer) and increased computational rounding errors and wobble.
Base 10 isn't special in any way.
Except as noted above and the fact that computation with binary coded decimal goes back to the early days of electronic computation.
with no corresponding advantages unless your data is coming to you in arbitrary bases.
In cpython, decimal uses _cdecimal for speed. I suspect that 10 is not only explicitly hard-coded as the base but implicitly hard-coded by using algorithm tricks that depend on and only work when the base is 10. -- Terry Jan Reedy

On Wed, Feb 7, 2018 at 10:50 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Well, there are a couple of other bases that are potentially interesting. There are some compelling mathematical advantages to using ternary (or even better, balanced ternary). Knuth calls balanced ternary "Perhaps the prettiest number system of all" in TAOCP, and ternary is in some sense the most efficient base to use; the article "Third Base" by Brian Hayes (Sci. Am., 2001) gives a nice overview. It's completely inappropriate for binary hardware, of course. And base-16 floating-point is still used in current IBM hardware, but I don't know whether that's purely for historical/backwards-compatibility reasons, or because it's faster for the FPU. As to making decimal support arbitrary radix, though: I don't see that happening any time soon. The amount of work would be enormous, for questionable gain. -- Mark
participants (7)
-
Case Van Horsen
-
Chris Angelico
-
Greg Ewing
-
Mark Dickinson
-
Neil Girdhar
-
Steven D'Aprano
-
Terry Reedy