Giving Decimal a global context was a mistake?
Broken off from the "Custom literals, a la C++" thread: Greg Ewing wrote:
Personally I think giving Decimal a global context was a mistake, [...] so arguing that "it's no worse than Decimal" isn't going to do much to convince me. :-)
I'd be curious to know what alternatives you see. When a user writes `x + y` with both `x` and `y` instances of `decimal.Decimal`, the decimal module needs to know what precision to compute the result to (as well as what rounding mode to use, etc.). Absent a thread-local context or task-local context, where would that precision information come from?
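For reference, this is how today's design answers that question: the arithmetic operators consult the current (thread-local) context from the stdlib decimal module:

    >>> from decimal import Decimal, getcontext
    >>> getcontext().prec = 6       # the current thread's context
    >>> Decimal(1) / Decimal(7)     # '/' consults that context
    Decimal('0.142857')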
On Wed, 6 Apr 2022 at 09:59, Mark Dickinson via Python-ideas <python-ideas@python.org> wrote:
Broken off from the "Custom literals, a la C++" thread:
Greg Ewing wrote:
Personally I think giving Decimal a global context was a mistake, [...] so arguing that "it's no worse than Decimal" isn't going to do much to convince me. :-)
I'd be curious to know what alternatives you see. When a user writes `x + y` with both `x` and `y` instances of `decimal.Decimal`, the decimal module needs to know what precision to compute the result to (as well as what rounding mode to use, etc.). Absent a thread-local context or task-local context, where would that precision information come from?
One possibility is to attach the context information to the instances so it's like:

    ctx = context(precision=10, ...)
    x = ctx.new('1.2')
    y = ctx.new('2.3')
    z = x / y  # rounds to 10 digits

Of course there are many complications here when you think about mixing numbers that have different contexts, so you'd need to decide how to handle that. One possibility would be simply to disallow mixing instances with different contexts and require explicit conversions.

Realistically do many users want to use many different contexts and regularly switch between them? I expect the common use case is wanting to do everything in a particular context that just isn't the default one.

-- Oscar
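For concreteness, here is a minimal sketch of what instance-bound contexts along these lines could look like, built on the stdlib decimal module. The class names, the new() method, and the disallow-mixing rule are all hypothetical, invented just for this illustration:

    from decimal import Context, Decimal

    class InstanceContext:
        """Hypothetical context whose numbers remember it."""
        def __init__(self, precision=28):
            self._ctx = Context(prec=precision)

        def new(self, value):
            return CtxDecimal(Decimal(value), self)

    class CtxDecimal:
        """Hypothetical Decimal wrapper carrying its creating context."""
        def __init__(self, d, owner):
            self.d, self.owner = d, owner

        def __truediv__(self, other):
            if other.owner is not self.owner:
                # the "disallow mixing" rule: explicit conversion required
                raise TypeError("operands come from different contexts")
            # the instances' own context decides precision and rounding:
            return CtxDecimal(self.owner._ctx.divide(self.d, other.d), self.owner)

        def __repr__(self):
            return f"CtxDecimal({self.d!r})"

    ctx = InstanceContext(precision=10)
    x = ctx.new('1.2')
    y = ctx.new('2.3')
    print(x / y)   # CtxDecimal(Decimal('0.5217391304')) -- 10 digits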
On Wed, Apr 6, 2022 at 5:11 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:

> > Personally I think giving Decimal a global context was a mistake, [...]

I agree here -- though honestly, I think the benefits of Decimal are often misrepresented. Interestingly, the variable precision is the real bonus, not the decimal representation. Anyhoo ...

> > I'd be curious to know what alternatives you see. When a user writes `x + y` with both `x` and `y` instances of `decimal.Decimal`, the decimal module needs to know what precision to compute the result to

What do other variable precision systems do? With a quick read, it looks like gmpy2 does something similar :-(

> > (as well as what rounding mode to use, etc.). Absent a thread-local context or task-local context, where would that precision information come from?

Why is that absent? -- it seems a task-local and/or thread-local context is exactly what should be done.

> One possibility is to attach the context information to the instances so it's like:

That seems the obvious thing to me -- a lot more like we already have with mixing integers and floats, and/or mixing different precision floats in other languages (and numpy). Different, as this wouldn't be type based, but if clear rules are established, then it would be do-able. At least it would probably fit the maxim: the easy stuff should be easy, the hard stuff should be possible.

Perhaps even something as simple as "Preserve the precision of the highest precision operand" would go a long way.

> Realistically do many users want to use many different contexts and regularly switch between them? I expect the common use case is wanting to do everything in a particular context that just isn't the default one.

I don't know that that's true in the least -- sure, for a basic script, absolutely, but Python has become a large ecosystem of third party packages -- people make a LOT of large systems involving many complex third party packages -- the builder of the system may not even know a package is using Decimals -- let alone two different third party packages using them in very different ways -- it's literally impossible for the developer of package A to know how package B works or that someone might be using both.

Then put all this behind a multithreading web server, and you have a recipe for chaos.

-CHB

--
Christopher Barker, PhD (Chris)
On Thu, 7 Apr 2022 at 02:17, Christopher Barker <pythonchb@gmail.com> wrote:
On Wed, Apr 6, 2022 at 5:11 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I'd be curious to know what alternatives you see. When a user writes `x + y` with both `x` and `y` instances of `decimal.Decimal`, the decimal module needs to know what precision to compute the result to (as well as what rounding mode to use, etc.). Absent a thread-local context or task-local context, where would that precision information come from?
Why is that absent? -- it seems a task local and/or thread local context is exactly what should be done.
https://docs.python.org/3/library/decimal.html#decimal.localcontext

The decimal module has a global default context, and per-thread contexts which can be set permanently or with a context manager. It never offers per-module or per-function or any other scope of configuration; at any given moment, there is precisely one context for code running in any particular thread.
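For example, with the stdlib decimal module:

    >>> import decimal
    >>> with decimal.localcontext() as ctx:
    ...     ctx.prec = 3
    ...     print(decimal.Decimal(2).sqrt())
    ...
    1.41
    >>> decimal.Decimal(2).sqrt()   # back to the default 28 digits
    Decimal('1.414213562373095048801688724')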
One possibility is to attach the context information to the instances so it's like:
That seems the obvious thing to me -- a lot more like we already have with mixing integers and floats, and/or mixing different precision floats in other languages (and numpy).
Not so obvious to me, as it would require inordinate amounts of fiddling around when you want to dynamically adjust your precision. You'd have to reassign every Decimal instance to have the new settings.

Also: what happens when there's a conflict? Which one wins? Let's say you do "a + b" where the two were created with different contexts - do you use the lower precision? the higher precision? What about rounding settings? Do you need a meta-config that explains how to resolve conflicts between configs? Where would that be stored?

Maybe the current way seems more obvious to me since I come from a background of working in REXX, which also had global configuration of these sorts of things (eg "NUMERIC DIGITS 1234" to set the precision, "NUMERIC FUZZ 2" to make numbers compare equal if close). It just seems like the simplest and most convenient way to do things.
Perhaps even something as simple as "Preserve the precision of the highest precision operand" would go a long way.
And some people will loudly dispute that, wanting to avoid false precision. "Adopt the precision of the lowest precision operand" is, for many purposes, much more sane. You'll never satisfy everyone.
Realistically do many users want to use many different contexts and regularly switch between them? I expect the common use case is wanting to do everything in a particular context that just isn't the default one.
I don't know that that's true in the least -- sure, for a basic script, absolutely, but Python has become a large ecosystem of third party packages -- people make a LOT of large systems involving many complex third party packages -- the builder of the system may not even know a package is using Decimals -- let alone two different third party packages using them in very different ways -- it's literally impossible for the developer of package A to know how package B works or that someone might be using both.
Indeed. But I don't hear people complaining that they need to have per-module Decimal contexts, possibly since it's never actually a module-by-module consideration.

The one thing that threaded contexts don't handle is asyncio, and I haven't checked this, but I believe that "with decimal.localcontext() as ctx:" uses an asyncio-aware definition of "thread-local" that actually allows multiple tasks to have independent contexts. That would mean that this sort of thing will work sanely:

    async def task1():
        with decimal.localcontext() as ctx:
            ctx.prec = 100
            await something()
            a = b + c

    async def task2():
        with decimal.localcontext() as ctx:
            ctx.prec = 4
            await somethingelse()
            a = b + c

and regardless of exactly what each task does, there's a guarantee that code executed inside the 'with' blocks has the appropriate context. And that's true even if parts of it are imported from other modules. I've never used asyncio + decimal.localcontext, so I can't say further on that.
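A quick way to check that belief: the sketch below should print two results at different precisions if localcontext is task-aware. (This assumes the C decimal implementation stores the current context in a PEP 567 context variable, which I understand to be the case from Python 3.7 onward; worth verifying.)

    import asyncio
    import decimal

    async def compute(prec):
        with decimal.localcontext() as ctx:
            ctx.prec = prec
            await asyncio.sleep(0)   # yield so the tasks interleave
            return decimal.Decimal(1) / decimal.Decimal(3)

    async def main():
        print(await asyncio.gather(compute(4), compute(10)))

    asyncio.run(main())
    # expected: [Decimal('0.3333'), Decimal('0.3333333333')]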
Then put all this behind a multithreading web server, and you have a recipe for chaos.
That's handled, since Decimal contexts are set on a per-thread basis. ChrisA
On Wed, 6 Apr 2022 at 17:48, Chris Angelico <rosuav@gmail.com> wrote:
On Thu, 7 Apr 2022 at 02:17, Christopher Barker <pythonchb@gmail.com> wrote:
On Wed, Apr 6, 2022 at 5:11 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I'd be curious to know what alternatives you see. When a user writes `x + y` with both `x` and `y` instances of `decimal.Decimal`, the decimal module needs to know what precision to compute the result to (as well as what rounding mode to use, etc.). Absent a thread-local context or task-local context, where would that precision information come from?
One possibility is to attach the context information to the instances so it's like:
That seems the obvious thing to me -- a lot more like we already have with mixing integers and floats, and/or mixing different precision floats in other languages (and numpy).
Not so obvious to me, as it would require inordinate amounts of fiddling around when you want to dynamically adjust your precision. You'd have to reassign every Decimal instance to have the new settings.
Why do you need to "dynamically adjust your precision"?

Note that round/quantize give you the control that is most likely needed in decimal calculations without changing the context: rounding to fixed point, including with precise control of the rounding mode.

If you're otherwise using the decimal module in place of a scientific multiprecision library then I think it's not really the right tool for the job. It's unfortunate that the docs suggest making functions like sin and cos: there is no good reason to use decimal over binary for transcendental or irrational numbers or functions.
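For instance, quantize rounds to a fixed number of decimal places with an explicit rounding mode, regardless of the context precision:

    >>> from decimal import Decimal, ROUND_HALF_UP
    >>> Decimal('2.675').quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
    Decimal('2.68')
    >>> round(2.675, 2)   # the binary float rounds the other way
    2.67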
Also: what happens when there's a conflict? Which one wins? Let's say you do "a + b" where the two were created with different contexts - do you use the lower precision? the higher precision? What about rounding settings?
I suggested simply disallowing this. If I really care about having each operation use the right context then I'll be happy to see an error message if I mess up like this by forgetting to convert in the right place.

It is possible to be explicit about which context you want to use by using context methods like ctx.divide. From a quick skim of the decimal docs I don't see a single example showing how to use these rather than modifying/replacing the global contexts, though. Conceptually, using the context's divide method is appropriate, since it is the operation (divide) that the context affects:
    >>> from decimal import Context
    >>> ctx = Context(prec=2)
    >>> ctx.divide(1, 3)
    Decimal('0.33')
Realistically do many users want to use many different contexts and regularly switch between them? I expect the common use case is wanting to do everything in a particular context that just isn't the default one.
I don't know that that's true in the least -- sure, for a basic script, absolutely, but Python has become a large ecosystem of third party packages -- people make a LOT of large systems involving many complex third party packages -- the builder of the system may not even know a package is using Decimals -- let alone two different third party packages using them in very different ways -- it's literally impossible for the developer of package A to know how package B works or that someone might be using both.
Indeed. But I don't hear people complaining that they need to have per-module Decimal contexts, possibly since it's never actually a module-by-module consideration.
If packages A and B are using decimal module contexts without their users knowing then I should hope that each is very careful about messing with the global contexts to avoid interfering with each other as well as anything that the user does. If I wrote a library that does this I probably would use the context methods like ctx.divide(a, b) just to be sure about things. -- Oscar
On Thu, 7 Apr 2022 at 05:37, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Wed, 6 Apr 2022 at 17:48, Chris Angelico <rosuav@gmail.com> wrote:
On Thu, 7 Apr 2022 at 02:17, Christopher Barker <pythonchb@gmail.com> wrote:
On Wed, Apr 6, 2022 at 5:11 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I'd be curious to know what alternatives you see. When a user writes `x + y` with both `x` and `y` instances of `decimal.Decimal`, the decimal module needs to know what precision to compute the result to (as well as what rounding mode to use, etc.). Absent a thread-local context or task-local context, where would that precision information come from?
One possibility is to attach the context information to the instances so it's like:
That seems the obvious thing to me -- a lot more like we already have with mixing integers and floats, and/or mixing different precision floats in other languages (and numpy).
Not so obvious to me, as it would require inordinate amounts of fiddling around when you want to dynamically adjust your precision. You'd have to reassign every Decimal instance to have the new settings.
Why do you need to "dynamically adjust your precision"?
Some algorithms work just fine when you start with X digits of precision and then increase that to X+Y digits later on (simple example: Newton's method for calculating square roots).
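A sketch of that pattern using the stdlib decimal module (dec_sqrt is a made-up helper name, not an existing API):

    from decimal import Decimal, localcontext

    def dec_sqrt(x, digits):
        # Newton's method: r -> (r + x/r) / 2 converges to sqrt(x)
        with localcontext() as ctx:
            ctx.prec = digits + 5               # a few guard digits
            x = Decimal(x)
            r = x if x > 1 else Decimal(1)      # any positive start works
            for _ in range(1000):
                new = (r + x / r) / 2
                if new == r:
                    break
                r = new
        with localcontext() as ctx:
            ctx.prec = digits
            return +r   # unary plus rounds to the active context

    print(dec_sqrt(2, 50))
    # 1.4142135623730950488016887242096980785696718753769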
Note that round/quantize give you the control that is most likely needed in decimal calculations without changing the context: rounding to fixed point including with precise control of rounding mode.
If you're otherwise using the decimal module in place of a scientific multiprecision library then I think it's not really the right tool for the job. It's unfortunate that the docs suggest making functions like sin and cos: there is no good reason to use decimal over binary for transcendental or irrational numbers or functions.
Suppose you want to teach people how sin and cos are calculated. What would YOU recommend? Python already comes with an arbitrary-precision numeric data type. Do we need to use something else?
Also: what happens when there's a conflict? Which one wins? Let's say you do "a + b" where the two were created with different contexts - do you use the lower precision? the higher precision? What about rounding settings?
I suggested simply disallowing this. If I really care about having each operation use the right context then I'll be happy to see an error message if I mess up like this by forgetting to convert in the right place.
That would make the above exercise extremely annoying. I'm sure you'd like it for your use case, but I would hate it for mine, and if it's disallowed at the language level, that's about as global a choice as it can ever be.
It is possible to be explicit about which context you want to use by using context methods like ctx.divide. From a quick skim of the decimal docs I don't see a single example showing how to use these rather than modifying/replacing the global contexts, though. Conceptually, using the context's divide method is appropriate, since it is the operation (divide) that the context affects:
    >>> from decimal import Context
    >>> ctx = Context(prec=2)
    >>> ctx.divide(1, 3)
    Decimal('0.33')
So, not only does your proposal make things harder for some use cases, it also sacrifices all use of operators? What's the advantage, here?
Realistically do many users want to use many different contexts and regularly switch between them? I expect the common use case is wanting to do everything in a particular context that just isn't the default one.
I don't know that that's true in the least -- sure, for a basic script, absolutely, but Python has become a large ecosystem of third party packages -- people make a LOT of large systems involving many complex third party packages -- the builder of the system may not even know a package is using Decimals -- let alone two different third party packages using them in very different ways -- it's literally impossible for the developer of package A to know how package B works or that someone might be using both.
Indeed. But I don't hear people complaining that they need to have per-module Decimal contexts, possibly since it's never actually a module-by-module consideration.
If packages A and B are using decimal module contexts without their users knowing then I should hope that each is very careful about messing with the global contexts to avoid interfering with each other as well as anything that the user does. If I wrote a library that does this I probably would use the context methods like ctx.divide(a, b) just to be sure about things.
Local contexts exist for a reason. They just aren't *module* contexts, because that doesn't actually help anyone. ChrisA
On Wed, 6 Apr 2022 at 21:47, Chris Angelico <rosuav@gmail.com> wrote:
On Thu, 7 Apr 2022 at 05:37, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Not so obvious to me, as it would require inordinate amounts of fiddling around when you want to dynamically adjust your precision. You'd have to reassign every Decimal instance to have the new settings.
Why do you need to "dynamically adjust your precision"?
Some algorithms work just fine when you start with X digits of precision and then increase that to X+Y digits later on (simple example: Newton's method for calculating square roots).
This is why I followed up by saying that the decimal module should not really be used in place of a scientific multiprecision library. Calculating irrational square roots is precisely the kind of thing that decimal floating point is not needed for. Do you know of a real example where this pattern is used in a situation that actually needs decimal (rather than binary) floating point?
If you're otherwise using the decimal module in place of a scientific multiprecision library then I think it's not really the right tool for the job. It's unfortunate that the docs suggest making functions like sin and cos: there is no good reason to use decimal over binary for transcendental or irrational numbers or functions.
Suppose you want to teach people how sin and cos are calculated. What would YOU recommend? Python already comes with an arbitrary-precision numeric data type. Do we need to use something else?
I would teach this using ordinary floats in the first instance, since that's how cos is calculated most of the time, e.g. that's what the math module does (and numpy etc.):

    In [31]: x = 0.5

    In [32]: sum((-1)**(n//2)*x**n/factorial(n) for n in range(0, 20, 2))
    Out[32]: 0.8775825618903728

    In [33]: cos(x)
    Out[33]: 0.8775825618903728

Certainly I have used the decimal module to demonstrate precisely the example you gave above (computing sqrt(2) with Newton's method) to show the idea that we can compute arbitrarily accurate results. That's not really what the decimal module is for, though, and I could have just as easily used something else (gmpy2/mpmath/sympy etc.). The only advantage of the decimal module in that situation is that it happens to be in the stdlib, so I can demonstrate the code and hope that others can easily reproduce it.

-- Oscar
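For what it's worth, the same series carries over to Decimal almost verbatim when someone does ask for more digits than a float can hold (dec_cos is a made-up name; this is a sketch, not production code):

    from decimal import Decimal, localcontext

    def dec_cos(x, digits):
        # cos(x) = sum of (-1)^k x^(2k) / (2k)!, summed until terms vanish
        with localcontext() as ctx:
            ctx.prec = digits + 5
            x = Decimal(x)
            term, total, k = Decimal(1), Decimal(1), 0
            eps = Decimal(10) ** -(digits + 5)
            while abs(term) > eps:
                k += 1
                term *= -x * x / ((2 * k - 1) * (2 * k))
                total += term
        with localcontext() as ctx:
            ctx.prec = digits
            return +total   # round to the requested precision

    print(dec_cos('0.5', 30))   # 0.877582561890372716116281582604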
On Thu, 7 Apr 2022 at 07:22, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Wed, 6 Apr 2022 at 21:47, Chris Angelico <rosuav@gmail.com> wrote:
On Thu, 7 Apr 2022 at 05:37, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Not so obvious to me, as it would require inordinate amounts of fiddling around when you want to dynamically adjust your precision. You'd have to reassign every Decimal instance to have the new settings.
Why do you need to "dynamically adjust your precision"?
Some algorithms work just fine when you start with X digits of precision and then increase that to X+Y digits later on (simple example: Newton's method for calculating square roots).
This is why I followed up by saying that the decimal module should not really be used in place of a scientific multiprecision library. Calculating irrational square roots is precisely the kind of thing that decimal floating point is not needed for.
Do you know of a real example where this pattern is used in a situation that actually needs decimal (rather than binary) floating point?
Teaching IS a use-case with Python. Does it actually need binary rather than decimal floating point? Why should I fetch a third-party library rather than use what exists?
Suppose you want to teach people how sin and cos are calculated. What would YOU recommend? Python already comes with an arbitrary-precision numeric data type. Do we need to use something else?
I would teach this using ordinary floats in the first instance since that's how cos is calculated most of the time e.g. that's what the math module does (and numpy etc):
In [31]: x = 0.5
In [32]: sum((-1)**(n//2)*x**n/factorial(n) for n in range(0, 20, 2))
Out[32]: 0.8775825618903728
In [33]: cos(x)
Out[33]: 0.8775825618903728
That's fair, and always a good place to start, but if you want more precision than that, where do you go?
Certainly I have used the decimal module to demonstrate precisely the example you gave above (computing sqrt(2) with Newton's method) to show the idea that we can compute arbitrarily accurate results. That's not really what the decimal module is for though and I could have just as easily used something else (gmpy2/mpmath/sympy etc). The only advantage of the decimal module in that situation is just that it happens to be in the stdlib so I can demonstrate the code and hope that others can easily reproduce it.
Exactly. It's very easy to do demos and exploration that require nothing more than the core language and standard library. No need to walk people through the use of pip, running into platform differences, etc. The Decimal module is great for that. What do the specialized libraries offer that Decimal doesn't? Is it really worth all that hassle for the sake of teaching something that, in production, would just be done with floats and the math module anyway? ChrisA
On 2022-04-06 14:28, Chris Angelico wrote:
What do the specialized libraries offer that Decimal doesn't? Is it really worth all that hassle for the sake of teaching something that, in production, would just be done with floats and the math module anyway?
If the decimal module did not already have a global context, would you suggest adding it just in order to support this sort of teaching use-case?

It seems to me that the general claim about global context being a mistake is that it can create pitfalls. In that sense, the global context already is a "hassle" that is imposed on people. I don't see that the potentially-offsetting benefit of (as you say) teaching something that would actually just be done with floats is really that compelling. It's nice, sure, but I also don't see any problem with saying "sorry, you'll have to install a third-party module if you want to teach some mathematical background about how some operations work, because we only put into the stdlib what was necessary to actually do those operations, not teach and demo their underpinnings".

--
Brendan Barnwell

"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Thu, 7 Apr 2022 at 07:41, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2022-04-06 14:28, Chris Angelico wrote:
What do the specialized libraries offer that Decimal doesn't? Is it really worth all that hassle for the sake of teaching something that, in production, would just be done with floats and the math module anyway?
If the decimal module did not already have a global context, would you suggest adding it just in order to support this sort of teaching use-case?
If it didn't have any sort of precision changing, I wouldn't be using it, as it wouldn't offer anything that floats don't (in terms of teaching - obviously decimal rather than binary floats have specific use-cases, but mine isn't one of them). If the context feature didn't exist, I would assume that precision is configured globally in some way, and that would be absolutely fine with me.
It seems to me that the general claim about global context being a mistake is that it can create pitfalls. In that sense, the global context already is a "hassle" that is imposed on people. I don't see that the potentially-offsetting benefit of (as you say) teaching something that would actually just be done with floats is really that compelling. It's nice, sure, but I also don't see any problem with saying "sorry, you'll have to install a third-party module if you want to teach some mathematical background about how some operations work, because we only put into the stdlib what was necessary to actually do those operations, not teach and demo their underpinnings".
I don't understand. What do you mean by pitfalls, and how else would you do variable-precision in the standard library? The most obvious way would simply be "decimal.set_precision(500)", which would, obviously, be completely global. The way it currently is, "decimal.getcontext().prec = 500", isn't very much harder, and permits non-global contexts; but either way, a global context is a good idea for many applications.

Yes, there are cases where you need other contexts, but eliminating the global default context creates a HUGE set of hassles. That's why, for instance, the random module has a set of default functions which come from a single global Random object - and the vast majority of applications don't need to create dedicated Random objects, because the global is the right choice. The tools are there for when you need more flexibility, but when you don't need them, you don't have to worry about them.

Practicality beats purity.

ChrisA
On 7/04/22 9:53 am, Chris Angelico wrote:
how else would you do variable-precision in the standard library?
Maybe the mistake was in thinking that we need variable precision at all.

If the goal of Decimal was to provide arithmetic that "works like your calculator", well, most calculators have a fixed precision of 10 or so digits, which seems to be fine for most things.

So why *do* we have variable precision in Decimal?

-- Greg
On Thu, 7 Apr 2022 at 16:03, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 7/04/22 9:53 am, Chris Angelico wrote:
how else would you do variable-precision in the standard library?
Maybe the mistake was in thinking that we need variable precision at all.
If the goal of Decimal was to provide arithmetic that "works like your calculator", well, most calculators have a fixed precision of 10 or so digits, which seems to be fine for most things.
So why *do* we have variable precision in Decimal?
Because it's incredibly useful to do highly precise calculations, but not very helpful to make them arbitrarily expensive, so it's good to be able to choose the tradeoff between computational cost and numeric accuracy. ChrisA
On Wed, Apr 6, 2022 at 11:03 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Maybe the mistake was in thinking that we need variable precision at all.
I actually have almost the opposite position -- variable precision is the most useful part of the Decimal type ;-)

Warning: rant ahead -- skip to the end for the relevant bit.

I'm still completely confused as to why folks think floating point decimal is any better than floating point binary: the only explanation is that "computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school". That is to say, people "expect" to be able to exactly represent 1/10, but don't expect to be able to exactly represent 1/3, or, indeed, any irrational number. I'm pretty sure (I haven't thought out every edge case) that numbers using binary internally that always rounded a little bit on display and comparison would behave as people "expect" just as well.

Note the docs again: "End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point." -- so it's really about the display, not the precision or accuracy of the result.

NOTE: I'm not advocating a change, but while I understand why:

    In [55]: repr(1.1 + 2.2)
    Out[55]: '3.3000000000000003'

I don't get why:

    In [54]: str(1.1 + 2.2)
    Out[54]: '3.3000000000000003'

Isn't the point of __str__ to provide a more human-readable, but perhaps not reproducible, representation? Anyway...

And it's really a mistake to think that Decimal is inherently any better suited to money. From the docs:

"""
The exactness carries over into arithmetic. In decimal floating point, 0.1 + 0.1 + 0.1 - 0.3 is exactly equal to zero. In binary floating point, the result is 5.5511151231257827e-017. While near to zero, the differences prevent reliable equality testing and differences can accumulate. For this reason, decimal is preferred in accounting applications which have strict equality invariants.
"""

I'm no accountant, but this strikes me as quite dangerous -- sure decimal fractions are exact, but who says you are only doing decimal arithmetic? Calculating interest, inflation, who knows what could easily introduce non-exactly-representable-in-decimal numbers. And do accounting systems really use floating point decimal dollars, rather than, say, fixed point or integer cents? I also notice that in the financial world, there's a lot of use of binary fractions: interest rates tend to be in eighths of a percent, not tenths of a percent, for example.

So what does Decimal provide? Two things that you can't do with the built-in (hardware) float:

- Variable precision
- Control of rounding

Which does make it more suitable for accounting and other applications, but not because the internal implementation is decimal rather than binary.

BTW: it seems a "round the least significant digit on comparison" mode would be handy.

End rant -- not really that relevant anyway.

The relevant bit -- it seems that someone could write an accounting module that utilized Decimal to precisely follow a particular set of accounting rules (it's probably been done). But in that case, you'd want to be darn sure that the specific context was used in that package -- not any global setting that a user of the package, or some other package, might mess with. So what's the point of a global context? Isn't it an accident waiting to happen?

-CHB

--
Christopher Barker, PhD (Chris)
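The docs' claim quoted above at least reproduces directly, for whatever that's worth:

    >>> from decimal import Decimal
    >>> 0.1 + 0.1 + 0.1 - 0.3
    5.551115123125783e-17
    >>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')
    Decimal('0.0')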
On Fri, 8 Apr 2022 at 03:16, Christopher Barker <pythonchb@gmail.com> wrote:
I'm no accountant, but this strikes me as quite dangerous -- sure decimal fractions are exact, but who says you are only doing decimal arithmetic? Calculating interest, inflation, who knows what could easily introduce non-exactly-representable-in-decimal numbers. And do accounting systems really use floating point decimal dollars, rather than, say, fixed point or integer cents? I also notice that in the financial world, there's a lot of use of binary fractions: interest rates tend to be in eighths of a percent, not tenths of a percent, for example.
Good question; but I do know that one limitation of the classic "use integer cents for money" solution is that you can't do forex that way. Right now, for instance, one AUD is worth 0.74770 USD (although while I typed that, it changed to 0.74773 USD). Would have to ask someone in high finance about that; do they think in terms of fixed point still (ie there's always precisely 1000 subdivisions of a cent), or do they think in terms of individual transactions ("here, I'll sell you 29_852_359 AUD, you give me 22_320_610 USD") and what we call exchange rates is all just rounded anyway? Decimal floating-point certainly has its uses, but it isn't the perfect solution to all things financial. But then, NOTHING is the perfect solution to all things financial, as evidenced by the state of the world we're in.... ChrisA
[Christopher Barker <pythonchb@gmail.com>]
So what does Decimal provide? Two things that you can't do with the built-in (hardware) float:
- Variable precision
- Control of rounding
And a third: precisely defined results in base 10. You're thinking like an engineer, not an accountant ;-) Politicians don't know anything about computer arithmetic, and mountains of regulations predate widespread computer use.

Because of the need for businesses to comply with mountains of regulations written by people for whom working out small decimal examples by hand was peak knowledge, COBOL _required_ decimal arithmetic (although in fixed point). COBOL access to binary floats was added later with the obscure (to COBOL programmers!) COMP-1 and COMP-2 types (4- and 8-byte binary floats).

And a fourth: a concept of significant trailing zeroes:
    >>> from decimal import Decimal as D
    >>> D("3.000")
    Decimal('3.000')
    >>> _ * D("5.4456")
    Decimal('16.3368000')
Absolutely true that decimal isn't "more accurate" than binary. It saves users from worlds of shallow surprises, mostly related to string representations. But it's not _really_ what "people expect". They're routinely surprised by, e.g., hand calculators too (which have historically used decimal arithmetic internally).

The bundled Windows calculator app is probably the most widely used numeric toy on Earth, and over the years they worked to make it as unsurprising as possible. While it supplies no access to this via the UI, which displays base 10 floats, under the covers the current version uses unbounded rationals. So, e.g., pick any integer n you like, and in that app (1/n)*n is always 1 exactly.

Python's predecessor (ABC) also converted float notation to rationals, but added HW floats later because chains of calculations with rationals too often make enormous memory demands. The Windows calculator generally doesn't have that problem, because it's not programmable. Do a few dozen calculations by hand, and it's unlikely rationals will "blow up".
... The relevant bit -- it seems that someone could write an accounting module that utilized Decimal to precisely follow a particular set of accounting rules (it's probably been done).
That's a _primary_ use case for decimal arithmetic. For example, the monstrously large federal US tax code allows rounding to dollars, and the law uniformly defines what "rounding" means with reference to decimal: drop amounts under 50 cents and increase amounts from 50 to 99 cents to the next dollar. Regulation enforcers lack flexibility, common sense, and humor ;-)

Banking apps more often require to-nearest/even rounding (which is where the name "banker's rounding" comes from).
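In decimal module terms, those two rounding regimes are just quantize with different rounding modes; for example:

    >>> from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN
    >>> # the tax-code rule: 50-99 cents always goes up
    >>> [Decimal(s).quantize(Decimal('1'), rounding=ROUND_HALF_UP)
    ...  for s in ('12.49', '12.50', '13.50')]
    [Decimal('12'), Decimal('13'), Decimal('14')]
    >>> # banker's rounding: exact halves go to the even neighbour
    >>> [Decimal(s).quantize(Decimal('1'), rounding=ROUND_HALF_EVEN)
    ...  for s in ('12.50', '13.50')]
    [Decimal('12'), Decimal('14')]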
I don't get why:
In [54]: str(1.1 + 2.2)
Out[54]: '3.3000000000000003'
Isn't the point of __str__ to provide a more human-readable, but perhaps not reproducable, representation?
That str() and repr() _used_ to give different results was another very widespread source of shallow surprises. You cannot "win" this game. Now, both deliver the shortest decimal string `s` such that float(s) exactly reproduces the original float.

Which spares users from a different class of shallow surprises: when they type in a float by hand, that's usually - by definition - the shortest string that can produce the resulting float. So it will display the same way they typed it. This is different from rounding to a fixed number of decimal places (which str() and repr() used to do).
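That round-trip property is easy to see from the earlier example:

    >>> 1.1 + 2.2
    3.3000000000000003
    >>> float('3.3000000000000003') == 1.1 + 2.2   # shortest repr round-trips
    True
    >>> float('3.3') == 1.1 + 2.2                  # a shorter string is a different float
    False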
But in that case, you'd want to be darn sure that the specific context was used in that package -- not any global setting that a user of the package, or some other package, might mess with?
Much ado about nothing ;-) Accounting and tax programmers are professionals too, and don't need to be saved from gross newbie mistakes. If some internal algorithm needs a specific decimal context, they'll write it from the start to force use of a local context.
There's theory and math, and then there's reality. In reality, some accounting systems use decimals with fixed precision for certain aspects and apply predefined rounding (usually defined in the contracts between the counterparties or in accounting/tax regulations), while others use IEEE 754 double precision floats. Rounding errors are dealt with by booking corrections where necessary.

As an example, it's possible that VAT regulations mandate doing the VAT calculation at the per-item level (including rounding at that level) and not at the summary level. This can result in significant differences when you have to deal with lots of small amounts. The VAT sum will diverge considerably from the VAT you'd get from using the sum of the net items as basis - but this is intended.

In high finance, I've never seen decimals being used, only floats. Excel is omnipresent, sets the standards and uses IEEE 754 floats as well (plus some black magic which sometimes helps, but often makes things worse): https://en.wikipedia.org/wiki/Numeric_precision_in_Microsoft_Excel

As a result, there's no one-fits-all answer to decimal vs. floats. It depends on your use and the context in which you have to apply math operations.

-- Marc-Andre Lemburg
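A toy illustration of that per-item vs. per-total divergence; the 19% rate and the amounts are invented for the example:

    from decimal import Decimal, ROUND_HALF_UP

    def to_cents(d):
        return d.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)

    items = [Decimal('0.05')] * 100   # 100 small net amounts
    rate = Decimal('0.19')

    per_item = sum(to_cents(net * rate) for net in items)   # round each item's VAT
    per_total = to_cents(sum(items) * rate)                 # round VAT on the sum
    print(per_item, per_total)   # 1.00 vs 0.95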
On Thu, Apr 7, 2022 at 2:47 PM Marc-Andre Lemburg <mal@egenix.com> wrote:
In high finance, I've never seen decimals being used, only floats. Excel is omnipresent, sets the standards and uses IEEE 754 floats as well (plus some black magic which sometimes helps, but often makes things worse):
In forex, instantaneous exchange rates are defined as a specific number of decimal digits in a currencyA/currencyB exchange rate (on a particular market). This is about US$6.6 trillion/day governed by these rules... FAR more than the combined size of ALL securities markets.

It was something like 2007 when the NYSE moved from prices in 1/32 penny to a fixed-length decimal representation.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
On 07.04.2022 20:55, David Mertz, Ph.D. wrote:
On Thu, Apr 7, 2022 at 2:47 PM Marc-Andre Lemburg <mal@egenix.com <mailto:mal@egenix.com>> wrote:
In high finance, I've never seen decimals being used, only floats. Excel is omnipresent, sets the standards and uses IEEE 754 floats as well (plus some black magic which sometimes helps, but often makes things worse):
In forex, instantaneous exchange rates are defined as a specific number of decimal digits in a currencyA/currencyB exchange rate (on a particular market).
This is about US$6.6 trillion/day governed by these rules... FAR more than the combined size of ALL securities markets.
It was something like 2007 when the NYSE moved from prices in 1/32 penny to a fixed-length decimal representation.
... and then you end up with problems such as these: https://www.wsj.com/articles/berkshire-hathaways-stock-price-is-too-much-for... (a good problem to have, BTW)

Seriously, the actual trades will still use fixed numbers in many cases (after all, each trade is a legal contract), but everything else leading up to the trades tends to use floats: models and other decision making tools, pricing, risk calculation, hedging, etc. etc. But that's just one application space. There are plenty of others where decimals are used.

Still, to get back to the original topic, in most cases a fixed high enough precision is usually enough to keep applications, users, authorities and regulators happy, so the concept of a global default works well. You can easily round decimals of this higher precision to whatever precision you need for a particular purpose or I/O, without changing the context precision.

-- Marc-Andre Lemburg
On 8/04/22 5:15 am, Christopher Barker wrote:
I actually have almost the opposite position -- variable precision is the most useful part of the Decimal type ;-)
So can you elaborate on how you use variable precision?
Note the docs again: "End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point." -- so it's really about the display, not the precision or accuracy of the result.
I don't think it's entirely about the display. It's also about things like sum([1/10] * 10) == 1 being False. This is where "human-friendly" display actually makes things worse, because repr() makes it *look* like 1/10 equals decimal 1.0, but it really doesn't.
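Concretely, for anyone who wants to see it:

    >>> sum([1/10] * 10) == 1
    False
    >>> sum([1/10] * 10)
    0.9999999999999999
    >>> from decimal import Decimal
    >>> sum([Decimal('0.1')] * 10) == 1
    True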
Calculating interest, inflation, who knows what could easily introduce non-exactly-representable-in-decimal numbers.
Yes, but when it comes to the point of e.g. adding some interest to an account, the amount of interest needs to be an exact multiple of 0.01 dollars. And if you add up all the interest transferred during your nightly run, it had better exactly equal the amount taken out of the account the interest is being paid from.

I do agree, however, that you don't need *floating point* for this, and fixed point would probably be better.

BTW, there's an accounting package I work with that uses binary floating point for money, and seems to get away with it. Probably because everything that goes into the database gets rounded to a defined number of decimal places, so errors don't get a chance to accumulate to the point where they would cause a problem. It does make me a bit nervous, though. :-)

-- Greg
[Greg Ewing <greg.ewing@canterbury.ac.nz>]
So can you elaborate on how you use variable precision?
Numeric programmers frequently want this. An extreme example is in function `collision_stats()` in your Python distribution's Lib/test/support/__init__.py. This needs to compute the variance of a distribution for which we only have a mathematical derivation. That's exact "in theory", but the expression suffers massive cancellation in native floating point. In context, it would deliver 100% gibberish results.

The comments note that it's easy to use rationals (fractions.Fraction) instead - but doing so in context would require multi-million bit integer arithmetic and be unbearably slow (voice of experience there - that's how I wrote it at first, and wondered why test_tuple never finished).

`decimal` to the rescue! It's fast and highly accurate now. It just sets the context to use a number of decimal digits twice the number of bits in the crucial input. There's still utterly massive cancellation, but it doesn't matter: there are enough "extra" decimal digits that losing mounds of the leading digits to cancellation doesn't matter to the result.

Base 10 is irrelevant to this - base 2 would work fine too. It's the ability to greatly boost precision that matters. BTW, like any sane isolated code that messes with the context, it does so locally:

    with decimal.localcontext() as ctx:
        bits = n.bit_length() * 2  # bits in n**2
        # At least that many bits will likely cancel out.
        # Use that many decimal digits instead.
        ctx.prec = max(bits, 30)

I know of no other way to get this done with sane effort in core Python; and, e.g., we can't reasonably require installing the `mpmath` extension just to run Python's standard test suite.
On Thu, Apr 7, 2022 at 2:51 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't think it's entirely about the display. It's also about things like sum([1/10] * 10) == 1 being False.
sure, but:

    In [18]: sum([Decimal(1) / Decimal(3)] * 3) == Decimal(1)
    Out[18]: False

Which is just as bad -- not as "confusing" to folks that are used to decimal, but I get a bad feeling when the docs talk about "exact" -- I'm not sure that the distinction that it's only *decimal* fractions that are exact comes through for more casual readers.

> This is where "human-friendly" display actually makes things worse, because repr() makes it *look* like 1/10 equals decimal 1.0, but it really doesn't.
well, yes, that is a good point.
Calculating interest, inflation, who knows what could easily introduce non-exactly-representable-in-decimal numbers.
Yes, but when it comes to the point of e.g. adding some interest to an account, the amount of interest needs to be an exact multiple of 0.01 dollars. And if you add up all the interest transferred during your nightly run, it had better exactly equal the amount taken out of the account the interest is being paid from.
Sure -- but that requires careful rounding and all that -- I don't think Decimal makes that any easier, frankly, particularly if you use cents as your units, rather than dollars. (though my earlier point that Decimal does allow you to control the rounding is an important feature for these types of applications)
BTW, there's an accounting package I work with that uses binary floating point for money, and seems to get away with it. Probably because everything that goes into the database gets rounded to a defined number of decimal places, so errors don't get a chance to accumulate to the point where they would cause a problem. It does make me a bit nervous, though. :-)
Well -- the authors of that package seem to have demonstrated my point -- you need to take care with rounding and limited precision regardless -- and once you do that, binary is just as good :-)

Are you sure you can trust it though? There's the old urban legend about the programmer for a bank writing the code so that the rounded pennies would go into his account -- it added up to a lot of money that nobody noticed was missing. The legend goes that he was only caught because the bank had a promotional event in which they drew a randomly selected account -- and found his. Are you SURE your accounting software is doing the right thing? ;-)

Also -- if it uses 64 bit floats, it'll have problems with trillions of dollars :-) -- might lose track of some cents there ....

-CHB

--
Christopher Barker, PhD (Chris)
On 8/04/22 7:03 pm, Christopher Barker wrote:
Are you SURE your accounting software is doing the right thing? ;-)
Well, I've only ever seen precision problems manifest themselves once, and that was when I wrote a script that used repeated multiplications by 10 as part of a process to convert a number into words. I had to put some rounding steps into that to make it work properly.

Other than that, if you were adding up about a billion monetary amounts in one go without any rounding, you might get a problem. I've never seen anyone do that, though. :-)
Also -- if it uses 64 bit floats, it'll have problems with trillions of dollars :-)
If your business is that big, you would not be using this particular accounting package! -- Greg
Because the module implements http://speleotrove.com/decimal/decarith.html

On Thu, Apr 7, 2022 at 2:03 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 7/04/22 9:53 am, Chris Angelico wrote:
how else would you do variable-precision in the standard library?
Maybe the mistake was in thinking that we need variable precision at all.
If the goal of Decimal was to provide arithmetic that "works like your calculator", well, most calculators have a fixed precision of 10 or so digits, which seems to be fine for most things.
So why *do* we have variable precision in Decimal?
-- Greg
On 8/04/22 6:01 am, David Mertz, Ph.D. wrote:
Because the module implements http://speleotrove.com/decimal/decarith.html <http://speleotrove.com/decimal/decarith.html>
Well, yes, but someone made the decision to implement that particular standard rather than one of the ones with fixed precision. I'm asking why *that* decision was made. -- Greg
[David Mertz]
Because the module implements http://speleotrove.com/decimal/decarith.html <http://speleotrove.com/decimal/decarith.html>
[Greg Ewing]
Well, yes, but someone made the decision to implement that particular standard rather than one of the ones with fixed precision. I'm asking why *that* decision was made.
Because Cowlishaw's spec isn't the only one in play. Essentially all of it was folded into a later revision of IEEE-754, which was generalized to specify base 10 floating point too. 754 utterly took over the world. Nobody wanted to bet against its successors.

My own FixedPoint.py was adopted (& extended by others in various ways) by people who actually wanted fixed point in Python, but it remained a niche audience. Cowlishaw's spec _intended_ to subsume fixed point applications too, but I think he oversells the extent to which it succeeds at that. Yup, it can be done - but you're _forever_ doing manual rounding steps to maintain the illusion that you're actually using a fixed point system. Nevertheless, that's straightforward enough. Just tedious.

And another reason: the idea that Python "supports 754" (even the original binary version) is just plain false. Not even close. 754 defines an elaborate numeric _environment_ (not just the results of arithmetic), and Python supports almost none of that. Supporting the elaborate signals and flags it requires is nearly impossible, because the C subsystems CPython builds on supply no portable ways to do so.

But the `decimal` module intends to be a faithful implementation of _everything_ the relevant standards require. The behaviors of its signals and flags adhere to the standards, regardless of platform. Which is feasible because its implementation relies on its own code - not on HW primitives or platform C extensions - to handle all of that.

A 754-conforming environment isn't really aimed at end users so much as at developers of mathematical libraries. All those "fiddly bits" can enormously simplify the lives of library developers, who know exactly what they're doing, and rejoice in having an environment that actively helps them instead of fighting them every bit of the way ("not defined", "no way to check that short of picking apart the platform-dependent bit representation", "na, the compiler has a different idea of which precision to use here - and _which_ idea may depend on how various compilation flags interact", "you asked for a trap on overflow? OK, but we may not raise it until billions of operations after an overflow occurs", "subnormal? na, we didn't bother implementing those", ad nauseam). The wholly conforming `decimal` environment can be a real joy to work in.
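For example, decimal's standard signals can be inspected as flags or raised as traps (traceback abbreviated):

    >>> from decimal import Decimal, Inexact, localcontext
    >>> with localcontext() as ctx:
    ...     x = Decimal(1) / Decimal(3)   # rounding happened...
    ...     ctx.flags[Inexact]            # ...and the flag records it
    ...
    True
    >>> with localcontext() as ctx:
    ...     ctx.traps[Inexact] = True     # or raise instead of rounding silently
    ...     Decimal(1) / Decimal(3)
    ...
    Traceback (most recent call last):
      ...
    decimal.Inexact: [<class 'decimal.Inexact'>]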
On 6/04/22 8:58 pm, Mark Dickinson via Python-ideas wrote:
I'd be curious to know what alternatives you see. When a user writes `x + y` with both `x` and `y` instances of `decimal.Decimal`, the decimal module needs to know what precision to compute the result to (as well as what rounding mode to use, etc.). Absent a thread-local context or task-local context, where would that precision information come from?
I'm not sure, but my feeling is that if I want to limit results to a specific number of digits, I'm going to want much finer grained control, like specifying it for each individual operation. The current design doesn't fit any use case I can see myself needing. -- Greg
On 07.04.2022 02:41, Greg Ewing wrote:
On 6/04/22 8:58 pm, Mark Dickinson via Python-ideas wrote:
I'd be curious to know what alternatives you see. When a user writes `x + y` with both `x` and `y` instances of `decimal.Decimal`, the decimal module needs to know what precision to compute the result to (as well as what rounding mode to use, etc.). Absent a thread-local context or task-local context, where would that precision information come from?
I'm not sure, but my feeling is that if I want to limit results to a specific number of digits, I'm going to want much finer grained control, like specifying it for each individual operation. The current design doesn't fit any use case I can see myself needing.
GMP uses a smarter approach (https://gmplib.org/manual/Floating_002dpoint-Functions):

- GMP floats are mutable objects
- all floats have a variable precision and you can even adjust the precision after creation
- operations put the result into an existing float object (with defined precision)

Of course, this is not what a Python user would expect (numbers in Python are usually immutable), so using the approach directly would break "Python" intuition and likely cause many weird errors down the line.

However, in practice, you rarely need decimals with more than 64 bits (or some other fixed upper limit) precision, so the global context works just fine and you can always adjust the precision for output purposes at the I/O boundaries of your application.

The MPFR library, which uses a similar strategy for numbers as GMP, adds more flexibility by also providing a rounding context (https://www.mpfr.org/#intro). MPFR provides a global default rounding mode and also allows a per-operation rounding mode. Again, applications will typically just use one rounding method for consistency purposes, so a global context works well in practice.

Certain algorithms may require special handling of both precision and rounding, but for those, you can either use a thread-local or task-local context which you only enable while the algorithm is running.

-- Marc-Andre Lemburg
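For comparison with binary multiprecision, this is roughly what the same global-context pattern looks like with gmpy2's mpfr type; the exact attribute and function names here are from memory of gmpy2's API, so treat them as assumptions to check against its docs:

    import gmpy2

    # gmpy2 also keeps a current context; precision is counted in bits
    gmpy2.get_context().precision = 100   # roughly 30 decimal digits
    print(gmpy2.sqrt(gmpy2.mpfr(2)))

    # and a context-manager form for temporary changes, much like
    # decimal.localcontext()
    with gmpy2.local_context() as ctx:
        ctx.precision = 200
        print(gmpy2.sqrt(gmpy2.mpfr(2)))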
participants (9):

- Brendan Barnwell
- Chris Angelico
- Christopher Barker
- David Mertz, Ph.D.
- Greg Ewing
- Marc-Andre Lemburg
- Mark Dickinson
- Oscar Benjamin
- Tim Peters