Python Float Update

Dear Python Developers: I will be presenting a modification to the float class, which will improve its speed and accuracy (reduce floating point errors). This is applicable because Python uses a numerator and denominator rather than a sign and mantissa to represent floats. First, I propose that a float's integer ratio should be accurate. For example, (1 / 3).as_integer_ratio() should return (1, 3). Instead, it returns (6004799503160661, 18014398509481984). Second of all, even though 1 * 3 = 3 (last example), 6004799503160661 * 3 does not equal 18014398509481984. Instead, it equals 18014398509481983, one less than the denominator in the ratio. This means the ratio is inaccurate, as well as not simplified at all. Even if the value displayed for a float is a rounded value, the internal numerator and denominator should divide to equal the completely accurate value. Thanks for considering this improvement! Sincerely, u8y7541
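[The inline screenshot attached to the original message did not survive the digest; it presumably showed a session along these lines -- a reconstruction, not the original image:]

    >>> (1 / 3).as_integer_ratio()
    (6004799503160661, 18014398509481984)
    >>> 6004799503160661 * 3
    18014398509481983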

On May 31, 2015 7:26 PM, "u8y7541 The Awesome Person" < surya.subbarao1@gmail.com> wrote:
Dear Python Developers:
I will be presenting a modification to the float class, which will
improve its speed and accuracy (reduce floating point errors). This is applicable because Python uses a numerator and denominator rather than a sign and mantissa to represent floats.

Python's floats are in fact IEEE 754 floats, using sign/mantissa/exponent, as provided by all popular CPU floating point hardware. This is why you're getting the results you see -- 1/3 cannot be exactly represented as a float, so it gets rounded to the closest representable float, and then as_integer_ratio shows you an exact representation of this rounded value. It sounds like you're instead looking for an exact fraction representation, which in Python is available in the standard "fractions" module: https://docs.python.org/3.5/library/fractions.html -n

On Mon, Jun 1, 2015 at 12:25 PM, u8y7541 The Awesome Person <surya.subbarao1@gmail.com> wrote:
I will be presenting a modification to the float class, which will improve its speed and accuracy (reduce floating point errors). This is applicable because Python uses a numerator and denominator rather than a sign and mantissa to represent floats.
First, I propose that a float's integer ratio should be accurate. For example, (1 / 3).as_integer_ratio() should return (1, 3). Instead, it returns(6004799503160661, 18014398509481984).
I think you're misunderstanding the as_integer_ratio method. That isn't how Python works internally; that's a service provided for parsing out float internals into something more readable. What you _actually_ are working with is IEEE 754 binary64. (Caveat: I have no idea what Python-the-language stipulates, nor what other Python implementations use, but that's what CPython uses, and you did your initial experiments with CPython. None of this discussion applies *at all* if a Python implementation doesn't use IEEE 754.) So internally, 1/3 is stored as:

    0 <-- sign bit (positive)
    01111111101 <-- exponent (1021)
    0101010101010101010101010101010101010101010101010101 <-- mantissa (52 bits, repeating)

The exponent is offset by 1023, so this means 1.010101... divided by 2²; the original repeating value is exactly equal to 4/3, so this is correct, but as soon as it's squeezed into a finite-sized mantissa, it gets rounded - in this case, rounded down. That's where your result comes from. It's been rounded such that it fits inside IEEE 754, and then converted back to a fraction afterwards. You're never going to get an exact result for anything with a denominator that isn't a power of two. Fortunately, Python does offer a solution: store your number as a pair of integers, rather than as a packed floating point value, and all calculations truly will be exact (at the cost of performance):
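(The interactive example from Chris's message did not survive the digest; a reconstruction of the sort of thing being pointed at, using the standard fractions module:)

    >>> from fractions import Fraction
    >>> one_third = Fraction(1, 3)
    >>> one_third
    Fraction(1, 3)
    >>> one_third * 3 == 1
    True
    >>> one_third.numerator, one_third.denominator
    (1, 3)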
This is possibly more what you want to work with. ChrisA

Teachable moments about the implementation of floating-point aside, something in this neighborhood has been considered and rejected before, in PEP 240. However, that was in 2001 - it was apparently created the same day as PEP 237, which introduced transparent conversion of machine ints to bignums in the int type. I think hiding hardware number implementations has been a success for integers - it's a far superior API. It could be for rationals as well. Has something like this thread's original proposal - interpreting decimal-number literals as fractional values and using fractions as the result of integer arithmetic - been seriously discussed more recently than PEP 240? If so, why hasn't it been implemented? Perhaps enough has changed that it's worth reconsidering. On Sun, May 31, 2015 at 22:49 Chris Angelico <rosuav@gmail.com> wrote:

On Sun, May 31, 2015, at 23:21, Jim Witschey wrote:
I think hiding hardware number implementations has been a success for integers - it's a far superior API. It could be for rationals as well.
I'd worry about unbounded complexity. For rationals, unlike integers, values don't have to be large for their bignum representation to be large.
Also, it raises a question of string representation. Granted, "1/3" becomes much more defensible as the repr of Fraction(1, 3) if it in fact evaluates to that value, but how much do you like "6/5" as the repr of 1.2? Or are we going to use Fractions for integer division and Decimals for literals? And, what of decimal division? Right now you can't even mix Fraction and Decimal in arithmetic operations. And are we going to add %e %f and %g support for both types? Directly so, without any detour to float and its limitations (i.e. %.100f gets you 100 true decimal digits of precision)? Current reality:
Okay, that's one case right out of four.

On Sun, May 31, 2015 at 11:37 PM, <random832@fastmail.us> wrote:
I'd expect rational representations to be reasonably small until a value was operated on many times, in which case you're using more space, but representing the result very precisely. It's a tradeoff, but with a small cost in the common case. I'm no expert, though -- am I not considering some case?
how much do you like "6/5" as the repr of 1.2?
6/5 is an ugly representation of 1.2, but consider the current state of affairs:
    >>> 1.2
    1.2
"1.2" is imprecisely interpreted as 1.2000000476837158 * (2^0), which is then imprecisely represented as 1.2. I recognize this is the way we've dealt with non-integer numbers for a long time, but "1.2" => SomeKindOfRational(6, 5) => "6/5" is conceptually cleaner.
Or are we going to use Fractions for integer division and Decimals for literals?
I had been thinking of rationals built on bignums all around, a la Haskell. Is Fraction as it exists today up to it? I don't know. I agree that some principled decisions would have to be made for, e.g., interpretation by format strings.

On May 31, 2015, at 20:37, random832@fastmail.us wrote:
That's the big problem. There's no one always-right answer. If you interpret the literal 1.20 as a Fraction, it's going to be more confusing, not less, to people who are just trying to add up dollars and cents. Do a long financial computation and, instead of $691.05 as you expected or $691.0500000237 as you get today, you've got 10215488088 / 14782560. Not to mention that financial calculations often tend to involve things like e or exponentiation to non-integral powers, and what happens then?

And then of course there's the unbounded size issue. If you do a long chain of operations that can theoretically be represented exactly followed by one that can't, you're wasting a ton of time and space for those intermediate values (and, unlike Haskell, Python can't look at the whole expression in advance and determine what the final type will be).

On the other hand, if you interpret 1.20 as a Decimal, now you can't sensibly mix 1.20 * 3/4 without coming up with a rule for how decimal and fraction types should interact. (OK, there's an obvious right answer for multiplication, but what about for addition?) And either one leads to people asking why the code they ported from Java or Ruby is broken on Python.

You could make it configurable, so integer division is your choice of float, fraction, or decimal and decimal literals are your separate choice of the same three (and maybe also let fraction exponentiation be your choice of decimal and float), but then which setting is the default? Also, where do you set that? It has to be available at compile time, unless you want to add new types like "decimal literal" at compile time that are interpreted appropriately at runtime (which some languages do, and it works, but it definitely adds complexity).

Maybe the answer is just to make it easier to be explicit, using something like C++ literal suffixes, so you can write, e.g., 1.20d or 1/3f (and I guess 1.2f) instead of Decimal('1.20') or Fraction(1, 3) (and Fraction(12, 10)).
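(As an editorial illustration of the mixing problem just mentioned -- not part of the original message, and the exact error text may vary by Python version:)

    >>> from decimal import Decimal
    >>> from fractions import Fraction
    >>> Decimal('1.20') * Fraction(3, 4)
    Traceback (most recent call last):
      ...
    TypeError: unsupported operand type(s) for *: 'decimal.Decimal' and 'Fraction'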
At least here I think the answer is clear. %-substitution is printf-like, and shouldn't change. If you want formatting that can be overloaded by the type, you use {}, which already works.
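(A quick editorial sketch of that distinction, added for illustration; this reflects my understanding of current CPython 3 behaviour, so treat the exact digits as approximate: printf-style %f converts the Decimal to a binary float first, while format()/str.format() is handled by Decimal itself.)

    >>> from decimal import Decimal
    >>> '%.30f' % Decimal('1.3')         # detours through float
    '1.300000000000000044408920985006'
    >>> format(Decimal('1.3'), '.30f')   # handled by the Decimal type
    '1.300000000000000000000000000000'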

Having some sort of decimal literal would have some advantages of its own, for one it could help against this silliness:
    >>> Decimal(1.3)
    Decimal('1.3000000000000000444089209850062616169452667236328125')
    >>> Decimal('1.3')
    Decimal('1.3')
I'm not saying that the actual data type needs to be a decimal (it might well be a float, but, say, shove the string repr next to it so it can be accessed when needed)... but this is one really common pitfall for new users. I know it's easy to fix the code above, but this behavior is very unintuitive... you essentially get a really expensive float when you do the obvious thing. Not sure if this is worth the effort, but it would help smooth some corners potentially.

On 01/06/2015 15:52, Joonas Liik wrote:
Far easier to point them to https://docs.python.org/3/library/decimal.html and/or https://docs.python.org/3/tutorial/floatingpoint.html -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence

On Mon, Jun 01, 2015 at 05:52:35PM +0300, Joonas Liik wrote:
Why is that silly? That's the actual value of the binary float 1.3 converted into base 10. If you want 1.3 exactly, you can do this:
    py> Decimal('1.3')
    Decimal('1.3')
Is that really so hard for people to learn?
You want Decimals to *lie* about what value they have? I think that's a terrible idea, one which would lead to a whole set of new and exciting surprises when using Decimal. Let me try to predict a few of the questions on Stackoverflow which would follow this change...

Why is equality so inaccurate in Python?

    py> x = Decimal(1.3)
    py> y = Decimal('1.3')
    py> x, y
    (Decimal('1.3'), Decimal('1.3'))
    py> x == y
    False

Why does Python insert extra digits into numbers when I multiply?

    py> x = Decimal(1.3)
    py> x
    Decimal('1.3')
    py> y = 10000000000000000*x
    py> y - 13000000000000000
    Decimal('0.444089209850062616169452667236328125')
Then don't do the obvious thing. Sometimes there really is no good alternative to actually knowing what you are doing. Floating point maths is inherently hard, but that's not the problem. There are all sorts of things in programming which are hard, and people learn how to deal with them. The problem is that people *imagine* that floating point is simple, when it is not and can never be. We don't do them any favours by enabling that delusion. If your needs are light, then you can ignore the complexities of floating point. You really can go a very long way by just rounding the results of your calculations when displaying them. But for anything more than that, we cannot just paper over the floating point complexities without creating new complexities that will burn people. You don't have to become a floating point guru, but it really isn't onerous to expect people who are programming to learn a few basic programming skills, and that includes a few basic coping strategies for floating point. -- Steve

On 02.06.2015 03:37, Steven D'Aprano wrote:
Joonas, I think you're approaching this from the wrong angle. People who want to get an exact decimal from a literal will use the string representation to define it, not a float representation. In practice, you typically read the data from some file or stream anyway, so it already comes as a string value, and if you want to convert an actual float to a decimal, this will most likely not be done in a literal way, but instead by passing it in to the Decimal constructor as a variable, so there's no literal involved. It may be good to provide some alternative ways of converting a float to a decimal, e.g. one which uses the float repr logic to overcome things like repr(float(1.1)) == '1.1000000000000001' instead of a direct conversion:
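(The interactive examples from this message did not survive the digest; roughly, the two conversion behaviours being contrasted look like this -- a reconstruction, not the original snippet:)

    >>> from decimal import Decimal
    >>> Decimal(1.1)          # direct conversion of the binary float
    Decimal('1.100000000000000088817841970012523233890533447265625')
    >>> Decimal(repr(1.1))    # via the float repr logic
    Decimal('1.1')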
These could be added as parameters to the Decimal constructor. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 02 2015)

I think there is another discussion to have here, and that is making Decimal part of the language (__builtin(s)__) vs. part of the library (which implementations can freely omit). If it were part of the language, then maybe, just maybe, a literal syntax should be considered. As it stands, Decimal and Fraction are libraries - implementations of python are free to omit them (as I think some of the embedded platform implementations do), and it currently does not make a lick of sense to add syntax for something that is only in the library. On 6/2/2015 04:19, M.-A. Lemburg wrote:

On 2 June 2015 at 19:38, Alexander Walters <tritium-list@sdamon.com> wrote:
For decimal, the issues that keep it from becoming a literal are similar to those that keep it from becoming a builtin: configurable contexts are a core part of the decimal module's capabilities, and making a builtin type context dependent causes various problems when it comes to reasoning about a piece of code based on purely local information. Those problems affect human readers regardless, but once literals enter the mix, they affect all compile time processing as well. On that front, I also finally found the (mammoth) thread from last year about the idea of using base 10 for floating point values by default: https://mail.python.org/pipermail/python-ideas/2014-March/026436.html One of the things we eventually realised in that thread is that the context dependence problem, while concerning for a builtin type, is an absolute deal breaker for literals, because it means you *can't constant fold them* by calculating the results of expressions at compile time and store the result directly into the code object (https://mail.python.org/pipermail/python-ideas/2014-March/026998.html). This problem is illustrated by asking the following question: What is the result of "Decimal('1.0') + Decimal('1e70')"? Correct answer? Insufficient data (since we don't know the current decimal precision). With the current decimal module, the configurable rounding behaviour is something you just need to learn about as part of adopting the module. Since that configurability is one of the main reasons for using it over binary floating point, that's generally not a big deal. It becomes a much bigger deal when the question being asked is: What is the result of "1.0d + 1e70d"? Those look like they should be numeric constants, and hence the compiler should be able to constant fold them at compile time. That's possible if we were to pick a single IEEE decimal type as a builtin (either decimal64 or decimal128), but not possible if we tried to use the current variable precision decimal type. One of the other "fun" discrepancies introduced by the context sensitive processing in decimals is that unary plus and minus are context-sensitive, which means that any literal format can't express arbitrary negative decimal values without a parser hack to treat the minus sign as part of the trailing literal. This is one of the other main reasons why decimal64 or decimal128 are better candidates for a builtin decimal type than decimal.Decimal as it exists today (as well as being potentially more amenable to hardware acceleration on some platforms). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
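(To make the context dependence concrete -- an illustrative sketch added here, not from Nick's message; decimal comparisons are exact, so the equality shows whether the addition rounded:)

    >>> from decimal import Decimal, getcontext
    >>> getcontext().prec = 28          # the default precision: the 1.0 is rounded away
    >>> Decimal('1.0') + Decimal('1e70') == Decimal('1e70')
    True
    >>> getcontext().prec = 80          # enough digits to hold the exact sum
    >>> Decimal('1.0') + Decimal('1e70') == Decimal('1e70')
    False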

On Jun 2, 2015, at 05:44, random832@fastmail.us wrote:
The issue here isn't really binary vs. decimal, but rather that float implements a specific fixed-precision (binary) float type, and Decimal implements a configurable-precision (decimal) float type. As Nick explained elsewhere in that message, decimal64 or decimal128 wouldn't have the context problem. And similarly, a binary.Binary type designed like decimal.Decimal would have the context problem. (This is a slight oversimplification; there's also the fact that Decimal implements the full set of 754-2008 context features, while float implements a subset of 754-1985 features, and even that only if the underlying C lib does so, and nobody ever uses them anyway.)

On Jun 2, 2015, at 05:34, Nick Coghlan <ncoghlan@gmail.com> wrote:
OK, so what are the stumbling blocks to adding decimal32/64/128 (or just one of the three), either in builtins/stdtypes or in decimal, and then adding literals for them? I can imagine a few: someone has to work out exactly what features to support (the same things as float, or everything in the standard?), how it interacts with Decimal and float (which is defined by the standard, but translating that to Python isn't quite trivial), how it fits into the numeric tower ABCs, and what syntax to use for the literals, and if/how it fits into things like array/struct/ctypes and into math, and whether we need decimal complex values, and what the C API looks like (it would be nice if PyDecimal64_AsDecimal64 worked as expected on C11 platforms, but you could still use decimal64 on C90 platforms and just not get such functions...); then write a PEP; then write an implementation; and after all that work, the result may be seen as too much extra complexity (either in the language or in the implementation) for the benefits. But is that it, or is there even more that I'm missing? (Of course while we're at it, it would be nice to have arbitrary-precision IEEE binary floats as well, modeled on the decimal module, and to add all the missing 754-2008/C11 methods/math functions for the existing float type, but those seem like separate proposals from fixed-precision decimal floats.)
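(As a rough illustration of decimal64-style semantics -- an editorial sketch, not anything from this message: it approximates the IEEE 754-2008 decimal64 parameters with a decimal.Context, ignoring details such as clamping and subnormals.)

    from decimal import Context, Decimal

    # Approximate decimal64: 16 coefficient digits, exponents of roughly -383..384.
    d64 = Context(prec=16, Emin=-383, Emax=384)

    print(d64.divide(Decimal(1), Decimal(3)))        # 0.3333333333333333
    print(d64.add(Decimal('1.0'), Decimal('1e70')))  # 1.000000000000000E+70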

On 2 June 2015 at 14:05, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
I would argue that it should be as simple as float. If someone wants the rest of it they've got the Decimal module which is more than enough for their needs.
how it interacts with Decimal and float (which is defined by the standard, but translating that to Python isn't quite trivial),
Interaction between decimalN and Decimal coerces to Decimal. Interaction with floats is a TypeError.
how it fits into the numeric tower ABCs,
Does anyone really use these for anything? I haven't really found them to be very useful since no third-party numeric types use them and they don't really define the kind of information that you might really want in any carefully written numerical algorithm. I don't see any gain in adding any decimal types to e.g Real as the ABCs seem irrelevant to me.
and what syntax to use for the literals, and if/how it fits into things like array/struct/ctypes
It's not essential to incorporate them here. If they become commonly used in C then it would be good to have these for binary compatibility.
and into math, and whether we need decimal complex values,
It's not common to use the math-style functions with the decimal module unless you're using it as a multi-precision library and then you'd really want the full Decimal type. There's no advantage in using decimal for e.g. sin, cos etc. so there's not much really lost in converting to binary and back. It's in the simple arithmetic where it makes a difference so I'd say that decimal should stick to that. As for complex decimals this would only really be worth it if the ultimate plan was to have decimals become the default floating point type. Laura suggested that earlier and I probably agree that it would have been a good idea at some earlier time but it's a bit late for that.
and what the C API looks like (it would be nice if PyDecimal64_AsDecimal64 worked as expected on C11 platforms, but you could still use decimal64 on C90 platforms and just not get such functions...);
Presumably CPython would have to write its own implementation, e.g.: PyDecimal64_FromIntExponentAndLongSignificand ... or something like that.
then write a PEP; then write an implementation; and after all that work, the result may be seen as too much extra complexity (either in the language or in the implementation) for the benefits. But is that it, or is there even more that I'm missing?
I don't think anyone has proposed to add all of the things that you suggested. Of course if there are decimal literals and a fixed-width decimal type then over time people will suggest some of the other things. That doesn't mean that they'd need to be incorporated though. A year ago I said I'd write a PEP for decimal literals but then I got clobbered at work and a number of other things happened so that I didn't even have time to read threads like this. Maybe it's worth revisiting... Oscar

On Jun 2, 2015, at 07:05, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
But decimal64 and Decimal are not the same types. So, if you want to, e.g., get the next decimal64 value after the current value, how would you do that? (Unless you're suggesting there should be a builtin decimal64 and a separate decimal.decimal64 or something, but I don't think you are.) Also, with float, we can get away with saying we're supporting the 1985 standard and common practice among C90 implementations; with decimal64, the justification for arbitrarily implementing part of the 2008 standard but not the rest is not as clear-cut.
how it interacts with Decimal and float (which is defined by the standard, but translating that to Python isn't quite trivial),
Interaction between decimalN and Decimal coerces to Decimal.
Even when the current decimal context is too small to hold a decimalN? Does that raise any flags?
The NumPy native types do. (Of course they also subclass int and float where relevant.)
Even if they are completely irrelevant, unless they're deprecated they pretty much have to be supported by any new types. There might be a good argument that decimal64 doesn't fit into the numeric tower, but you'd have to make that argument.
For ctypes, sure (although even there, ctypes is a relatively simple way to share values between pure-Python child processes with multiprocessing.sharedctypes). But for array, that's generally not about compatibility with existing C code, it's about efficiently packing zillions of homogeneous simple values into as little memory as possible.
Well, math is mostly about double functions from the C90 stdlib, so it's not common to use them with decimal. But that doesn't mean you wouldn't want decimal64 implementations of some of the functions in math.
unless you're using it as a multi-precision library and then you'd really want the full Decimal type.
But again, the full Decimal type isn't just an expansion on decimal64, it's a completely different type, with context-sensitive precision.
There's still rounding error. Sure, usually that won't make a difference--but when it does, it will be surprising and frustrating if you didn't explicitly ask for it.
Why?
Sure, if you want a C API for C90 platforms at all. But you may not even need that. When would you need to write C code that deals with decimal64 values as exponent and significand? Dealing with them as abstract numbers, general Python objects, native decimal64, and maybe even opaque values that I can pass around in C without being able to interpret them, I can see, but what C code needs the exponent and significand?
I think in many (but maybe not all) of these cases the simplest answer is the best, but a PEP would have to actually make that case for each thing.
Maybe we need a PEP for the decimalN type(s) first, then if someone has time and inclination they can write a PEP for literals for those types, either as a companion or as a followup. That would probably cut out 30-50% of the work, and maybe even more of the room for argument and bikeshedding.

On 6/2/2015 9:05 AM, Andrew Barnert via Python-ideas wrote:
A compelling rationale. Python exposes the two basic number types used by the kinds of computers it runs on: integers (extended) and floats (binary in practice, though the language definition would allow a decimal float machine). The first killer app for Python was scientific numerical computing. The first numerical package developed for this exposed the entire gamut of integer and float types available in C. Numpy is the third numerical package. (Even so, none of the packages have been distributed with CPython -- and properly so.) Numbers pre-wrapped as dates, times, and datetimes with specialized methods are not essential (Python once managed without) but are enormously useful in a wide variety of application areas. Decimals, another class of pre-wrapped numbers, greatly simplify money calculations, including those that must follow legal or contractual rules. It is no accident that the decimal specification is a product of what was once International Business Machines. Contexts and specialized rounding rules are an essential part of fulfilling the purpose of the module. What application area would be opened up by adding a fixed-precision float? The only thing I have seen presented is making interactive python act even more* like a generic (decimal) calculator, so that newbies will find python floats less surprising than those of other languages. (Of course, a particular decimal## might not exactly match any existing calculator.) *The int division change solved the biggest discrepancy: 1/10 is now .1 instead of 0. Representation changes improved things also. -- Terry Jan Reedy

On Jun 2, 2015, at 02:38, Alexander Walters <tritium-list@sdamon.com> wrote:
I think there is another discussion to have here, and that is making Decimal part of the language (__builtin(s)__) vs. part of the library (which implementations can freely omit).
I don't think there is any such distinction in Python. Neither the language reference nor the library reference claims to be a specification. The library documentation specifically says that it "describes the standard library that is distributed with Python" and "also describes some of the optional components that are commonly included in Python distributions", which implies that, except for the handful of modules that are described as optional or platform-specific, everything should always be there. (There is special dispensation for Unix systems to split up Python into separate packages, but even that is specifically limited to "some or all of the optional components".) Historically, implementations that haven't included the entire stdlib also haven't included parts of the language (Jython 2.2 and early 2.5 versions, early versions of PyPy, the various browser-based implementations, MicroPython and PyMite, etc.). Also, both the builtins module and the actual built-in functions, constants, types, and exceptions it contains are documented as part of the library, just like decimal, not as part of the language. So, Python isn't like C, with separate specifications for "freestanding" vs. "hosted" implementations, and it doesn't have a separate specification for an "embedded" subset like C++ used to.
If it were part of the language, then maybe, just maybe, a literal syntax should be considered.
Since there is no such distinction between language and library, I think we're free to define a literal syntax for decimals and fractions. From a practical point of view (which beats purity, of course), it's probably not reasonable for CPython to define such literals unless there's a C implementation that defines the numeric type slot (and maybe even has a C API concrete type interface, although maybe not), and which can be "frozen" at build time. (See past discussions on adding an OrderedDict literal for why these things are important.) That's currently true for Decimal, but not for Fraction. So, that might be an argument against fraction literals, or for providing a C implementation of the fraction module.
As it stands, Decimal and Fraction are libraries - implementations of python are free to omit them (as I think some of the embedded platform implementations do), and it currently does not make a lick of sense to add syntax for something that is only in the library.
Even besides the four library sections on the various kinds of built-in things, plenty of other things are syntax for something that's "only in the library". The import statement is defined in terms of functionality in importlib, and (at least in CPython) actually implemented that way. In fact, numeric values, as defined in the data model section of the language reference, are defined in terms of types from the library docs, both in stdtypes and in the numbers module. Defining decimal values in terms of types defined in the decimal module library section would be no different. (Numeric _literals_ don't seem to have their semantics defined anywhere, just their syntax, but it's pretty obvious from the wording that they're intended to have int, float, and complex values as defined by the data model--which, again, means as defined by the library.) So, while there are potentially compelling arguments against a decimal literal (how it interacts with contexts may be confusing, the idea may be too bikesheddable to come up with one true design that everyone will like, or may be an attractive nuisance, it may add too much complexity to the implementation for the benefit, etc.), "decimal is only a library" doesn't seem to be one.

Joonas Liik writes:
Having some sort of decimal literal would have some advantages of its own, for one it could help against this silliness:
That *would* be a different type from float. You may as well go all the way to Decimal.
..but this is one really common pitfall for new users,
To fix it, you really need to change the parser, i.e., make Decimal the default type for non-integral numbers. "Decimal('1.3')" isn't that much harder to remember than "1.3$" (although it's quite a bit more to type). But people are going to continue writing things like

    pennies = 13
    pennies_per_dollar = 100
    dollars = pennies / pennies_per_dollar
    # Much later ...
    future_value = dollars * Decimal('1.07')

And in real applications you're going to be using Decimal in code like

    def inputDecimals(file):
        for row, line in enumerate(file):
            for col, value in enumerate(line.strip().split()):
                matrix[row][col] = Decimal(value)

or

    def what_if():
        principal = Decimal(input("Principal ($): "))
        rate = Decimal(input("Interest rate (%): "))
        print("Future value is ", principal * (1 + rate/100), ".", sep="")

and the whole issue evaporates.

On Mon, Jun 1, 2015 at 9:21 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Shudder indeed.
You may as well go all the way to Decimal.
Or perhaps switch to decimal64 (http://en.wikipedia.org/wiki/Decimal64_floating-point_format)? (Or its bigger cousin, decimal128.) -- --Guido van Rossum (python.org/~guido)

On Tue, Jun 2, 2015, at 00:31, Guido van Rossum wrote:
Does anyone know if any common computer architectures have any hardware support for this? Are there any known good implementations for all the functions in math/cmath for these types? Moving to a fixed-size floating point type does have the advantage of not requiring making all these decisions about environments and precision and potentially unbounded growth etc.

On Jun 1, 2015, at 21:47, random832@fastmail.us wrote:
IBM's RS/POWER architecture supports decimal32, 64, and 128. The PowerPC and Cell offshoots only support them in some models, not all. Is that common enough? (Is _anything_ common enough besides x86, x86_64, ARM7, ARM8, and various less-capable things like embedded 68k variants?)
Are there any known good implementations for all the functions in math/cmath for these types?
Intel wrote a reference implementation for IEEE 754-2008 as part of the standardization process. And since then, they've focused on improvements geared at making it possible to write highly-optimized financial applications in C or C++ that run on x86_64 hardware. And I think it's BSD-licensed. It's available somewhere on netlib, but searching that repo is no fun on my phone (plus, most of Intel's code, you can't see the license or the detailed README until you unpack it...), so I'll leave it to someone else to find it. Of course 754-2008 isn't necessarily identical to GDAS (which is what POWER implements, and Python's decimal module).

While I see the interest, does it really belong in core Python? What would be the advantages? IIRC (during | after) the language summit at PyCon this year, it was said that maybe the stdlib should get fewer features, not more. Side note: SymPy has an IPython AST hook that will wrap all your integers into SymPy Integers and hence give you rationals of whatever you like, if you want to SymPy-plify your life. But for the majority of uses, will it be useful? What would be the performance costs? If you start storing rationals, then why not continued fractions, as they are just an N-tuple instead of a 2-tuple; but then you are limited to finite continued fractions, so you improve by using a generator... I love Python for doing science and math, but please stay away from putting too much in the standard lib, or we will end up with Cholesky matrix decomposition in Python 4.0 like Julia does… and I’m not sure it is a good idea. I would much rather have a core set of libraries “blessed” by CPython that provide features like this one, that are deemed “important”. — M

On Sun, May 31, 2015 at 11:46 PM, Matthias Bussonnier <bussonniermatthias@gmail.com> wrote:
IIRC (during | after) the language summit at PyCon this year, it was said that maybe the stdlib should get fewer features, not more.
Rationals (and Decimals) already exist in the standard library. The original proposal (as I read it, anyway) is more about the default interpretation of, e.g., integer division and decimal-number literals.
Side note, Sympy as a IPython ast-hook that will wrap all your integers into SymPy Integers and hence give you rationals of whatever you like, if you want to SymPy-plify your life.
Thank you for the pointer -- that's really cool.
But for majority of use will it be useful ?
I believe interpreting "0.1" as 1/10 is more ergonomic than representing it as 1.600000023841858 * (2^-4). I see it as being more useful -- a better fit -- in most use cases because it's simpler, more precise, and more understandable.
What would be the performance costs ?
I don't know. Personally, I'd be willing to pay a performance penalty to avoid reasoning about floating-point arithmetic most of the time, then "drop into" floats when I need the speed.

I don’t know. Personally, I’d be willing to pay a performance penalty to avoid reasoning about floating-point arithmetic most of the time, then “drop into” floats when I need the speed. This is perhaps a bit off topic for the thread, but +9000 for this. Having decimal literals or something similar by default, though perhaps problematic from a backwards compatibility standpoint, is a) user friendly, b) easily understandable, and c) not surprising to beginners. None of these qualities apply to float literals. I always assumed that float literals were mostly an artifact of history or of some performance limitations. Free of those, why would a language choose them over decimal literals? When does someone ever expect floating-point madness, unless they are doing something that is almost certainly not common, or unless they have been burned in the past? Every day another programmer gets bitten by floating point stupidities like this one <http://stackoverflow.com/q/588004/877069>. It would be a big win to kill this lame “programmer rite of passage” and give people numbers that work more like how they learned them in school. The competing proposal is to treat decimal literals as decimal.Decimal values. I’m interested in learning more about such a proposal. Nick On Mon, Jun 1, 2015 at 2:03 AM Jim Witschey <jim.witschey@gmail.com> wrote:

On 1 June 2015 at 16:27, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
In a world of binary computers, no programming language is free of those constraints - if you choose decimal literals as your default, you take a *big* performance hit, because computers are designed as binary systems. (Some languages, like IBM's REXX, do choose to use decimal integers by default) For CPython, we offer C-accelerated decimal support by default since 3.3 (available as pip install cdecimal in Python 2), but it still comes at a high cost in speed:

    $ python3 -m timeit -s "n = 1.0; d = 3.0" "n / d"
    10000000 loops, best of 3: 0.0382 usec per loop
    $ python3 -m timeit -s "from decimal import Decimal as D; n = D(1); d = D(3)" "n / d"
    10000000 loops, best of 3: 0.129 usec per loop

And this isn't even like the situation with integers, where the semantics of long integers are such that native integers can be used transparently as an optimisation technique - IEEE754 (which defines the behaviour of native binary floats) and the General Decimal Arithmetic Specification (which defines the behaviour of the decimal module) are genuinely different ways of doing floating point arithmetic, since the choice of base 2 or base 10 has far reaching ramifications for the way various operations work and how various errors accumulate. We aren't even likely to see widespread proliferation of hardware level decimal arithmetic units, because the "binary arithmetic is easier to implement than decimal arithmetic" consideration extends down to the hardware layer as well - a decimal arithmetic unit takes more silicon, and hence more power, than a similarly capable binary unit. With battery conscious mobile device design and environmentally conscious data centre design being two of the most notable current trends in CPU design, this makes it harder than ever to justify providing hardware support for both in general purpose computing devices. For some use cases (e.g. financial math), it's worth paying the price in speed to get the base 10 arithmetic semantics, or the cost in hardware to accelerate it, but for most other situations, we end up being better off teaching humans to cope with the fact that binary logic is the native language of our computational machines. Binary vs decimal floating point is a lot like the Unicode bytes/text distinction in that regard: while Unicode is a better model for representing human communications, there's no avoiding the fact that that text eventually has to be rendered as a bitstream in order to be saved or transmitted. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Decimal literals are far from as obvious as suggested. We *have* the `decimal` module after all, and it defines all sorts of parameters on precision, rounding rules, etc. that one can provide context for. decimal.ROUND_HALF_DOWN is "the obvious way" for some users, while decimal.ROUND_CEILING is "the obvious way" for others. I like decimals, but they don't simply make all the mathematical answers result in what all users would consider "do what I mean" either. On Sun, May 31, 2015 at 11:27 PM, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

I'm sorry... what I meant was not a literal that results in a Decimal; what I meant was a special literal proxy object that usually acts like a float except you can ask for its original string form. E.g.:

    flit = 1.3
    flit*3 == float(flit)*3
    str(flit) == '1.3'

Thus in cases where the intermediate float conversion loses precision you can get at the original string that the programmer actually typed in. Decimal constructors are one case that would probably like to use the original string whenever possible to avoid conversion losses, but by no means are they the only ones.

On Jun 1, 2015, at 08:12, Joonas Liik <liik.joonas@gmail.com> wrote:
I'm sorry..
what i meant was not a literal that results in a Decimal, what i meant was a special literal proxy object that usualyl acts like a float except you can ask for its original string form.
This is essentially what I was saying with new "literal constant" types. Swift is probably the most prominent language with this feature. http://nshipster.com/swift-literal-convertible/ is a good description of how it works. Many of the reasons Swift needed this don't apply in Python. For example, in Swift, it's how you can build a Set at compile time from an ArrayLiteral instead of building an Array and converting it to Set at compile time. Or how you can use 0 as a default value for a non-integer type without getting a TypeError or a runtime conversion. Or how you can build an Optional that acts like a real ADT but assign it nil instead of a special enumeration value. Or how you can decode UTF-8 source text to store in UTF-16 or UTF-32 or grapheme-cluster at compile time. And so on.

Sorry, I accidentally sent that before it was done... Sent from my iPhone
Anyway, my point was that the Swift feature is complicated, and has some controversial downsides (e.g., see the example at the end of silently using a string literal as if it were a URL by accessing an attribute of the NSURL class--which works given the Smalltalk-derived style of OO, but many people still find it confusing). But the basic idea can be extracted out and Pythonified: The literal 1.23 no longer gives you a float, but a FloatLiteral, which is either a subclass of float, or an unrelated class that has a __float__ method. Doing any calculation on it gives you a float. But as long as you leave it alone as a FloatLiteral, it has its literal characters available for any function that wants to distinguish FloatLiteral from float, like the Decimal constructor. The problem that Python faces that Swift doesn't is that Python doesn't use static typing and implicit compile-time conversions. So in Python, you'd be passing around these larger values and doing the slow conversions at runtime. That may or may not be unacceptable; without actually building it and testing some realistic programs it's pretty hard to guess. The advantage of C++-style user-defined literal suffixes is that the absence of a suffix is something the compiler can see, so 1.23d might still require a runtime call, but 1.23 just is compiled as a float constant the same as it's been since Python 1.x.

On 2 Jun 2015 08:44, "Andrew Barnert via Python-ideas" <python-ideas@python.org> wrote:
Joonas's suggestion of storing the original text representation passed to the float constructor is at least a novel one - it's only the idea of actual decimal literals that was ruled out in the past. Aside from the practical implementation question, the main concern I have with it is that we'd be trading the status quo for a situation where "Decimal(1.3)" and "Decimal(13/10)" gave different answers. It seems to me that a potentially better option might be to adjust the implicit float->Decimal conversion in the Decimal constructor to use the same algorithm as we now use for float.__repr__ [1], where we look for the shortest decimal representation that gives the same answer when rendered as a float. At the moment you have to indirect through str() or repr() to get that behaviour:
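(The interactive example Nick referenced was lost in the digest; presumably it showed something along these lines -- a reconstruction:)

    >>> from decimal import Decimal
    >>> Decimal(1.3)
    Decimal('1.3000000000000000444089209850062616169452667236328125')
    >>> Decimal(str(1.3))
    Decimal('1.3')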
Cheers, Nick. [1] http://bugs.python.org/issue1580

On Jun 1, 2015, at 17:08, Nick Coghlan <ncoghlan@gmail.com> wrote:
I actually built about half an implementation of something like Swift's LiteralConvertible protocol back when I was teaching myself Swift. But I think I have a simpler version that I could implement much more easily. Basically, FloatLiteral is just a subclass of float whose __new__ stores its constructor argument. Then decimal.Decimal checks for that stored string and uses it instead of the float value if present. Then there's an import hook that replaces every Num with a call to FloatLiteral. This design doesn't actually fix everything; in effect, 1.3 actually compiles to FloatLiteral(str(float('1.3'))) (because by the time you get to the AST it's too late to avoid that first conversion). Which does actually solve the problem with 1.3, but doesn't solve everything in general (e.g., just feed in a number that has more precision than a double can hold but less than your current decimal context can...). But it just lets you test whether the implementation makes sense and what the performance effects are, and it's only an hour of work, and doesn't require anyone to patch their interpreter to play with it. If it seems promising, then hacking the compiler so 2.3 compiles to FloatLiteral('2.3') may be worth doing for a test of the actual functionality. I'll be glad to hack it up when I get a chance tonight. But personally, I think decimal literals are a better way to go here. Decimal(1.20) magically doing what you want still has all the same downsides as 1.20d (or implicit decimal literals), plus it's more complex, adds performance costs, and doesn't provide nearly as much benefit. (Yes, Decimal(1.20) is a little nicer than Decimal('1.20'), but only a little--and nowhere near as nice as 1.20d).
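(A minimal sketch of the FloatLiteral idea described above -- an editorial reconstruction, not the code from Andrew's repo; the to_decimal helper name is made up for illustration:)

    class FloatLiteral(float):
        """A float that remembers the literal text it was constructed from."""
        def __new__(cls, literal_text):
            self = super().__new__(cls, float(literal_text))
            self.literal_text = str(literal_text)
            return self

    def to_decimal(x):
        # Prefer the remembered source text when present; otherwise convert the
        # float value exactly, as decimal.Decimal already does today.
        from decimal import Decimal
        return Decimal(getattr(x, 'literal_text', x))

    # to_decimal(FloatLiteral('1.3')) -> Decimal('1.3')
    # to_decimal(1.3) -> Decimal('1.3000000000000000444089209850062616169452667236328125')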
Yes, to solve that you really need Decimal(13)/Decimal(10)... Which implies that maybe the simplification in Decimal(1.3) is more misleading than helpful. (Notice that this problem also doesn't arise for decimal literals--13/10d is int vs. Decimal division, which is correct out of the box. Or, if you want prefixes, d13/10 is Decimal vs. int division.)

On Jun 1, 2015, at 18:27, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
Make that 15 minutes. https://github.com/abarnert/floatliteralhack

On Jun 1, 2015, at 19:00, Andrew Barnert <abarnert@yahoo.com> wrote:
And as it turns out, hacking the tokens is no harder than hacking the AST (in fact, it's a little easier; I'd just never done it before), so now it does that, meaning you really get the actual literal string from the source, not the repr of the float of that string literal. Turning this into a real implementation would obviously be more than half an hour's work, but not more than a day or two. Again, I don't think anyone would actually want this, but now people who think they do have an implementation to play with to prove me wrong.

On Tue, Jun 02, 2015 at 10:08:37AM +1000, Nick Coghlan wrote:
Apart from the questions of whether such a change would be allowed by the Decimal specification, and the breaking of backwards compatibility, I would really hate that change for another reason. At the moment, a good, cheap way to find out what a binary float "really is" (in some sense) is to convert it to Decimal and see what you get:

    Decimal(1.3) -> Decimal('1.3000000000000000444089209850062616169452667236328125')

If you want conversion from repr, then you can be explicit about it:

    Decimal(repr(1.3)) -> Decimal('1.3')

("Explicit is better than implicit", as they say...) Although in fairness I suppose that if this change happens, we could keep the old behaviour in the from_float method:

    # hypothetical future behaviour
    Decimal(1.3) -> Decimal('1.3')
    Decimal.from_float(1.3) -> Decimal('1.3000000000000000444089209850062616169452667236328125')

But all things considered, I don't think we're doing people any favours by changing the behaviour of float->Decimal conversions to implicitly use the repr() instead of being exact. I expect this strategy is like trying to flatten a bubble under wallpaper: all you can do is push the gotchas and surprises to somewhere else. Oh, another thought... Decimals could gain yet another conversion method, one which implicitly uses the float repr, but signals if it was an inexact conversion or not. Explicitly calling repr can never signal, since the conversion occurs outside of the Decimal constructor and Decimal sees only the string: Decimal(repr(1.3)) cannot signal Inexact. But:

    Decimal.from_nearest_float(1.5)  # exact
    Decimal.from_nearest_float(1.3)  # signals Inexact

That might be useful, but probably not to beginners. -- Steve
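(A rough standalone sketch of how the proposed from_nearest_float could behave -- an editorial illustration, not Steven's code; a real implementation inside the decimal module would hook into the context's signal machinery rather than poking the flag directly:)

    from decimal import Decimal, Inexact, getcontext

    def from_nearest_float(x):
        nearest = Decimal(repr(x))      # e.g. Decimal('1.3')
        if nearest != Decimal(x):       # Decimal(x) is the exact binary value
            ctx = getcontext()
            ctx.flags[Inexact] = True   # record the condition on the context
            if ctx.traps[Inexact]:
                raise Inexact("float did not convert exactly via its repr")
        return nearest

    # from_nearest_float(1.5) -> Decimal('1.5'), no signal (1.5 is exact in binary)
    # from_nearest_float(1.3) -> Decimal('1.3'), with the Inexact flag set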

On Jun 1, 2015, at 18:58, Steven D'Aprano <steve@pearwood.info> wrote:
As far as I know, GDAS doesn't specify anything about implicit conversion from floats, as long as the required explicit conversion function (which I think is from_float?) exists and does the required thing. As a side note, has anyone considered whether it's worth switching to IEEE-754-2008 as the controlling specification? There may be a good reason not to do so; I'm just curious whether someone has thought it through and made the case.
I think this might be worth having whether the default constructor is changed or not. I can't think of too many programs where I'm pretty sure I have an exactly-representable decimal as a float but want to check to be sure... but for interactive use in IPython (especially when I'm specifically trying to explain to someone why just using Decimal instead of float will/will not solve their problem) I could see using it.

On 2 June 2015 at 13:10, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
As far as I know, nobody has looked into it. If there aren't any meaningful differences, we should just switch, if there are differences, we should probably switch anyway, but it will be more work (and hence will require volunteers willing to do that work). Either way, the starting point would be an assessment of what the differences are, and whether or not they have any implications for the decimal module and cdecimal. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Monday, June 1, 2015 10:23 PM, "random832@fastmail.us" <random832@fastmail.us> wrote:
Does IEEE even have anything about arbitrary-precision decimal types
(which are what decimal/cdecimal are)?
Yes. When many people say "IEEE float" they still mean 754-1985. This is what C90 was designed to "support without quite supporting", and what C99 explicitly supports, and what many consumer FPUs support (or, in the case of the 8087 and its successors, a preliminary version of the 1985 standard). That standard did not cover either arbitrary precision or decimals; both of those were only part of the companion standard 854 (which isn't complete enough to base an implementation on). But the current version of the standard, 754-2008, does cover arbitrary-precision decimal types. If I understand the relationship between the standards: 754-2008 was designed to merge 754-1985 and 854-1987, fill in the gaps, and fix any bugs; GDAS was a major influence (the committee chair was GDAS's author); and since 2009 GDAS has gone from being a de facto independent standard to being a more-specific specification of the relevant subset of 754-2008. IBM's hardware and Java library implement GDAS (and therefore implicitly the relevant part of 754-2008); Itanium (partly), C11, the gcc extensions, and Intel's C library implement 754-2008 (or IEC 60559, which is just a republished 754-2008). So, my guess is that GDAS makes perfect sense to follow unless Python wants to expose C11's native fixed decimals, or the newer math.h functions from C99/C11/C14, or the other parts of 754-2008 that it doesn't support (like arbitrary-precision binary). My question was just whether someone had actually made that decision, or whether decimal is following GDAS just because that was the obvious decision to make in 2003.

On 02.06.2015 08:40, Andrew Barnert via Python-ideas wrote:
The IBM decimal implementation by Mike Cowlishaw was chosen as the basis for Python's decimal implementation, so yes, this was an explicit design choice at the time: http://legacy.python.org/dev/peps/pep-0327/ http://speleotrove.com/decimal/ According to the PEP, decimal implements IEEE 854-1987 (with some restrictions). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 02 2015)

Andrew Barnert wrote:
How about a more general Decimal.from_exact that does the same for an argument of any type – float, int, Decimal object with possibly different precision, fraction, string. Just convert the argument to Decimal and signal if it cannot be done losslessly. The same constructor with the same semantics could be added to int, float, Fraction as well. Regards, Drekin

Stephen J. Turnbull writes:
What if 13/10 also yielded a fraction? Anyway, what are the objections to integer division returning a fraction? Fractions are coerced to floats when mixed with them. Also, the repr of the Fraction class could be altered so that repr(13 / 10) == "13 / 10" would hold. Regards, Drekin

On Jun 3, 2015, at 07:29, drekin@gmail.com wrote:
That was raised near the start of the thread. In fact, I think the initial proposal was that 13/10 evaluated to Fraction(13, 10) and 1.2 evaluated to something like Fraction(12, 10).
Anyway, what are the objections to integer division returning a fraction? They are coerced to floats when mixed with them.
As mentioned earlier in the thread, the language that inspired Python, ABC, used exactly this design: computations were kept as exact rationals until you mixed them with floats or called irrational functions like root. So it's not likely Guido didn't think of this possibility; he deliberately chose not to do things this way. He even wrote about this a few years ago; search for "integer division" on his Python-history blog. So, what are the problems? When you stay with exact rationals through a long series of computations, the result can grow to be huge in memory and processing time. (I'm ignoring the fact that CPython doesn't even have a fast fraction implementation, because one could be added easily. It's still going to be orders of magnitude slower to add two fractions with gigantic denominators than to add the equivalent floats or decimals.) Plus, it's not always obvious when you've lost exactness. For example, exponentiation between rationals is exact only if the power simplifies to a whole number (and hasn't itself become a float somewhere along the way). Since the fractions module doesn't have IEEE-style flags for inexactness/rounding, it's harder to notice when this happens. Except in very trivial cases, the repr would be much less human-readable and -debuggable, not more. (Or do you find 1728829813 / 2317409 easier to understand than 746.0184253189661?) Fractions and Decimals can't be mixed or interconverted directly. There are definitely cases where a rational type is the right thing to use (it wouldn't be in the stdlib otherwise), but I think they're less common than the cases where a floating-point type (whether binary or decimal) is the right thing to use. (And even many cases where you think you want rationals, what you actually want is SymPy-style symbolic computation--which can give you exact results for things with roots or sins or whatever as long as they cancel out in the end.)
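(To make the unbounded-growth point concrete -- a small editorial sketch, not from the original message; the recurrence is arbitrary and the exact digit count will vary:)

    from fractions import Fraction

    x = Fraction(1, 3)
    for _ in range(10):
        x = x * x + Fraction(1, 7)       # denominators roughly square on every step

    print(len(str(x.denominator)))       # on the order of 900 digits after just ten steps
    print(float(x))                      # the float equivalent still fits in 8 bytes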

On 2 Jun 2015 01:04, "David Mertz" <mertz@gnosis.cx> wrote:
Decimal literals are far from as obvious as suggested. We *have* the
`decimal` module after all, and it defines all sorts of parameters on precision, rounding rules, etc. that one can provide context for. decimal.ROUND_HALF_DOWN is "the obvious way" for some users, while decimal.ROUND_CEILING is "the obvious way" for others.
I like decimals, but they don't simply make all the mathematical answers
result in what all users would consider "do what I mean" either.

The last time we had a serious discussion about decimal literals, we realised the fact that their behaviour is context dependent posed a significant problem for providing a literal form. With largely hardware provided IEEE754 semantics, binary floats are predictable, albeit somewhat surprising if you're expecting abstract math behaviour (i.e. no rounding errors), or finite base 10 representation behaviour. By contrast, decimal arithmetic deliberately allows for configurable contexts, presumably because financial regulations sometimes place strict constraints on how arithmetic is to be handled (e.g. "round half even" is also known as "banker's rounding", since it eliminates statistical bias in rounding financial transactions to the smallest supported unit of currency). That configurability makes decimal more fit for its primary intended use case (i.e. financial math), but also makes local reasoning harder - the results of some operations (even something as simple as unary plus) may vary based on the configured context (the precision, in particular). Cheers, Nick.

On Mon, Jun 01, 2015 at 06:27:57AM +0000, Nicholas Chammas wrote:
I wish this myth about Decimals would die, because it isn't true. The only advantage of base-10 floats over base-2 floats -- and I'll admit it can be a big advantage -- is that many of the numbers we commonly care about can be represented in Decimal exactly, but not as base-2 floats. In every other way, Decimals are no more user friendly, understandable, or unsurprising than floats. Decimals violate all the same rules of arithmetic that floats do. This should not come as a surprise, since decimals *are* floats, they merely use base 10 rather than base 2.

In the past, I've found that people are very resistant to this fact, so I'm going to show a few examples of how Decimals violate the fundamental laws of mathematics just as floats do. For those who already know this, please forgive me belabouring the obvious.

In mathematics, adding anything other than zero to a number must give you a different number. Decimals violate that expectation just as readily as binary floats:

py> from decimal import Decimal as D
py> x = D(10)**30
py> x == x + 100  # should be False
True

Apart from zero, multiplying a number by its inverse should always give one. Again, violated by decimals:

py> one_third = 1/D(3)
py> 3*one_third == 1
False

Inverting a number twice should give the original number back:

py> 1/(1/D(7)) == 7
False

Here's a violation of the Associativity Law, which states that (a+b)+c should equal a+(b+c) for any values a, b, c:

py> a = D(1)/17
py> b = D(5)/7
py> c = D(12)/13
py> (a + b) + c == a + (b+c)
False

(For the record, it only took me two attempts, and a total of about 30 seconds, to find that example, so it's not particularly difficult to come across such violations.)

Here's a violation of the Distributive Law, which states that a*(b+c) should equal a*b + a*c:

py> a = D(15)/2
py> b = D(15)/8
py> c = D(1)/14
py> a*(b+c) == a*b + a*c
False

(I'll admit that was a bit trickier to find.)

This one is a bit subtle, and to make it easier to see what is going on I will reduce the number of digits used. When you take the average of two numbers x and y, mathematically the average must fall *between* x and y. With base-2 floats, we can't guarantee that the average will be strictly between x and y, but we can be sure that it will be either between the two values, or equal to one of them. But base-10 Decimal floats cannot even guarantee that. Sometimes the calculated average falls completely outside of the inputs.

py> from decimal import getcontext
py> getcontext().prec = 3
py> x = D('0.516')
py> y = D('0.518')
py> (x+y)/2  # should be 0.517
Decimal('0.515')

This one is even worse:

py> getcontext().prec = 1
py> x = D('51.6')
py> y = D('51.8')
py> (x+y)/2  # should be 51.7
Decimal('5E+1')

Instead of the correct answer of 51.7, Decimal calculates the answer as 50 exactly.
Performance and accuracy will always be better for binary floats. Binary floats are faster, and have stronger error bounds and slower-growing errors. Decimal floats suffer from the same problems as binary floats, only more so, and are slower to boot.
There's a lot wrong with that.

- The sorts of errors we see with floats are not "madness", but the completely logical consequences of what happens when you try to do arithmetic in anything less than the full mathematical abstraction.

- And they aren't rare either -- they're incredibly common. Fortunately, most of the time they don't matter, or aren't obvious, or both.

- Decimals don't behave like the numbers you learn in school either. Floats are not real numbers, regardless of which base you use. And in fact, the smaller the base, the smaller the errors. Binary floats are better than decimals in this regard. (Decimals *only* win out due to human bias: we don't care too much that 1/7 cannot be expressed exactly as a float using *either* binary or decimal, but we do care about 1/10. And we conveniently ignore the case of 1/3, because familiarity breeds contempt.)

- Being at least vaguely aware of floating point issues shouldn't be difficult for anyone who has used a pocket calculator. And yet every day brings in another programmer surprised by floats.

- It's not really a rite of passage, that implies that it is arbitrary and imposed culturally. Float issues aren't arbitrary, they are baked into the very nature of the universe. You cannot hope to perform infinitely precise real-number arithmetic using just a finite number of bits of storage, no matter what system you use. Fixed-point maths has its own problems, as does rational maths. All you can do is choose to shift the errors from some calculations to other calculations, you cannot eliminate them altogether.

-- Steve

On 1 June 2015 at 15:58, Steven D'Aprano <steve@pearwood.info> wrote:
There is one other "advantage" to decimals - they behave like electronic calculators (which typically used decimal arithmetic). This is a variation of "human bias" - we (if we're of a certain age, maybe today's youngsters are less used to the vagaries of electronic calculators :-)) are used to seeing 1/3 displayed as 0.33333333, and showing that 1/3*3 = 0.99999999 was a "fun calculator fact" when I was at school. Paul

Well, I learned a lot about decimals today. :)

On Mon, Jun 1, 2015 at 3:08 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

In a world of binary computers, no programming language is free of those constraints - if you choose decimal literals as your default, you take a *big* performance hit, because computers are designed as binary systems. (Some languages, like IBM’s REXX, do choose to use decimal integers by default)

I guess it’s a non-trivial tradeoff. But I would lean towards considering people likely to be affected by the performance hit as doing something “not common”. Like, if they are doing that many calculations that it matters, perhaps it makes sense to ask them to explicitly ask for floats vs. decimals, in exchange for giving the majority who wouldn’t notice a performance difference a better user experience.

On Mon, Jun 1, 2015 at 10:58 AM, Steven D’Aprano <steve@pearwood.info> wrote:

I wish this myth about Decimals would die, because it isn’t true.

Your email had a lot of interesting information about decimals that would make a good blog post, actually. Writing one up will perhaps help kill this myth in the long run :)

In the past, I’ve found that people are very resistant to this fact, so I’m going to show a few examples of how Decimals violate the fundamental laws of mathematics just as floats do.

How many of your examples are inherent limitations of decimals vs. problems that can be improved upon? Admittedly, the only place where I’ve played with decimals extensively is on Microsoft’s SQL Server (where they are the default literal <https://msdn.microsoft.com/en-us/library/ms179899.aspx>). I’ve stumbled in the past on my own decimal gotchas <http://dba.stackexchange.com/q/18997/2660>, but looking at your examples and trying them on SQL Server I suspect that most of the problems you show are problems of precision and scale. Perhaps Python needs better rules for how precision and scale are affected by calculations (here are SQL Server’s <https://msdn.microsoft.com/en-us/library/ms190476.aspx>, for example), or better defaults when they are not specified?

Anyway, here’s what happens on SQL Server for some of the examples you provided.

Adding 100:

py> from decimal import Decimal as D
py> x = D(10)**30
py> x == x + 100  # should be False
True

DECLARE @x DECIMAL(38,0) = '1' + REPLICATE(0, 30);
IF @x = @x + 100
    SELECT 'equal' AS adding_100
ELSE
    SELECT 'not equal' AS adding_100

Gives “not equal” <http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1/1645/0>. Leaving out the precision when declaring @x (i.e. going with the default precision of 18 <https://msdn.microsoft.com/en-us/library/ms187746.aspx>) immediately yields an understandable data truncation error.

Associativity:

py> a = D(1)/17
py> b = D(5)/7
py> c = D(12)/13
py> (a + b) + c == a + (b+c)
False

DECLARE @a DECIMAL = 1.0/17;
DECLARE @b DECIMAL = 5.0/7;
DECLARE @c DECIMAL = 12.0/13;
IF (@a + @b) + @c = @a + (@b + @c)
    SELECT 'equal' AS associative
ELSE
    SELECT 'not equal' AS associative

Gives “equal” <http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1/1656/0>.

Distributivity:

py> a = D(15)/2
py> b = D(15)/8
py> c = D(1)/14
py> a*(b+c) == a*b + a*c
False

DECLARE @a DECIMAL = 15.0/2;
DECLARE @b DECIMAL = 15.0/8;
DECLARE @c DECIMAL = 1.0/14;
IF @a * (@b + @c) = @a*@b + @a*@c
    SELECT 'equal' AS distributive
ELSE
    SELECT 'not equal' AS distributive

Gives “equal” <http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1/1655/0>.
I think some of the other decimal examples you provide, though definitely not 100% beginner friendly, are still way more human-friendly because they are explainable in terms of precision and scale, which we can understand more simply ("there aren't enough decimal places to carry the result") and which have parallels in other areas of life as Paul pointed out.

- The sorts of errors we see with floats are not "madness", but the completely logical consequences of what happens when you try to do arithmetic in anything less than the full mathematical abstraction.

I don't mean madness as in incorrect, I mean madness as in difficult to predict and difficult to understand. Your examples do show that it isn't all roses and honey with decimals, but do you find it easier to understand and explain all the weirdness of floats vs. decimals? Understanding float weirdness (and disclaimer: I don't) seems to require understanding some hairy stuff, and even then it is not predictable because there are platform dependent issues. Understanding decimal "weirdness" seems to require only understanding precision and scale, and after that it is mostly predictable.

Nick

On Jun 1, 2015, at 10:24, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
Obviously if you know the maximum precision needed before you start and explicitly set it to something big enough (or 7 places bigger than needed) you won't have any problem. Steven chose a low precision just to make the problems easy to see and understand; he could just as easily have constructed examples for a precision of 18. Unfortunately, even in cases where it is both possible and sufficiently efficient to work out and set the precision high enough to make all of your calculations exact, that's not something most people know how to do reliably. In the fully general case, it's as hard as calculating error propagation. As for the error: Python's decimal flags that too; the difference is that the flag is ignored by default. You can change it to warn or error instead. Maybe the solution is to make that easier--possibly just changing the docs. If you read the whole thing you will eventually learn that the default context ignores most such errors, but a one-liner gets you a different context that acts like SQL Server, but who reads the whole module docs (especially when they already believe they understand how decimal arithmetic works)? Maybe moving that up near the top would be useful?
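For what it's worth, turning the ignored flags into hard errors is already a one-liner today; a small sketch using the stdlib decimal signals:

    from decimal import Decimal, localcontext, Inexact

    with localcontext() as ctx:
        ctx.traps[Inexact] = True    # raise instead of silently setting a flag
        try:
            Decimal(1) / Decimal(3)
        except Inexact:
            print("1/3 is not exactly representable at this precision")

Whether that should be closer to the default, or simply easier to discover in the docs, is the open question.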

On Mon, Jun 1, 2015 at 6:15 PM Andrew Barnert <abarnert@yahoo.com> wrote:
Perhaps Python needs better rules for how precision and scale are affected by calculations (here are SQL Server’s <https://msdn.microsoft.com/en-us/library/ms190476.aspx>, for example), or better defaults when they are not specified? It sounds like there are perhaps several improvements that can be made to how decimals are handled, documented, and configured by default, that could possibly address the majority of gotchas for the majority of people in a more user friendly way than can be accomplished with floats. For all the problems presented with decimals by Steven and others, I’m not seeing how overall they are supposed to be *worse* than the problems with floats. We can explain precision and scale to people when they are using decimals and give them a basic framework for understanding how they affect calculations, and we can pick sensible defaults so that people won’t hit nasty gotchas easily. So we have some leverage there for making the experience better for most people most of the time. What’s our leverage for improving the experience of working with floats? And is the result really something better than decimals? Nick

On Jun 1, 2015, at 15:53, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
I definitely agree that some edits to the decimal module docs, plus maybe a new HOWTO, and maybe some links to outside resources that explain things to people who are used to decimals in MSSQLServer or REXX or whatever, would be helpful. The question is, who has sufficient knowledge, skill, and time/inclination to do it?
It sounds like there are perhaps several improvements that can be made to how decimals are handled, documented, and configured by default, that could possibly address the majority of gotchas for the majority of people in a more user friendly way than can be accomplished with floats.
For all the problems presented with decimals by Steven and others, I’m not seeing how overall they are supposed to be worse than the problems with floats.
They're not worse than the problems with floats, they're the same problems... But the _effect_ of those problems can be worse, because:

* The magnitude of the rounding errors is larger.
* People mistakenly think they understand everything relevant about decimals, and the naive tests they try work out, so the problems may blindside them.
* Being much more detailed and configurable means the best solution may be harder to find.
* There's a lot of correct but potentially-misleading information out there. For example, any StackOverflow answer that says "you can solve this particular problem by using Decimal instead of float" can be very easily misinterpreted as applying to a much wider range of problems than it actually does.
* Sometimes performance matters.

On the other hand, the effect can also be less bad, because:

* Once people do finally understand a given problem, at least for many people and many problems, working out a solution is easier in decimal. For some uses (in particular, many financial uses, and some kinds of engineering problems), it's even trivial.
* Being more detailed and more configurable means the best solution may be better than any solution involving float.

I don't think there's any obvious answer to the tradeoff, short of making it easier for people to choose appropriately: a good HOWTO, decimal literals or Swift-style float-convertibles, making it easier to find/construct decimal64 or DECIMAL(18) or Money types, speeding up decimal (already done, but maybe more could be done), etc.
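As a rough sketch of the kind of "Money type" convenience being alluded to (a hypothetical helper built on the existing decimal API, not an existing type):

    from decimal import Decimal, ROUND_HALF_EVEN

    CENT = Decimal('0.01')

    def to_money(value):
        # Hypothetical convenience wrapper: round to whole cents, banker's rounding.
        return Decimal(value).quantize(CENT, rounding=ROUND_HALF_EVEN)

    print(to_money('691.0500000237'))   # Decimal('691.05')
    print(to_money('2.675'))            # Decimal('2.68'), half rounds to even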

Nicholas, Your email client appears to not be quoting text you quote. It is a conventional to use a leading > for quoting, perhaps you could configure your mail program to do so? The good ones even have a "Paste As Quote" command. On with the substance of your post... On Mon, Jun 01, 2015 at 01:24:32PM -0400, Nicholas Chammas wrote:
Changing from binary floats to decimal floats by default is a big, backwards incompatible change. Even if it's a good idea, we're constrained by backwards compatibility: I would imagine we wouldn't want to even introduce this feature until the majority of people are using Python 3 rather than Python 2, and then we'd probably want to introduce it using a "from __future__ import decimal_floats" directive. So I would guess this couldn't happen until probably 2020 or so. But we could introduce a decimal literal, say 1.1d for Decimal("1.1"). The first prerequisite is that we have a fast Decimal implementation, which we now have. Next we would have to decide how the decimal literals would interact with the decimal module. Do we include full support of the entire range of decimal features, including globally configurable precision and other modes? Or just a subset? How will these decimals interact with other numeric types, like float and Fraction? At the moment, Decimal isn't even part of the numeric tower. There's a lot of ground to cover, it's not a trivial change, and will definitely need a PEP.
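For context, the current interaction rules are easy to check (plain CPython behaviour): Decimal refuses to mix with both float and Fraction in arithmetic, and it sits outside the numeric tower ABCs.

    from decimal import Decimal
    from fractions import Fraction
    import numbers

    try:
        Decimal('1.1') + 0.1             # Decimal and binary float don't mix
    except TypeError as exc:
        print("Decimal + float:", exc)

    try:
        Decimal('1.1') + Fraction(1, 3)  # nor do Decimal and Fraction
    except TypeError as exc:
        print("Decimal + Fraction:", exc)

    print(isinstance(Decimal('1.1'), numbers.Real))   # False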
How many of your examples are inherent limitations of decimals vs. problems that can be improved upon?
In one sense, they are inherent limitations of floating point numbers regardless of base. Whether binary, decimal, hexadecimal as used in some IBM computers, or something else, you're going to see the same problems. Only the specific details will vary, e.g. 1/3 cannot be represented exactly in base 2 or base 10, but if you constructed a base 3 float, it would be exact.

In another sense, Decimal has a big advantage that it is much more configurable than Python's floats. Decimal lets you configure the precision, rounding mode, error handling and more. That's not inherent to base 10 calculations, you can do exactly the same thing for binary floats too, but Python doesn't offer that feature for floats, only for Decimals.

But no matter how you configure Decimal, all you can do is shift the gotchas around. The issue really is inherent to the nature of the problem, and you cannot defeat the universe. Regardless of what base you use, binary or decimal or something else, or how many digits precision, you're still trying to simulate an uncountably infinite continuous, infinitely divisible number line using a finite, discontinuous set of possible values. Something has to give.

(For the record, when I say "uncountably infinite", I don't just mean "too many to count", it's a technical term. To oversimplify horribly, it means "larger than infinity" in some sense. It's off-topic for here, but if anyone is interested in learning more, you can email me off-list, or google for "countable vs uncountable infinity".)

Basically, you're trying to squeeze an infinite number of real numbers into a finite amount of memory. It can't be done. Consequently, there will *always* be some calculations where the true value simply cannot be calculated and the answer you get is slightly too big or slightly too small. All the other floating point gotchas follow from that simple fact.
No. Change the precision and scale, and some *specific* problems go away, but they reappear with other numbers. Besides, at the point that you're talking about setting the precision, we're really not talking about making things easy for beginners any more. And not all floating point issues are related to precision and scale in decimal. You cannot divide a cake into exactly three equal pieces in Decimal any more than you can divide a cake into exactly three equal pieces in binary. All you can hope for is to choose a precision where the rounding errors in one part of your calculation will be cancelled by the rounding errors in another part of your calculation. And that precision will be different for any two arbitrary calculations. -- Steve
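A quick sketch of the cake-into-three-pieces point: raising the precision just moves the error around, it never removes it.

    from decimal import Decimal, localcontext

    for prec in (28, 100, 1000):
        with localcontext() as ctx:
            ctx.prec = prec
            third = Decimal(1) / Decimal(3)
            print(prec, third * 3 == 1)   # False at every precision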

On Tue, Jun 2, 2015 at 12:58 AM, Steven D'Aprano <steve@pearwood.info> wrote:
To be fair, you've actually destroyed precision so much that your numbers start out effectively equal:
They're not actually showing up as equal, but only because the precision setting doesn't (apparently) apply to the constructor. If adding zero to both sides of an equation makes it equal when it wasn't before, something seriously screwy is going on. (Actually, this behaviour of decimal.Decimal reminds me very much of REXX. Since there are literally no data types in REXX (everything is a string), the numeric precision setting ("NUMERIC DIGITS n") applies only to arithmetic operations, so the same thing of adding zero to both sides can happen.) So what you're really doing here is averaging 5E+1 and 5E+1, with an unsurprising result of... 5E+1. Your other example is more significant here, because your numbers actually do fit inside the precision limits - and then the end result slips outside the bounds. ChrisA
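A small sketch of the constructor-versus-arithmetic distinction being described here (stdlib decimal behaviour):

    from decimal import Decimal, localcontext

    with localcontext() as ctx:
        ctx.prec = 1
        x = Decimal('51.6')
        print(x)        # 51.6  -- the constructor ignores the precision setting
        print(x + 0)    # 5E+1  -- but any arithmetic operation rounds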

On Mon, Jun 1, 2015, at 10:58, Steven D'Aprano wrote:
But people have been learning about those rules, as they apply to decimals, since they were small children. They know intuitively that 2/3 rounds to ...6667 at some point because they've done exactly that by hand. "user friendly" and "understandable to beginners" don't arise in a vacuum.

You are definitely right in "float vs. Decimal as representation of a real", but there is also a syntactical point that interpreting a float literal as Decimal rather than binary float is more natural since the literal itself *is* decimal. Then there would be no counterpart of the following situation if the float literal were interpreted as Decimal rather than binary float.
Regards, Drekin

On 6/1/2015 2:02 AM, Jim Witschey wrote:
No, it is an idea presented here and on other python lists. Example: just today, Laura Creighton wrote on python-list (Re: What is considered an "advanced" topic in Python?)
There is no PEP AFAIK because no one has bothered to write one that is sure to be rejected. -- Terry Jan Reedy

On Sun, May 31, 2015, at 22:25, u8y7541 The Awesome Person wrote:
Even though he's mistaken about the core premise, I do think there's a kernel of a good idea here - it would be nice to have a method (maybe as_integer_ratio, maybe with some parameter added, maybe a different method) that returns the ratio with the smallest denominator that would result in exactly the original float if divided out, rather than merely the smallest power of two.

On Sun, May 31, 2015 at 8:14 PM, <random832@fastmail.us> wrote:
What is the computational complexity of a hypothetical float.as_simplest_integer_ratio() method? How hard that is to find is not obvious to me (probably it should be, but I'm not sure). -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On 31May2015 20:27, David Mertz <mertz@gnosis.cx> wrote:
Probably the same as Euclid's greatest common factor method. About log(n) I think. Take as_integer_ratio, find greatest common factor, divide both by that. Cheers, Cameron Simpson <cs@zip.com.au> In the desert, you can remember your name, 'cause there ain't no one for to give you no pain. - America

On Mon, Jun 1, 2015, at 00:37, Cameron Simpson wrote:
Er, no, because (6004799503160661, 18014398509481984) are already mutually prime, and we want (1, 3). This is a different problem from finding a reduced fraction. There are algorithms, I know, for constraining the denominator to a specific range (Fraction.limit_denominator does this), but that's not *quite* the same as finding the lowest one that will still convert exactly to the original float

On 01Jun2015 01:11, random832@fastmail.us <random832@fastmail.us> wrote:
Ah, you want the simplest fraction that _also_ gives the same float representation?
Hmm. Thanks for this clarification. Cheers, Cameron Simpson <cs@zip.com.au> The Design View editor of Visual InterDev 6.0 is currently incompatible with Compatibility Mode, and may not function correctly. - George Politis <george@research.canon.com.au>, 22apr1999, quoting http://msdn.microsoft.com/vstudio/technical/ie5.asp

On Sun, May 31, 2015 at 8:27 PM, David Mertz <mertz@gnosis.cx> wrote:
Here is a (barely tested) implementation based on the Stern-Brocot tree:

    def as_simple_integer_ratio(x):
        x = abs(float(x))
        left = (int(x), 1)
        right = (1, 0)
        while True:
            mediant = (left[0] + right[0], left[1] + right[1])
            test = mediant[0] / mediant[1]
            print(left, right, mediant, test)
            if test == x:
                return mediant
            elif test < x:
                left = mediant
            else:
                right = mediant

    print(as_simple_integer_ratio(41152/263))

The approximations are printed so you can watch the convergence. casevh

On Sun, May 31, 2015 at 8:14 PM, <random832@fastmail.us> wrote:
The gmpy2 library already supports such a method.
gmpy2 uses a version of the Stern-Brocot algorithm to find the shortest fraction that, when converted to a floating point value, will return the same value as the original floating point value. The implementation was originally done by Alex Martelli; I have just maintained it over the years. The algorithm is quite fast. If there is a consensus to add this method to Python, I would be willing to help implement it. casevh

On Wed, Jun 03, 2015 at 09:08:17AM -0700, drekin@gmail.com wrote:
Guido's time machine strikes again:

py> Fraction(0.1).limit_denominator(1000)
Fraction(1, 10)

Fraction.simple_from(Decimal(1) / Decimal(3))
Fraction(1, 3)

py> Fraction(Decimal(1)/Decimal(3)).limit_denominator(100)
Fraction(1, 3)

-- Steve

On Mon, Jun 1, 2015 at 12:25 PM, u8y7541 The Awesome Person <surya.subbarao1@gmail.com> wrote:
I will be presenting a modification to the float class, which will improve its speed and accuracy (reduce floating point errors). This is applicable because Python uses a numerator and denominator rather than a sign and mantissa to represent floats.
First, I propose that a float's integer ratio should be accurate. For example, (1 / 3).as_integer_ratio() should return (1, 3). Instead, it returns(6004799503160661, 18014398509481984).
I think you're misunderstanding the as_integer_ratio method. That isn't how Python works internally; that's a service provided for parsing out float internals into something more readable. What you _actually_ are working with is IEEE 754 binary64. (Caveat: I have no idea what Python-the-language stipulates, nor what other Python implementations use, but that's what CPython uses, and you did your initial experiments with CPython. None of this discussion applies *at all* if a Python implementation doesn't use IEEE 754.) So internally, 1/3 is stored as: 0 <-- sign bit (positive) 01111111101 <-- exponent (1021) 0101010101010101010101010101010101010101010101010101 <-- mantissa (52 bits, repeating) The exponent is offset by 1023, so this means 1.010101.... divided by 2²; the original repeating value is exactly equal to 4/3, so this is correct, but as soon as it's squeezed into a finite-sized mantissa, it gets rounded - in this case, rounded down. That's where your result comes from. It's been rounded such that it fits inside IEEE 754, and then converted back to a fraction afterwards. You're never going to get an exact result for anything with a denominator that isn't a power of two. Fortunately, Python does offer a solution: store your number as a pair of integers, rather than as a packed floating point value, and all calculations truly will be exact (at the cost of performance):
This is possibly more what you want to work with. ChrisA

Teachable moments about the implementation of floating-point aside, something in this neighborhood has been considered and rejected before, in PEP 240. However, that was in 2001 - it was apparently created the same day as PEP 237, which introduced transparent conversion of machine ints to bignums in the int type. I think hiding hardware number implementations has been a success for integers - it's a far superior API. It could be for rationals as well. Has something like this thread's original proposal - interpreting decimal-number literals as fractional values and using fractions as the result of integer arithmetic - been seriously discussed more recently than PEP 240? If so, why hasn't it been implemented? Perhaps enough has changed that it's worth reconsidering. On Sun, May 31, 2015 at 22:49 Chris Angelico <rosuav@gmail.com> wrote:

On Sun, May 31, 2015, at 23:21, Jim Witschey wrote:
I think hiding hardware number implementations has been a success for integers - it's a far superior API. It could be for rationals as well.
I'd worry about unbounded complexity. For rationals, unlike integers, values don't have to be large for their bignum representation to be large.
Also, it raises a question of string representation. Granted, "1/3" becomes much more defensible as the repr of Fraction(1, 3) if it in fact evaluates to that value, but how much do you like "6/5" as the repr of 1.2? Or are we going to use Fractions for integer division and Decimals for literals? And, what of decimal division? Right now you can't even mix Fraction and Decimal in arithmetic operations. And are we going to add %e %f and %g support for both types? Directly so, without any detour to float and its limitations (i.e. %.100f gets you 100 true decimal digits of precision)? Current reality:
Okay, that's one case right out of four.

On Sun, May 31, 2015 at 11:37 PM, <random832@fastmail.us> wrote:
I'd expect rational representations to be reasonably small until a value was operated on many times, in which case you're using more space, but representing the result very precisely. It's a tradeoff, but with a small cost in the common case. I'm no expert, though -- am I not considering some case?
how much do you like "6/5" as the repr of 1.2?
6/5 is an ugly representation of 1.2, but consider the current state of affairs:
1.2 1.2
"1.2" is imprecisely interpreted as 1.2000000476837158 * (2^0), which is then imprecisely represented as 1.2. I recognize this is the way we've dealt with non-integer numbers for a long time, but "1.2" => SomeKindOfRational(6, 5) => "6/5" is conceptually cleaner.
Or are we going to use Fractions for integer division and Decimals for literals?
I had been thinking of rationals built on bignums all around, a la Haskell. Is Fraction as it exists today up to it? I don't know. I agree that some principled decisions would have to be made for, e.g., interpretation by format strings.

On May 31, 2015, at 20:37, random832@fastmail.us wrote:
That's the big problem. There's no one always-right answer.

If you interpret the literal 1.20 as a Fraction, it's going to be more confusing, not less, to people who are just trying to add up dollars and cents. Do a long financial computation and, instead of $691.05 as you expected or $691.0500000237 as you get today, you've got 10215488088 / 14782560. Not to mention that financial calculations often tend to involve things like e or exponentiation to non-integral powers, and what happens then? And then of course there's the unbounded size issue. If you do a long chain of operations that can theoretically be represented exactly followed by one that can't, you're wasting a ton of time and space for those intermediate values (and, unlike Haskell, Python can't look at the whole expression in advance and determine what the final type will be).

On the other hand, if you interpret 1.20 as a Decimal, now you can't sensibly mix 1.20 * 3/4 without coming up with a rule for how decimal and fraction types should interact. (OK, there's an obvious right answer for multiplication, but what about for addition?) And either one leads to people asking why the code they ported from Java or Ruby is broken on Python.

You could make it configurable, so integer division is your choice of float, fraction, or decimal and decimal literals are your separate choice of the same three (and maybe also let fraction exponentiation be your choice of decimal and float), but then which setting is the default? Also, where do you set that? It has to be available at compile time, unless you want to add new types like "decimal literal" at compile time that are interpreted appropriately at runtime (which some languages do, and it works, but it definitely adds complexity).

Maybe the answer is just to make it easier to be explicit, using something like C++ literal suffixes, so you can write, e.g., 1.20d or 1/3f (and I guess 1.2f) instead of Decimal('1.20') or Fraction(1, 3) (and Fraction(12, 10)).
At least here I think the answer is clear. %-substitution is printf-like, and shouldn't change. If you want formatting that can be overloaded by the type, you use {}, which already works.

Having some sort of decimal literal would have some advantages of its own, for one it could help against this silliness:

>>> Decimal(1.3)
Decimal('1.3000000000000000444089209850062616169452667236328125')

>>> Decimal('1.3')
Decimal('1.3')

I'm not saying that the actual data type needs to be a decimal (it might well be a float, but say shove the string repr next to it so it can be accessed when needed), but this is one really common pitfall for new users. I know it's easy to fix the code above, but this behavior is very unintuitive; you essentially get a really expensive float when you do the obvious thing. Not sure if this is worth the effort but it would help smooth some corners potentially.

On 01/06/2015 15:52, Joonas Liik wrote:
Far easier to point them to https://docs.python.org/3/library/decimal.html and/or https://docs.python.org/3/tutorial/floatingpoint.html -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence

On Mon, Jun 01, 2015 at 05:52:35PM +0300, Joonas Liik wrote:
Why is that silly? That's the actual value of the binary float 1.3 converted into base 10. If you want 1.3 exactly, you can do this:
py> Decimal('1.3')
Decimal('1.3')
Is that really so hard for people to learn?
You want Decimals to *lie* about what value they have? I think that's a terrible idea, one which would lead to a whole set of new and exciting surprises when using Decimal. Let me try to predict a few of the questions on Stackoverflow which would follow this change...

Why is equality so inaccurate in Python?

py> x = Decimal(1.3)
py> y = Decimal('1.3')
py> x, y
(Decimal('1.3'), Decimal('1.3'))
py> x == y
False

Why does Python insert extra digits into numbers when I multiply?

py> x = Decimal(1.3)
py> x
Decimal('1.3')
py> y = 10000000000000000*x
py> y - 13000000000000000
Decimal('0.444089209850062616169452667236328125')
Then don't do the obvious thing. Sometimes there really is no good alternative to actually knowing what you are doing.

Floating point maths is inherently hard, but that's not the problem. There are all sorts of things in programming which are hard, and people learn how to deal with them. The problem is that people *imagine* that floating point is simple, when it is not and can never be. We don't do them any favours by enabling that delusion.

If your needs are light, then you can ignore the complexities of floating point. You really can go a very long way by just rounding the results of your calculations when displaying them. But for anything more than that, we cannot just paper over the floating point complexities without creating new complexities that will burn people.

You don't have to become a floating point guru, but it really isn't onerous to expect people who are programming to learn a few basic programming skills, and that includes a few basic coping strategies for floating point.

-- Steve
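(A trivial sketch of the "round when displaying" strategy mentioned above:)

py> 0.1 + 0.2
0.30000000000000004
py> '{:.2f}'.format(0.1 + 0.2)
'0.30'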

On 02.06.2015 03:37, Steven D'Aprano wrote:
Joonas, I think you're approaching this from the wrong angle. People who want to get an exact decimal from a literal, will use the string representation to define it, not a float representation. In practice, you typically read the data from some file or stream anyway, so it already comes as string value and if you want to convert an actual float to a decimal, this will most likely not be done in a literal way, but instead by passed in to the Decimal constructor as variable, so there's no literal involved. It may be good to provide some alternative ways of converting a float to a decimal, e.g. one which uses the float repr logic to overcome things like repr(float(1.1)) == '1.1000000000000001' instead of a direct conversion:
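(A sketch of the kind of difference meant here, for illustration; this is not the original example:)

    from decimal import Decimal

    print(Decimal(1.1))        # direct conversion keeps the full binary expansion:
                               # 1.100000000000000088817841970012523233890533447265625
    print(Decimal(repr(1.1)))  # going through the float repr logic gives: 1.1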
These could be added as parameter to the Decimal constructor. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 02 2015)
::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

I think there is another discussion to have here, and that is making Decimal part of the language (__builtin(s)__) vs. part of the library (which implementations can freely omit). If it were part of the language, then maybe, just maybe, a literal syntax should be considered. As it stands, Decimal and Fraction are libraries - implementations of python are free to omit them (as I think some of the embedded platform implementations do), and it currently does not make a lick of sense to add syntax for something that is only in the library. On 6/2/2015 04:19, M.-A. Lemburg wrote:

On 2 June 2015 at 19:38, Alexander Walters <tritium-list@sdamon.com> wrote:
For decimal, the issues that keep it from becoming a literal are similar to those that keep it from becoming a builtin: configurable contexts are a core part of the decimal module's capabilities, and making a builtin type context dependent causes various problems when it comes to reasoning about a piece of code based on purely local information. Those problems affect human readers regardless, but once literals enter the mix, they affect all compile time processing as well.

On that front, I also finally found the (mammoth) thread from last year about the idea of using base 10 for floating point values by default: https://mail.python.org/pipermail/python-ideas/2014-March/026436.html

One of the things we eventually realised in that thread is that the context dependence problem, while concerning for a builtin type, is an absolute deal breaker for literals, because it means you *can't constant fold them* by calculating the results of expressions at compile time and storing the result directly into the code object (https://mail.python.org/pipermail/python-ideas/2014-March/026998.html).

This problem is illustrated by asking the following question: What is the result of "Decimal('1.0') + Decimal('1e70')"? Correct answer? Insufficient data (since we don't know the current decimal precision).

With the current decimal module, the configurable rounding behaviour is something you just need to learn about as part of adopting the module. Since that configurability is one of the main reasons for using it over binary floating point, that's generally not a big deal. It becomes a much bigger deal when the question being asked is: What is the result of "1.0d + 1e70d"? Those look like they should be numeric constants, and hence the compiler should be able to constant fold them at compile time. That's possible if we were to pick a single IEEE decimal type as a builtin (either decimal64 or decimal128), but not possible if we tried to use the current variable precision decimal type.

One of the other "fun" discrepancies introduced by the context sensitive processing in decimals is that unary plus and minus are context-sensitive, which means that any literal format can't express arbitrary negative decimal values without a parser hack to treat the minus sign as part of the trailing literal. This is one of the other main reasons why decimal64 or decimal128 are better candidates for a builtin decimal type than decimal.Decimal as it exists today (as well as being potentially more amenable to hardware acceleration on some platforms).

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
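A sketch of why "insufficient data" is the right answer, using the stdlib decimal module with two different contexts:

    from decimal import Decimal, localcontext

    with localcontext() as ctx:
        ctx.prec = 28    # the default precision: the added 1.0 is rounded away
        print(Decimal('1.0') + Decimal('1e70'))

    with localcontext() as ctx:
        ctx.prec = 71    # wide enough to keep every digit: the sum is exact
        print(Decimal('1.0') + Decimal('1e70'))

Same expression, two different results, so there is nothing the compiler could safely fold.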

On Jun 2, 2015, at 05:44, random832@fastmail.us wrote:
The issue here isn't really binary vs. decimal, but rather that float implements a specific fixed-precision (binary) float type, and Decimal implements a configurable-precision (decimal) float type. As Nick explained elsewhere in that message, decimal64 or decimal128 wouldn't have the context problem. And similarly, a binary.Binary type designed like decimal.Decimal would have the context problem. (This is a slight oversimplification; there's also the fact that Decimal implements the full set of 754-2008 context features, while float implements a subset of 754-1985 features, and even that only if the underlying C lib does so, and nobody ever uses them anyway.)

On Jun 2, 2015, at 05:34, Nick Coghlan <ncoghlan@gmail.com> wrote:
OK, so what are the stumbling blocks to adding decimal32/64/128 (or just one of the three), either in builtins/stdtypes or in decimal, and then adding literals for them? I can imagine a few: someone has to work out exactly what features to support (the same things as float, or everything in the standard?), how it interacts with Decimal and float (which is defined by the standard, but translating that to Python isn't quite trivial), how it fits into the numeric tower ABCs, and what syntax to use for the literals, and if/how it fits into things like array/struct/ctypes and into math, and whether we need decimal complex values, and what the C API looks like (it would be nice if PyDecimal64_AsDecimal64 worked as expected on C11 platforms, but you could still use decimal64 on C90 platforms and just not get such functions...); then write a PEP; then write an implementation; and after all that work, the result may be seen as too much extra complexity (either in the language or in the implementation) for the benefits. But is that it, or is there even more that I'm missing? (Of course while we're at it, it would be nice to have arbitrary-precision IEEE binary floats as well, modeled on the decimal module, and to add all the missing 754-2008/C11 methods/math functions for the existing float type, but those seem like separate proposals from fixed-precision decimal floats.)

On 2 June 2015 at 14:05, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
I would argue that it should be as simple as float. If someone wants the rest of it they've got the Decimal module which is more than enough for their needs.
how it interacts with Decimal and float (which is defined by the standard, but translating that to Python isn't quite trivial),
Interaction between decimalN and Decimal coerces to Decimal. Interaction with floats is a TypeError.
how it fits into the numeric tower ABCs,
Does anyone really use these for anything? I haven't really found them to be very useful since no third-party numeric types use them and they don't really define the kind of information that you might really want in any carefully written numerical algorithm. I don't see any gain in adding any decimal types to e.g. Real as the ABCs seem irrelevant to me.
and what syntax to use for the literals, and if/how it fits into things like array/struct/ctypes
It's not essential to incorporate them here. If they become commonly used in C then it would be good to have these for binary compatibility.
and into math, and whether we need decimal complex values,
It's not common to use the math-style functions with the decimal module unless you're using it as a multi-precision library and then you'd really want the full Decimal type. There's no advantage in using decimal for e.g. sin, cos etc. so there's not much really lost in converting to binary and back. It's in the simple arithmetic where it makes a difference so I'd say that decimal should stick to that. As for complex decimals this would only really be worth it if the ultimate plan was to have decimals become the default floating point type. Laura suggested that earlier and I probably agree that it would have been a good idea at some earlier time but it's a bit late for that.
and what the C API looks like (it would be nice if PyDecimal64_AsDecimal64 worked as expected on C11 platforms, but you could still use decimal64 on C90 platforms and just not get such functions...);
Presumably CPython would have to write its own implementation e.g.: PyDecimal64_FromIntExponentAndLongSignificand ... or something like that.
then write a PEP; then write an implementation; and after all that work, the result may be seen as too much extra complexity (either in the language or in the implementation) for the benefits. But is that it, or is there even more that I'm missing?
I don't think anyone has proposed to add all of the things that you suggested. Of course if there are decimal literals and a fixed-width decimal type then over time people will suggest some of the other things. That doesn't mean that they'd need to be incorporated though. A year ago I said I'd write a PEP for decimal literals but then I got clobbered at work and a number of other things happened so that I didn't even have time to read threads like this. Maybe it's worth revisiting... Oscar

On Jun 2, 2015, at 07:05, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
But decimal64 and Decimal are not the same types. So, if you want to, e.g., get the next decimal64 value after the current value, how would you do that? (Unless you're suggesting there should be a builtin decimal64 and a separate decimal.decimal64 or something, but I don't think you are.) Also, with float, we can get away with saying we're supporting the 1985 standard and common practice among C90 implementations; with decimal64, the justification for arbitrarily implementing part of the 2008 standard but not the rest is not as clear-cut.
how it interacts with Decimal and float (which is defined by the standard, but translating that to Python isn't quite trivial),
Interaction between decimalN and Decimal coerces to Decimal.
Even when the current decimal context is too small to hold a decimalN? Does that raise any flags?
The NumPy native types do. (Of course they also subclass int and float where relevant.)
Even if they are completely irrelevant, unless they're deprecated they pretty much have to be supported by any new types. There might be a good argument that decimal64 doesn't fit into the numeric tower, but you'd have to make that argument.
For ctypes, sure (although even there, ctypes is a relatively simple way to share values between pure-Python child processes with multiprocessing.sharedctypes). But for array, that's generally not about compatibility with existing C code, it's about efficiently packing zillions of homogenous simple values into as little memory as possible.
Well, math is mostly about double functions from the C90 stdlib, so it's not common to use them with decimal. But that doesn't mean you wouldn't want decimal64 implementations of some of the functions in math.
unless you're using it as a multi-precision library and then you'd really want the full Decimal type.
But again, the full Decimal type isn't just an expansion on decimal64, it's a completely different type, with context-sensitive precision.
There's still rounding error. Sure, usually that won't make a difference--but when it does, it will be surprising and frustrating if you didn't explicitly ask for it.
Why?
Sure, if you want a C API for C90 platforms at all. But you may not even need that. When would you need to write C code that deals with decimal64 values as exponent and significant? Dealing with them as abstract numbers, general Python objects, native decimal64, and maybe even opaque values that I can pass around in C without being able to interpret them, I can see, but what C code needs the exponent and significand?
I think in many (but maybe not all) of these cases the simplest answer is the best, but a PEP would have to actually make that case for each thing.
Maybe we need a PEP for the decimalN type(s) first, then if someone has time and inclination they can write a PEP for literals for those types, either as a companion or as a followup. That would probably cut out 30-50% of the work, and maybe even more of the room for argument and bikeshedding.

On 6/2/2015 9:05 AM, Andrew Barnert via Python-ideas wrote:
A compelling rationale. Python exposes the two basic number types used by the kinds of computers it runs on: integers (extended) and floats (binary in practice, though the language definition would allow a decimal float machine).

The first killer app for Python was scientific numerical computing. The first numerical package developed for this exposed the entire gamut of integer and float types available in C. Numpy is the third numerical package. (Even so, none of the packages have been distributed with CPython -- and properly so.)

Numbers pre-wrapped as dates, times, and datetimes with specialized methods are not essential (Python once managed without) but are enormously useful in a wide variety of application areas. Decimals, another class of pre-wrapped numbers, greatly simplify money calculations, including those that must follow legal or contractual rules. It is no accident that the decimal specification is a product of what was once International Business Machines. Contexts and specialized rounding rules are an essential part of fulfilling the purpose of the module.

What application area would be opened up by adding a fixed-precision float? The only thing I have seen presented is making interactive python act even more* like a generic (decimal) calculator, so that newbies will find python floats less surprising than those of other languages. (Of course, a particular decimal## might not exactly match any existing calculator.)

*The int division change solved the biggest discrepancy: 1/10 is now .1 instead of 0. Representation changes improved things also.

-- Terry Jan Reedy

On Jun 2, 2015, at 02:38, Alexander Walters <tritium-list@sdamon.com> wrote:
I think there is another discussion to have here, and that is making Decimal part of the language (__builtin(s)__) vs. part of the library (which implementations can freely omit).
I don't think there is any such distinction in Python. Neither the language reference nor the library reference claims to be a specification. The library documentation specifically says that it "describes the standard library that is distributed with Python" and "also describes some of the optional components that are commonly included in Python distributions", which implies that, except for the handful of modules that are described as optional or platform-specific, everything should always be there. (There is special dispensation for Unix systems to split up Python into separate packages, but even that is specifically limited to "some or all of the optional components".) Historically, implementations that haven't included the entire stdlib also haven't included parts of the language (Jython 2.2 and early 2.5 versions, early versions of PyPy, the various browser-based implementations, MicroPython and PyMite, etc.). Also, both the builtins module and the actual built-in functions, constants, types, and exceptions it contains are documented as part of the library, just like decimal, not as part of the language. So, Python isn't like C, with separate specifications for "freestanding" vs. "hosted" implementations, and it doesn't have a separate specification for an "embedded" subset like C++ used to.
If it were part of the language, then maybe, just maybe, a literal syntax should be considered.
Since there is no such distinction between language and library, I think we're free to define a literal syntax for decimals and fractions. From a practical point of view (which beats purity, of course), it's probably not reasonable for CPython to define such literals unless there's a C implementation that defines the numeric type slot (and maybe even has a C API concrete type interface, although maybe not), and which can be "frozen" at build time. (See past discussions on adding an OrderedDict literal for why these things are important.) That's currently true for Decimal, but not for Fraction. So, that might be an argument against fraction literals, or for providing a C implementation of the fraction module.
As it stands, Decimal and Fraction are libraries - implementations of python are free to omit them (as I think some of the embedded platform implementations do), and it currently does not make a lick of sense to add syntax for something that is only in the library.
Even besides the four library sections on the various kinds of built-in things, plenty of other things are syntax for something that's "only in the library". The import statement is defined in terms of functionality in importlib, and (at least in CPython) actually implemented that way. In fact, numeric values, as defined in the data model section of the language reference, are defined in terms of types from the library docs, both in stdtypes and in the numbers module. Defining decimal values in terms of types defined in the decimal module library section would be no different. (Numeric _literals_ don't seem to have their semantics defined anywhere, just their syntax, but it's pretty obvious from the wording that they're intended to have int, float, and complex values as defined by the data model--which, again, means as defined by the library.) So, while there are potentially compelling arguments against a decimal literal (how it interacts with contexts may be confusing, the idea may be too bikesheddable to come up with one true design that everyone will like, or may be an attractive nuisance, it may add too much complexity to the implementation for the benefit, etc.), "decimal is only a library" doesn't seem to be one.

Joonas Liik writes:
Having some sort of decimal literal would have some advantages of its own, for one it could help against this sillyness:
That *would* be a different type from float. You may as well go all the way to Decimal.
..but this is one really common pitfall for new users,
To fix it, you really need to change the parser, i.e., make Decimal the default type for non-integral numbers. "Decimal('1.3')" isn't that much harder to remember than "1.3$" (although it's quite a bit more to type). But people are going to continue writing things like

    pennies = 13
    pennies_per_dollar = 100
    dollars = pennies / pennies_per_dollar
    # Much later ...
    future_value = dollars * Decimal('1.07')

And in real applications you're going to be using Decimal in code like

    def inputDecimals(file):
        for row, line in enumerate(file):
            for col, value in enumerate(line.strip().split()):
                matrix[row][col] = Decimal(value)

or

    def what_if():
        principal = Decimal(input("Principal ($): "))
        rate = Decimal(input("Interest rate (%): "))
        print("Future value is ", principal * (1 + rate/100), ".", sep="")

and the whole issue evaporates.

On Mon, Jun 1, 2015 at 9:21 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Shudder indeed.
You may as well go all the way to Decimal.
Or perhaps switch to decimal64 (http://en.wikipedia.org/wiki/Decimal64_floating-point_format)? (Or its bigger cousin, decimal128) -- --Guido van Rossum (python.org/~guido)

On Tue, Jun 2, 2015, at 00:31, Guido van Rossum wrote:
Does anyone know if any common computer architectures have any hardware support for this? Are there any known good implementations for all the functions in math/cmath for these types? Moving to a fixed-size floating point type does have the advantage of not requiring making all these decisions about environments and precision and potentially unbounded growth etc.

On Jun 1, 2015, at 21:47, random832@fastmail.us wrote:
IBM's RS/POWER architecture supports decimal32, 64, and 128. The PowerPC and Cell offshoots only support them in some models, not all. Is that common enough? (Is _anything_ common enough besides x86, x86_64, ARM7, ARM8, and various less-capable things like embedded 68k variants?)
Are there any known good implementations for all the functions in math/cmath for these types?
Intel wrote a reference implementation for IEEE 754-2008 as part of the standardization process. And since then, they've focused on improvements geared at making it possible to write highly-optimized financial applications in C or C++ that run on x86_64 hardware. And I think it's BSD-licensed. It's available somewhere on netlib, but searching that repo is no fun on my phone (plus, most of Intel's code, you can't see the license or the detailed README until you unpack it...), so I'll leave it to someone else to find it. Of course 754-2008 isn't necessarily identical to GDAS (which is what POWER implements, and Python's decimal module).

While I see the interest, does it really belong in core Python? What would be the advantages? IIRC, (during | after) the language summit at PyCon this year, it was said that maybe the stdlib should get fewer features, not more. Side note, SymPy has an IPython AST hook that will wrap all your integers into SymPy Integers and hence give you rationals or whatever you like, if you want to SymPy-plify your life. But for the majority of uses, will it be useful? What would be the performance costs? If you start storing rationals, then why not continued fractions, as they are just an N-tuple instead of a 2-tuple? But then you are limited to finite continued fractions, so you improve by using a generator... I love Python for doing science and math, but please stay away from putting too much in the standard lib, or we will end up with Cholesky matrix decomposition in Python 4.0 like Julia does… and I'm not sure that is a good idea. I would much rather have a core set of libraries "blessed" by CPython that provide features like this one, that are deemed "important". — M

On Sun, May 31, 2015 at 11:46 PM, Matthias Bussonnier <bussonniermatthias@gmail.com> wrote:
IIRC, (during | after) the language summit at PyCon this year, it was said that maybe the stdlib should get fewer features, not more.
Rationals (and Decimals) already exist in the standard library. The original proposal (as I read it, anyway) is more about the default interpretation of, e.g., integer division and decimal-number literals.
Side note, SymPy has an IPython AST hook that will wrap all your integers into SymPy Integers and hence give you rationals or whatever you like, if you want to SymPy-plify your life.
Thank you for the pointer -- that's really cool.
But for the majority of uses, will it be useful?
I believe interpreting "0.1" as 1/10 is more ergonomic than representing it as roughly 1.6000000000000000888 * (2^-4). I see it as being more useful -- a better fit -- in most use cases because it's simpler, more precise, and more understandable.
What would be the performance costs ?
I don't know. Personally, I'd be willing to pay a performance penalty to avoid reasoning about floating-point arithmetic most of the time, then "drop into" floats when I need the speed.
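For illustration, here is a small interactive sketch (using only the stdlib fractions and decimal modules) of what the literal 0.1 denotes today as a binary float, versus the exact 1/10 a decimal interpretation would give:

    py> from fractions import Fraction
    py> from decimal import Decimal
    py> (0.1).as_integer_ratio()          # the binary float closest to 0.1
    (3602879701896397, 36028797018963968)
    py> Decimal(0.1)                      # the same value written out exactly
    Decimal('0.1000000000000000055511151231257827021181583404541015625')
    py> Fraction('0.1')                   # what "interpret the literal as 1/10" means
    Fraction(1, 10)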

I don’t know. Personally, I’d be willing to pay a performance penalty to avoid reasoning about floating-point arithmetic most of the time, then “drop into” floats when I need the speed. This is perhaps a bit off topic for the thread, but +9000 for this. Having decimal literals or something similar by default, though perhaps problematic from a backwards compatibility standpoint, is a) user friendly, b) easily understandable, and c) not surprising to beginners. None of these qualities apply to float literals. I always assumed that float literals were mostly an artifact of history or of some performance limitations. Free of those, why would a language choose them over decimal literals? When does someone ever expect floating-point madness, unless they are doing something that is almost certainly not common, or unless they have been burned in the past? Every day another programmer gets bitten by floating point stupidities like this one <http://stackoverflow.com/q/588004/877069>. It would be a big win to kill this lame “programmer rite of passage” and give people numbers that work more like how they learned them in school. The competing proposal is to treat decimal literals as decimal.Decimal values. I’m interested in learning more about such a proposal. Nick On Mon, Jun 1, 2015 at 2:03 AM Jim Witschey <jim.witschey@gmail.com> wrote:

On 1 June 2015 at 16:27, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
In a world of binary computers, no programming language is free of those constraints - if you choose decimal literals as your default, you take a *big* performance hit, because computers are designed as binary systems. (Some languages, like IBM's REXX, do choose to use decimal integers by default.)

For CPython, we have offered C-accelerated decimal support by default since 3.3 (available as "pip install cdecimal" in Python 2), but it still comes at a high cost in speed:

    $ python3 -m timeit -s "n = 1.0; d = 3.0" "n / d"
    10000000 loops, best of 3: 0.0382 usec per loop
    $ python3 -m timeit -s "from decimal import Decimal as D; n = D(1); d = D(3)" "n / d"
    10000000 loops, best of 3: 0.129 usec per loop

And this isn't even like the situation with integers, where the semantics of long integers are such that native integers can be used transparently as an optimisation technique - IEEE 754 (which defines the behaviour of native binary floats) and the General Decimal Arithmetic Specification (which defines the behaviour of the decimal module) are genuinely different ways of doing floating point arithmetic, since the choice of base 2 or base 10 has far-reaching ramifications for the way various operations work and how various errors accumulate.

We aren't even likely to see widespread proliferation of hardware-level decimal arithmetic units, because the "binary arithmetic is easier to implement than decimal arithmetic" consideration extends down to the hardware layer as well - a decimal arithmetic unit takes more silicon, and hence more power, than a similarly capable binary unit. With battery-conscious mobile device design and environmentally conscious data centre design being two of the most notable current trends in CPU design, this makes it harder than ever to justify providing hardware support for both in general purpose computing devices.

For some use cases (e.g. financial math), it's worth paying the price in speed to get the base 10 arithmetic semantics, or the cost in hardware to accelerate it, but for most other situations, we end up being better off teaching humans to cope with the fact that binary logic is the native language of our computational machines.

Binary vs decimal floating point is a lot like the Unicode bytes/text distinction in that regard: while Unicode is a better model for representing human communications, there's no avoiding the fact that text eventually has to be rendered as a bitstream in order to be saved or transmitted.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Decimal literals are far from as obvious as suggested. We *have* the `decimal` module after all, and it defines all sorts of parameters on precision, rounding rules, etc. that one can provide context for. decimal.ROUND_HALF_DOWN is "the obvious way" for some users, while decimal.ROUND_CEILING is "the obvious way" for others. I like decimals, but they don't simply make all the mathematical answers result in what all users would consider "do what I mean" either. On Sun, May 31, 2015 at 11:27 PM, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:

I'm sorry... what I meant was not a literal that results in a Decimal; what I meant was a special literal proxy object that usually acts like a float, except you can ask for its original string form. E.g.:

    flit = 1.3
    flit*3 == float(flit)*3
    str(flit) == '1.3'

Thus in cases where the intermediate float conversion loses precision, you can get at the original string that the programmer actually typed in. Decimal constructors are one case that would probably like to use the original string whenever possible to avoid conversion losses, but by no means are they the only ones.

On Jun 1, 2015, at 08:12, Joonas Liik <liik.joonas@gmail.com> wrote:
I'm sorry..
what I meant was not a literal that results in a Decimal; what I meant was a special literal proxy object that usually acts like a float, except you can ask for its original string form.
This is essentially what I was saying with new "literal constant" types. Swift is probably the most prominent language with this feature. http://nshipster.com/swift-literal-convertible/ is a good description of how it works. Many of the reasons Swift needed this don't apply in Python. For example, in Swift, it's how you can build a Set at compile time from an ArrayLiteral instead of building an Array and converting it to a Set at run time. Or how you can use 0 as a default value for a non-integer type without getting a TypeError or a runtime conversion. Or how you can build an Optional that acts like a real ADT but assign it nil instead of a special enumeration value. Or how you can decode UTF-8 source text to store in UTF-16 or UTF-32 or grapheme-cluster at compile time. And so on.

Sorry, I accidentally sent that before it was done... Sent from my iPhone
Anyway, my point was that the Swift feature is complicated, and has some controversial downsides (e.g., see the example at the end of silently using a string literal as if it were a URL by accessing an attribute of the NSURL class--which works given the Smalltalk-derived style of OO, but many people still find it confusing). But the basic idea can be extracted out and Pythonified: The literal 1.23 no longer gives you a float, but a FloatLiteral, which is either a subclass of float, or an unrelated class that has a __float__ method. Doing any calculation on it gives you a float. But as long as you leave it alone as a FloatLiteral, it has its literal characters available for any function that wants to distinguish FloatLiteral from float, like the Decimal constructor. The problem that Python faces that Swift doesn't is that Python doesn't use static typing and implicit compile-time conversions. So in Python, you'd be passing around these larger values and doing the slow conversions at runtime. That may or may not be unacceptable; without actually building it and testing some realistic programs it's pretty hard to guess. The advantage of C++-style user-defined literal suffixes is that the absence of a suffix is something the compiler can see, so 1.23d might still require a runtime call, but 1.23 just is compiled as a float constant the same as it's been since Python 1.x.

On 2 Jun 2015 08:44, "Andrew Barnert via Python-ideas" <python-ideas@python.org> wrote:
Joonas's suggestion of storing the original text representation passed to the float constructor is at least a novel one - it's only the idea of actual decimal literals that was ruled out in the past. Aside from the practical implementation question, the main concern I have with it is that we'd be trading the status quo for a situation where "Decimal(1.3)" and "Decimal(13/10)" gave different answers. It seems to me that a potentially better option might be to adjust the implicit float->Decimal conversion in the Decimal constructor to use the same algorithm as we now use for float.__repr__ [1], where we look for the shortest decimal representation that gives the same answer when rendered as a float. At the moment you have to indirect through str() or repr() to get that behaviour:
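For instance (a minimal illustration of that indirection, using only the stdlib decimal module):

    py> from decimal import Decimal
    py> Decimal(1.3)
    Decimal('1.3000000000000000444089209850062616169452667236328125')
    py> Decimal(repr(1.3))
    Decimal('1.3')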
Cheers, Nick. [1] http://bugs.python.org/issue1580

On Jun 1, 2015, at 17:08, Nick Coghlan <ncoghlan@gmail.com> wrote:
I actually built about half an implementation of something like Swift's LiteralConvertible protocol back when I was teaching myself Swift. But I think I have a simpler version that I could implement much more easily. Basically, FloatLiteral is just a subclass of float whose __new__ stores its constructor argument. Then decimal.Decimal checks for that stored string and uses it instead of the float value if present. Then there's an import hook that replaces every Num with a call to FloatLiteral. This design doesn't actually fix everything; in effect, 1.3 actually compiles to FloatLiteral(str(float('1.3'))) (because by the time you get to the AST it's too late to avoid that first conversion). Which does actually solve the problem with 1.3, but doesn't solve everything in general (e.g., just feed in a number that has more precision than a double can hold but less than your current decimal context can...). But it just lets you test whether the implementation makes sense and what the performance effects are, and it's only an hour of work, and doesn't require anyone to patch their interpreter to play with it. If it seems promising, then hacking the compiler so 2.3 compiles to FloatLiteral('2.3') may be worth doing for a test of the actual functionality. I'll be glad to hack it up when I get a chance tonight. But personally, I think decimal literals are a better way to go here. Decimal(1.20) magically doing what you want still has all the same downsides as 1.20d (or implicit decimal literals), plus it's more complex, adds performance costs, and doesn't provide nearly as much benefit. (Yes, Decimal(1.20) is a little nicer than Decimal('1.20'), but only a little--and nowhere near as nice as 1.20d).
Yes, to solve that you really need Decimal(13)/Decimal(10)... Which implies that maybe the simplification in Decimal(1.3) is more misleading than helpful. (Notice that this problem also doesn't arise for decimal literals--13/10d is int vs. Decimal division, which is correct out of the box. Or, if you want prefixes, d13/10 is Decimal vs. int division.)
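A minimal sketch of the shape of that design (illustrative only -- FloatLiteral and to_decimal here are made-up names standing in for the proposed behaviour, not the actual hack):

    from decimal import Decimal

    class FloatLiteral(float):
        """Hypothetical float subclass that remembers its literal source text."""
        def __new__(cls, literal_text):
            self = super().__new__(cls, float(literal_text))
            self.literal_text = str(literal_text)
            return self

    def to_decimal(x):
        # Stand-in for the proposed Decimal behaviour: prefer the original
        # literal text when it is available, fall back to the exact float value.
        text = getattr(x, 'literal_text', None)
        return Decimal(text) if text is not None else Decimal(x)

    x = FloatLiteral('1.3')
    print(x * 3)            # 3.9000000000000004 -- arithmetic degrades to plain float
    print(to_decimal(x))    # 1.3
    print(to_decimal(1.3))  # 1.3000000000000000444089209850062616169452667236328125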

On Jun 1, 2015, at 18:27, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
Make that 15 minutes. https://github.com/abarnert/floatliteralhack

On Jun 1, 2015, at 19:00, Andrew Barnert <abarnert@yahoo.com> wrote:
And as it turns out, hacking the tokens is no harder than hacking the AST (in fact, it's a little easier; I'd just never done it before), so now it does that, meaning you really get the actual literal string from the source, not the repr of the float of that string literal. Turning this into a real implementation would obviously be more than half an hour's work, but not more than a day or two. Again, I don't think anyone would actually want this, but now people who think they do have an implementation to play with to prove me wrong.

On Tue, Jun 02, 2015 at 10:08:37AM +1000, Nick Coghlan wrote:
Apart from the questions of whether such a change would be allowed by the Decimal specification, and the breaking of backwards compatibility, I would really hate that change for another reason. At the moment, a good, cheap way to find out what a binary float "really is" (in some sense) is to convert it to Decimal and see what you get:

    Decimal(1.3)
    -> Decimal('1.3000000000000000444089209850062616169452667236328125')

If you want conversion from repr, then you can be explicit about it:

    Decimal(repr(1.3))
    -> Decimal('1.3')

("Explicit is better than implicit", as they say...)

Although in fairness I suppose that if this change happens, we could keep the old behaviour in the from_float method:

    # hypothetical future behaviour
    Decimal(1.3)
    -> Decimal('1.3')
    Decimal.from_float(1.3)
    -> Decimal('1.3000000000000000444089209850062616169452667236328125')

But all things considered, I don't think we're doing people any favours by changing the behaviour of float->Decimal conversions to implicitly use the repr() instead of being exact. I expect this strategy is like trying to flatten a bubble under wallpaper: all you can do is push the gotchas and surprises to somewhere else.

Oh, another thought... Decimals could gain yet another conversion method, one which implicitly uses the float repr, but signals if it was an inexact conversion or not. Explicitly calling repr can never signal, since the conversion occurs outside of the Decimal constructor and Decimal sees only the string: Decimal(repr(1.3)) cannot signal Inexact. But:

    Decimal.from_nearest_float(1.5)  # exact
    Decimal.from_nearest_float(1.3)  # signals Inexact

That might be useful, but probably not to beginners.

-- Steve
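A rough sketch of how such a method might behave, built from existing decimal pieces (from_nearest_float is the hypothetical name from above, not an existing API; a real version would set the context flag and honour the trap rather than always raising):

    from decimal import Decimal, Inexact

    def from_nearest_float(x):
        """Hypothetical helper: convert via repr(x), but signal Inexact when
        the short repr does not denote the float's exact value."""
        short = Decimal(repr(x))
        if short != Decimal(x):      # Decimal(x) is the float's exact binary value
            raise Inexact("conversion from nearest float was inexact")
        return short

    print(from_nearest_float(1.5))   # 1.5 (exact)
    from_nearest_float(1.3)          # raises Inexact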

On Jun 1, 2015, at 18:58, Steven D'Aprano <steve@pearwood.info> wrote:
As far as I know, GDAS doesn't specify anything about implicit conversion from floats. As long as the required explicit conversion function (which I think is from_float?) exists and does the required thing. As a side note, has anyone considered whether it's worth switching to IEEE-754-2008 as the controlling specification? There may be a good reason not to do so; I'm just curious whether someone has thought it through and made the case.
I think this might be worth having whether the default constructor is changed or not. I can't think of too many programs where I'm pretty sure I have an exactly-representable decimal as a float but want to check to be sure... but for interactive use in IPython (especially when I'm specifically trying to explain to someone why just using Decimal instead of float will/will not solve their problem) I could see using it.

On 2 June 2015 at 13:10, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
As far as I know, nobody has looked into it. If there aren't any meaningful differences, we should just switch, if there are differences, we should probably switch anyway, but it will be more work (and hence will require volunteers willing to do that work). Either way, the starting point would be an assessment of what the differences are, and whether or not they have any implications for the decimal module and cdecimal. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Monday, June 1, 2015 10:23 PM, "random832@fastmail.us" <random832@fastmail.us> wrote:
Does IEEE even have anything about arbitrary-precision decimal types
(which are what decimal/cdecimal are)?
Yes. When many people say "IEEE float" they still mean 754-1985. This is what C90 was designed to "support without quite supporting", and what C99 explicitly supports, and what many consumer FPUs support (or, in the case of the 8087 and its successors, a preliminary version of the 1985 standard). That standard did not cover either arbitrary precision or decimals; both of those were only part of the companion standard 854 (which isn't complete enough to base an implementation on). But the current version of the standard, 754-2008, does cover arbitrary-precision decimal types. If I understand the relationship between the standards: 754-2008 was designed to merge 754-1985 and 854-1987, fill in the gaps, and fix any bugs; GDAS was a major influence (the committee chair was GDAS's author); and since 2009 GDAS has gone from being a de facto independent standard to being a more-specific specification of the relevant subset of 754-2008. IBM's hardware and Java library implement GDAS (and therefore implicitly the relevant part of 754-2008); Itanium (partly), C11, the gcc extensions, and Intel's C library implement 754-2008 (or IEC 60559, which is just a republished 754-2008). So, my guess is that GDAS makes perfect sense to follow unless Python wants to expose C11's native fixed decimals, or the newer math.h functions from C99/C11/C14, or the other parts of 754-2008 that it doesn't support (like arbitrary-precision binary). My question was just whether someone had actually made that decision, or whether decimal is following GDAS just because that was the obvious decision to make in 2003.

On 02.06.2015 08:40, Andrew Barnert via Python-ideas wrote:
The IBM decimal implementation by Mike Cowlishaw was chosen as the basis for Python's decimal implementation at the time, so yes, this was an explicit design choice: http://legacy.python.org/dev/peps/pep-0327/ http://speleotrove.com/decimal/ According to the PEP, decimal implements IEEE 854-1987 (with some restrictions). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 02 2015)

Andrew Barnert wrote:
How about a more general Decimal.from_exact that does the same for an argument of any type – float, int, Decimal object with possibly different precision, fraction, string. Just convert the argument to Decimal and signal if it cannot be done losslessly. The same constructor with the same semantics could be added to int, float, Fraction as well. Regards, Drekin

Stephen J. Turnbull writes:
What if 13/10 also yielded a fraction? Anyway, what are the objections to integer division returning a fraction? They are coerced to floats when mixed with them. Also, the repr of the Fraction class could be altered so that repr(13 / 10) == "13 / 10" would hold. Regards, Drekin

On Jun 3, 2015, at 07:29, drekin@gmail.com wrote:
That was raised near the start of the thread. In fact, I think the initial proposal was that 13/10 evaluated to Fraction(13, 10) and 1.2 evaluated to something like Fraction(12, 10).
Anyway, what are the objections to integer division returning a fraction? They are coerced to floats when mixed with them.
As mentioned earlier in the thread, the language that inspired Python, ABC, used exactly this design: computations were kept as exact rationals until you mixed them with floats or called irrational functions like root. So it's not likely Guido didn't think of this possibility; he deliberately chose not to do things this way. He even wrote about this a few years ago; search for "integer division" on his Python-history blog. So, what are the problems? When you stay with exact rationals through a long series of computations, the result can grow to be huge in memory, and processing time. (I'm ignoring the fact that CPython doesn't even have a fast fraction implementation, because one could be added easily. It's still going to be orders of magnitude slower to add two fractions with gigantic denominators than to add the equivalent floats or decimals.) Plus, it's not always obvious when you've lost exactness. For example, exponentiation between rationals is exact only if the power simplifies to a whole fraction (and hasn't itself become a float somewhere along the way). Since the fractions module doesn't have IEEE-style flags for inexactness/rounding, it's harder to notice when this happens. Except in very trivial cases, the repr would be much less human-readable and -debuggable, not more. (Or do you find 1728829813 / 2317409 easier to understand than 7460.181958816937?) Fractions and Decimals can't be mixed or interconverted directly. There are definitely cases where a rational type is the right thing to use (it wouldn't be in the stdlib otherwise), but I think they're less common than the cases where a floating-point type (whether binary or decimal) is the right thing to use. (And even many cases where you think you want rationals, what you actually want is SymPy-style symbolic computation--which can give you exact results for things with roots or sins or whatever as long as they cancel out in the end.)
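To make the growth problem concrete, here is a toy iteration (the update rule and numbers are illustrative, not from the thread):

    from fractions import Fraction

    # Iterate the logistic map x -> r*x*(1-x) with exact rationals.
    x = Fraction(1, 3)
    r = Fraction(7, 2)
    for _ in range(15):
        x = r * x * (1 - x)

    print(len(str(x.denominator)))  # ~15,000 digits after only 15 steps
    print(float(x))                 # the float view is still short and readable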

On 2 Jun 2015 01:04, "David Mertz" <mertz@gnosis.cx> wrote:
Decimal literals are far from as obvious as suggested. We *have* the
`decimal` module after all, and it defines all sorts of parameters on precision, rounding rules, etc. that one can provide context for. decimal.ROUND_HALF_DOWN is "the obvious way" for some users, while decimal.ROUND_CEILING is "the obvious way" for others.
I like decimals, but they don't simply make all the mathematical answers
result in what all users would consider "do what I mean" either. The last time we had a serious discussion about decimal literals, we realised that the fact that their behaviour is context-dependent posed a significant problem for providing a literal form. With largely hardware-provided IEEE 754 semantics, binary floats are predictable, albeit somewhat surprising if you're expecting abstract math behaviour (i.e. no rounding errors), or finite base 10 representation behaviour. By contrast, decimal arithmetic deliberately allows for configurable contexts, presumably because financial regulations sometimes place strict constraints on how arithmetic is to be handled (e.g. "round half even" is also known as "banker's rounding", since it eliminates statistical bias in rounding financial transactions to the smallest supported unit of currency). That configurability makes decimal more fit for its primary intended use case (i.e. financial math), but also makes local reasoning harder - the results of some operations (even something as simple as unary plus) may vary based on the configured context (the precision, in particular). Cheers, Nick.
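For example (a small sketch of that context dependence, using only the stdlib decimal module):

    from decimal import Decimal, localcontext

    d = Decimal('1.23456789')
    print(+d)             # 1.23456789 under the default 28-digit context

    with localcontext() as ctx:
        ctx.prec = 4
        print(+d)         # 1.235 -- even unary plus rounds to the active context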

On Mon, Jun 01, 2015 at 06:27:57AM +0000, Nicholas Chammas wrote:
I wish this myth about Decimals would die, because it isn't true. The only advantage of base-10 floats over base-2 floats -- and I'll admit it can be a big advantage -- is that many of the numbers we commonly care about can be represented in Decimal exactly, but not as base-2 floats. In every other way, Decimals are no more user friendly, understandable, or unsurprising than floats. Decimals violate all the same rules of arithmetic that floats do. This should not come as a surprise, since decimals *are* floats, they merely use base 10 rather than base 2.

In the past, I've found that people are very resistant to this fact, so I'm going to show a few examples of how Decimals violate the fundamental laws of mathematics just as floats do. For those who already know this, please forgive me belabouring the obvious.

In mathematics, adding anything other than zero to a number must give you a different number. Decimals violate that expectation just as readily as binary floats:

    py> from decimal import Decimal as D
    py> x = D(10)**30
    py> x == x + 100  # should be False
    True

Apart from zero, multiplying a number by its inverse should always give one. Again, violated by decimals:

    py> one_third = 1/D(3)
    py> 3*one_third == 1
    False

Inverting a number twice should give the original number back:

    py> 1/(1/D(7)) == 7
    False

Here's a violation of the Associativity Law, which states that (a+b)+c should equal a+(b+c) for any values a, b, c:

    py> a = D(1)/17
    py> b = D(5)/7
    py> c = D(12)/13
    py> (a + b) + c == a + (b+c)
    False

(For the record, it only took me two attempts, and a total of about 30 seconds, to find that example, so it's not particularly difficult to come across such violations.)

Here's a violation of the Distributive Law, which states that a*(b+c) should equal a*b + a*c:

    py> a = D(15)/2
    py> b = D(15)/8
    py> c = D(1)/14
    py> a*(b+c) == a*b + a*c
    False

(I'll admit that was a bit trickier to find.)

This one is a bit subtle, and to make it easier to see what is going on I will reduce the number of digits used. When you take the average of two numbers x and y, mathematically the average must fall *between* x and y. With base-2 floats, we can't guarantee that the average will be strictly between x and y, but we can be sure that it will be either between the two values, or equal to one of them. But base-10 Decimal floats cannot even guarantee that. Sometimes the calculated average falls completely outside of the inputs.

    py> from decimal import getcontext
    py> getcontext().prec = 3
    py> x = D('0.516')
    py> y = D('0.518')
    py> (x+y)/2  # should be 0.517
    Decimal('0.515')

This one is even worse:

    py> getcontext().prec = 1
    py> x = D('51.6')
    py> y = D('51.8')
    py> (x+y)/2  # should be 51.7
    Decimal('5E+1')

Instead of the correct answer of 51.7, Decimal calculates the answer as 50 exactly.
Performance and accuracy will always be better for binary floats. Binary floats are faster, and have stronger error bounds and slower-growing errors. Decimal floats suffer from the same problems as binary floats, only more so, and are slower to boot.
There's a lot wrong with that.

- The sorts of errors we see with floats are not "madness", but the completely logical consequences of what happens when you try to do arithmetic in anything less than the full mathematical abstraction.

- And they aren't rare either -- they're incredibly common. Fortunately, most of the time they don't matter, or aren't obvious, or both.

- Decimals don't behave like the numbers you learn in school either. Floats are not real numbers, regardless of which base you use. And in fact, the smaller the base, the smaller the errors. Binary floats are better than decimals in this regard. (Decimals *only* win out due to human bias: we don't care too much that 1/7 cannot be expressed exactly as a float using *either* binary or decimal, but we do care about 1/10. And we conveniently ignore the case of 1/3, because familiarity breeds contempt.)

- Being at least vaguely aware of floating point issues shouldn't be difficult for anyone who has used a pocket calculator. And yet every day brings in another programmer surprised by floats.

- It's not really a rite of passage; that implies that it is arbitrary and imposed culturally. Float issues aren't arbitrary, they are baked into the very nature of the universe. You cannot hope to perform infinitely precise real-number arithmetic using just a finite number of bits of storage, no matter what system you use. Fixed-point maths has its own problems, as does rational maths. All you can do is choose to shift the errors from some calculations to other calculations; you cannot eliminate them altogether.

-- Steve

On 1 June 2015 at 15:58, Steven D'Aprano <steve@pearwood.info> wrote:
There is one other "advantage" to decimals - they behave like electronic calculators (which typically used decimal arithmetic). This is a variation of "human bias" - we (if we're of a certain age, maybe today's youngsters are less used to the vagaries of electronic calculators :-)) are used to seeing 1/3 displayed as 0.33333333, and showing that 1/3*3 = 0.99999999 was a "fun calculator fact" when I was at school. Paul

Well, I learned a lot about decimals today. :)

On Mon, Jun 1, 2015 at 3:08 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

In a world of binary computers, no programming language is free of those constraints - if you choose decimal literals as your default, you take a *big* performance hit, because computers are designed as binary systems. (Some languages, like IBM's REXX, do choose to use decimal integers by default)

I guess it's a non-trivial tradeoff. But I would lean towards considering people likely to be affected by the performance hit as doing something "not common". Like, if they are doing that many calculations that it matters, perhaps it makes sense to ask them to explicitly ask for floats vs. decimals, in exchange for giving the majority who wouldn't notice a performance difference a better user experience.

On Mon, Jun 1, 2015 at 10:58 AM, Steven D'Aprano <steve@pearwood.info> wrote:

I wish this myth about Decimals would die, because it isn't true.

Your email had a lot of interesting information about decimals that would make a good blog post, actually. Writing one up will perhaps help kill this myth in the long run :)

In the past, I've found that people are very resistant to this fact, so I'm going to show a few examples of how Decimals violate the fundamental laws of mathematics just as floats do.

How many of your examples are inherent limitations of decimals vs. problems that can be improved upon?

Admittedly, the only place where I've played with decimals extensively is on Microsoft's SQL Server (where they are the default literal <https://msdn.microsoft.com/en-us/library/ms179899.aspx>). I've stumbled in the past on my own decimal gotchas <http://dba.stackexchange.com/q/18997/2660>, but looking at your examples and trying them on SQL Server, I suspect that most of the problems you show are problems of precision and scale.

Perhaps Python needs better rules for how precision and scale are affected by calculations (here are SQL Server's <https://msdn.microsoft.com/en-us/library/ms190476.aspx>, for example), or better defaults when they are not specified?

Anyway, here's what happens on SQL Server for some of the examples you provided.

Adding 100:

    py> from decimal import Decimal as D
    py> x = D(10)**30
    py> x == x + 100  # should be False
    True

    DECLARE @x DECIMAL(38,0) = '1' + REPLICATE(0, 30);

    IF @x = @x + 100
        SELECT 'equal' AS adding_100
    ELSE
        SELECT 'not equal' AS adding_100

Gives "not equal" <http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1/1645/0>. Leaving out the precision when declaring @x (i.e. going with the default precision of 18 <https://msdn.microsoft.com/en-us/library/ms187746.aspx>) immediately yields an understandable data truncation error.

Associativity:

    py> a = D(1)/17
    py> b = D(5)/7
    py> c = D(12)/13
    py> (a + b) + c == a + (b+c)
    False

    DECLARE @a DECIMAL = 1.0/17;
    DECLARE @b DECIMAL = 5.0/7;
    DECLARE @c DECIMAL = 12.0/13;

    IF (@a + @b) + @c = @a + (@b + @c)
        SELECT 'equal' AS associative
    ELSE
        SELECT 'not equal' AS associative

Gives "equal" <http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1/1656/0>.

Distributivity:

    py> a = D(15)/2
    py> b = D(15)/8
    py> c = D(1)/14
    py> a*(b+c) == a*b + a*c
    False

    DECLARE @a DECIMAL = 15.0/2;
    DECLARE @b DECIMAL = 15.0/8;
    DECLARE @c DECIMAL = 1.0/14;

    IF @a * (@b + @c) = @a*@b + @a*@c
        SELECT 'equal' AS distributive
    ELSE
        SELECT 'not equal' AS distributive

Gives "equal" <http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1/1655/0>.
I think some of the other decimal examples you provide, though definitely not 100% beginner friendly, are still way more human-friendly because they are explainable in terms of precision and scale, which we can understand more simply ("there aren't enough decimal places to carry the result") and which have parallels in other areas of life as Paul pointed out.

- The sorts of errors we see with floats are not "madness", but the completely logical consequences of what happens when you try to do arithmetic in anything less than the full mathematical abstraction.

I don't mean madness as in incorrect, I mean madness as in difficult to predict and difficult to understand. Your examples do show that it isn't all roses and honey with decimals, but do you find it easier to understand and explain all the weirdness of floats vs. decimals? Understanding float weirdness (and disclaimer: I don't) seems to require understanding some hairy stuff, and even then it is not predictable because there are platform-dependent issues. Understanding decimal "weirdness" seems to require only understanding precision and scale, and after that it is mostly predictable.

Nick

On Mon, Jun 1, 2015 at 11:19 AM Paul Moore <p.f.moore@gmail.com> wrote: On 1 June 2015 at 15:58, Steven D'Aprano <steve@pearwood.info> wrote:

On Jun 1, 2015, at 10:24, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
Obviously if you know the maximum precision needed before you start and explicitly set it to something big enough (or 7 places bigger than needed) you won't have any problem. Steven chose a low precision just to make the problems easy to see and understand; he could just as easily have constructed examples for a precision of 18. Unfortunately, even in cases where it is both possible and sufficiently efficient to work out and set the precision high enough to make all of your calculations exact, that's not something most people know how to do reliably. In the fully general case, it's as hard as calculating error propagation. As for the error: Python's decimal flags that too; the difference is that the flag is ignored by default. You can change it to warn or error instead. Maybe the solution is to make that easier--possibly just changing the docs. If you read the whole thing you will eventually learn that the default context ignores most such errors, but a one-liner gets you a different context that acts like SQL Server, but who reads the whole module docs (especially when they already believe they understand how decimal arithmetic works)? Maybe moving that up near the top would be useful?
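For instance (a minimal sketch of the kind of one-liner meant here; which signal to trap is just for illustration):

    from decimal import getcontext, Inexact, Decimal as D

    getcontext().traps[Inexact] = True   # turn the silently-set flag into an exception

    D(1) / D(3)   # now raises decimal.Inexact instead of quietly rounding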

On Mon, Jun 1, 2015 at 6:15 PM Andrew Barnert <abarnert@yahoo.com> wrote: Obviously if you know the maximum precision needed before you start and
Perhaps Python needs better rules for how precision and scale are affected by calculations (here are SQL Server’s <https://msdn.microsoft.com/en-us/library/ms190476.aspx>, for example), or better defaults when they are not specified? It sounds like there are perhaps several improvements that can be made to how decimals are handled, documented, and configured by default, that could possibly address the majority of gotchas for the majority of people in a more user friendly way than can be accomplished with floats. For all the problems presented with decimals by Steven and others, I’m not seeing how overall they are supposed to be *worse* than the problems with floats. We can explain precision and scale to people when they are using decimals and give them a basic framework for understanding how they affect calculations, and we can pick sensible defaults so that people won’t hit nasty gotchas easily. So we have some leverage there for making the experience better for most people most of the time. What’s our leverage for improving the experience of working with floats? And is the result really something better than decimals? Nick

On Jun 1, 2015, at 15:53, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:
I definitely agree that some edits to the decimal module docs, plus maybe a new HOWTO, and maybe some links to outside resources that explain things to people who are used to decimals in MSSQLServer or REXX or whatever, would be helpful. The question is, who has sufficient knowledge, skill, and time/inclination to do it?
It sounds like there are perhaps several improvements that can be made to how decimals are handled, documented, and configured by default, that could possibly address the majority of gotchas for the majority of people in a more user friendly way than can be accomplished with floats.
For all the problems presented with decimals by Steven and others, I’m not seeing how overall they are supposed to be worse than the problems with floats.
They're not worse than the problems with floats, they're the same problems... But the _effect_ of those problems can be worse, because:

* The magnitude of the rounding errors is larger.
* People mistakenly think they understand everything relevant about decimals, and the naive tests they try work out, so the problems may blindside them.
* Being much more detailed and configurable means the best solution may be harder to find.
* There's a lot of correct but potentially-misleading information out there. For example, any StackOverflow answer that says "you can solve this particular problem by using Decimal instead of float" can be very easily misinterpreted as applying to a much wider range of problems than it actually does.
* Sometimes performance matters.

On the other hand, the effect can also be less bad, because:

* Once people do finally understand a given problem, at least for many people and many problems, working out a solution is easier in decimal. For some uses (in particular, many financial uses, and some kinds of engineering problems), it's even trivial.
* Being more detailed and more configurable means the best solution may be better than any solution involving float.

I don't think there's any obvious answer to the tradeoff, short of making it easier for people to choose appropriately: a good HOWTO, decimal literals or Swift-style float-convertibles, making it easier to find/construct decimal64 or DECIMAL(18) or Money types, speeding up decimal (already done, but maybe more could be done), etc.

Nicholas, Your email client appears to not be quoting text you quote. It is conventional to use a leading > for quoting; perhaps you could configure your mail program to do so? The good ones even have a "Paste As Quote" command. On with the substance of your post... On Mon, Jun 01, 2015 at 01:24:32PM -0400, Nicholas Chammas wrote:
Changing from binary floats to decimal floats by default is a big, backwards incompatible change. Even if it's a good idea, we're constrained by backwards compatibility: I would imagine we wouldn't want to even introduce this feature until the majority of people are using Python 3 rather than Python 2, and then we'd probably want to introduce it using a "from __future__ import decimal_floats" directive. So I would guess this couldn't happen until probably 2020 or so. But we could introduce a decimal literal, say 1.1d for Decimal("1.1"). The first prerequisite is that we have a fast Decimal implementation, which we now have. Next we would have to decide how the decimal literals would interact with the decimal module. Do we include full support of the entire range of decimal features, including globally configurable precision and other modes? Or just a subset? How will these decimals interact with other numeric types, like float and Fraction? At the moment, Decimal isn't even part of the numeric tower. There's a lot of ground to cover, it's not a trivial change, and will definitely need a PEP.
How many of your examples are inherent limitations of decimals vs. problems that can be improved upon?
In one sense, they are inherent limitations of floating point numbers regardless of base. Whether binary, decimal, hexadecimal as used in some IBM computers, or something else, you're going to see the same problems. Only the specific details will vary, e.g. 1/3 cannot be represented exactly in base 2 or base 10, but if you constructed a base 3 float, it would be exact. In another sense, Decimal has a big advantage that it is much more configurable than Python's floats. Decimal lets you configure the precision, rounding mode, error handling and more. That's not inherent to base 10 calculations, you can do exactly the same thing for binary floats too, but Python doesn't offer that feature for floats, only for Decimals. But no matter how you configure Decimal, all you can do is shift the gotchas around. The issue really is inherent to the nature of the problem, and you cannot defeat the universe. Regardless of what base you use, binary or decimal or something else, or how many digits precision, you're still trying to simulate an uncountably infinite continuous, infinitely divisible number line using a finite, discontinuous set of possible values. Something has to give. (For the record, when I say "uncountably infinite", I don't just mean "too many to count", it's a technical term. To oversimplify horribly, it means "larger than infinity" in some sense. It's off-topic for here, but if anyone is interested in learning more, you can email me off-list, or google for "countable vs uncountable infinity".) Basically, you're trying to squeeze an infinite number of real numbers into a finite amount of memory. It can't be done. Consequently, there will *always* be some calculations where the true value simply cannot be calculated and the answer you get is slightly too big or slightly too small. All the other floating point gotchas follow from that simple fact.
No. Change the precision and scale, and some *specific* problems go away, but they reappear with other numbers. Besides, at the point that you're talking about setting the precision, we're really not talking about making things easy for beginners any more. And not all floating point issues are related to precision and scale in decimal. You cannot divide a cake into exactly three equal pieces in Decimal any more than you can divide a cake into exactly three equal pieces in binary. All you can hope for is to choose a precision where the rounding errors in one part of your calculation will be cancelled by the rounding errors in another part of your calculation. And that precision will be different for any two arbitrary calculations. -- Steve

On Tue, Jun 2, 2015 at 12:58 AM, Steven D'Aprano <steve@pearwood.info> wrote:
To be fair, you've actually destroyed precision so much that your numbers start out effectively equal:
They're not actually showing up as equal, but only because the precision setting doesn't (apparently) apply to the constructor. If adding zero to both sides of an equation makes it equal when it wasn't before, something seriously screwy is going on. (Actually, this behaviour of decimal.Decimal reminds me very much of REXX. Since there are literally no data types in REXX (everything is a string), the numeric precision setting ("NUMERIC DIGITS n") applies only to arithmetic operations, so the same thing of adding zero to both sides can happen.) So what you're really doing here is averaging 5E+1 and 5E+1, with an unsurprising result of... 5E+1. Your other example is more significant here, because your numbers actually do fit inside the precision limits - and then the end result slips outside the bounds. ChrisA
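A short interactive sketch of what Chris describes, continuing Steven's prec=1 example:

    py> from decimal import Decimal as D, getcontext
    py> getcontext().prec = 1
    py> D('51.6')            # the constructor ignores the precision setting
    Decimal('51.6')
    py> +D('51.6')           # ...but any arithmetic, even unary plus, rounds
    Decimal('5E+1')
    py> D('51.6') + 0        # so "adding zero" changes the value to 5E+1 too
    Decimal('5E+1')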

On Mon, Jun 1, 2015, at 10:58, Steven D'Aprano wrote:
But people have been learning about those rules, as they apply to decimals, since they were small children. They know intuitively that 2/3 rounds to ...6667 at some point, because they've done exactly that by hand. "User friendly" and "understandable to beginners" don't arise in a vacuum.

You are definitely right in "float vs. Decimal as representation of a real", but there is also a syntactical point that interpreting a float literal as Decimal rather than binary float is more natural, since the literal itself *is* decimal. There would be no counterpart of the following situation if the float literal were interpreted as Decimal rather than binary float.
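A typical instance of the kind of situation meant here (illustrative, using only the stdlib decimal module):

    py> from decimal import Decimal
    py> Decimal('1.1')
    Decimal('1.1')
    py> Decimal(1.1)     # the literal was written in decimal, but a binary float got in between
    Decimal('1.100000000000000088817841970012523233890533447265625')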
Regards, Drekin

On 6/1/2015 2:02 AM, Jim Witschey wrote:
No, it is an idea presented here and on other Python lists. Example: just today, Laura Creighton wrote on python-list (Re: What is considered an "advanced" topic in Python?)
There is no PEP, AFAIK, because no one has bothered to write one that is sure to be rejected. -- Terry Jan Reedy

On Sun, May 31, 2015, at 22:25, u8y7541 The Awesome Person wrote:
Even though he's mistaken about the core premise, I do think there's a kernel of a good idea here - it would be nice to have a method (maybe as_integer_ratio, maybe with some parameter added, maybe a different method) that returns a ratio with the smallest denominator that would result in exactly the original float if divided out, rather than merely the smallest power of two.

On Sun, May 31, 2015 at 8:14 PM, <random832@fastmail.us> wrote:
What is the computational complexity of a hypothetical float.as_simplest_integer_ratio() method? How hard that is to find is not obvious to me (probably it should be, but I'm not sure).

On 31May2015 20:27, David Mertz <mertz@gnosis.cx> wrote:
Probably the same as Euclid's greatest common factor method. About log(n) I think. Take as_integer_ratio, find the greatest common factor, divide both by that. Cheers, Cameron Simpson <cs@zip.com.au> In the desert, you can remember your name, 'cause there ain't no one for to give you no pain. - America

On Mon, Jun 1, 2015, at 00:37, Cameron Simpson wrote:
Er, no, because (6004799503160661, 18014398509481984) are already mutually prime, and we want (1, 3). This is a different problem from finding a reduced fraction. There are algorithms, I know, for constraining the denominator to a specific range (Fraction.limit_denominator does this), but that's not *quite* the same as finding the lowest one that will still convert exactly to the original float
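A quick illustration of the difference, using just the stdlib fractions module (the 1000 bound here is an arbitrary guess rather than something derived from the float):

    py> from fractions import Fraction
    py> Fraction(1/3)                        # exact value, already in lowest terms
    Fraction(6004799503160661, 18014398509481984)
    py> Fraction(1/3).limit_denominator(1000)
    Fraction(1, 3)
    py> float(Fraction(1, 3)) == 1/3         # happens to round-trip with this bound
    True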

On 01Jun2015 01:11, random832@fastmail.us <random832@fastmail.us> wrote:
Ah, you want the simplest fraction that _also_ gives the same float representation?
Hmm. Thanks for this clarification. Cheers, Cameron Simpson <cs@zip.com.au> The Design View editor of Visual InterDev 6.0 is currently incompatible with Compatibility Mode, and may not function correctly. - George Politis <george@research.canon.com.au>, 22apr1999, quoting http://msdn.microsoft.com/vstudio/technical/ie5.asp

On Sun, May 31, 2015 at 8:27 PM, David Mertz <mertz@gnosis.cx> wrote:
Here is a (barely tested) implementation based on the Stern-Brocot tree:

    def as_simple_integer_ratio(x):
        x = abs(float(x))
        left = (int(x), 1)
        right = (1, 0)
        while True:
            mediant = (left[0] + right[0], left[1] + right[1])
            test = mediant[0] / mediant[1]
            print(left, right, mediant, test)
            if test == x:
                return mediant
            elif test < x:
                left = mediant
            else:
                right = mediant

    print(as_simple_integer_ratio(41152/263))

The approximations are printed so you can watch the convergence. casevh

On Sun, May 31, 2015 at 8:14 PM, <random832@fastmail.us> wrote:
The gmpy2 library already supports such a method.
gmpy2 uses a version of the Stern-Brocot algorithm to find the shortest fraction that, when converted to a floating point value, will return the same value as the original floating point value. The implementation was originally done by Alex Martelli; I have just maintained it over the years. The algorithm is quite fast. If there is a consensus to add this method to Python, I would be willing to help implement it. casevh

On Wed, Jun 03, 2015 at 09:08:17AM -0700, drekin@gmail.com wrote:
Guido's time machine strikes again:

    py> Fraction(0.1).limit_denominator(1000)
    Fraction(1, 10)
    Fraction.simple_from(Decimal(1) / Decimal(3))
    Fraction(1, 3)
    py> Fraction(Decimal(1)/Decimal(3)).limit_denominator(100)
    Fraction(1, 3)

-- Steve
participants (25)
- Adam Bartoš
- Alexander Walters
- Andrew Barnert
- Cameron Simpson
- Case Van Horsen
- Chris Angelico
- David Mertz
- drekin@gmail.com
- Guido van Rossum
- Jim Witschey
- Joonas Liik
- M.-A. Lemburg
- Mark Lawrence
- Matthias Bussonnier
- Nathaniel Smith
- Nicholas Chammas
- Nick Coghlan
- Oscar Benjamin
- Paul Moore
- random832@fastmail.us
- Stefan Behnel
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy
- u8y7541 The Awesome Person