
Several people have pointed me at this interesting thread, and both Tim
and Raymond have sent me summaries of their arguments. Thank you all!
I see various things I have written have caused some confusion, for
which I apologise.

The 'right' answer might, in fact, depend somewhat on the programming
language, as I'll try and explain below, but let me first try and
summarize the background of the decimal specification which is on my
website at:

  http://www2.hursley.ibm.com/decimal/#arithmetic

Rexx
----
Back in 1979/80, I was writing the Rexx programming language, which has
always had (only) decimal arithmetic. In 1980, it was used within IBM
in over 40 countries, and had evolved a decimal arithmetic which worked
quite well, but had some rather quirky arithmetic and rounding rules --
in particular, the result of an operation had a number of decimal places
equal to the larger of the number of decimal places of its operands.

Hence 1.23 + 1.27 gave 2.50 and 1.230000 + 1.27 gave 2.500000.

This had some consequences that were quite predictable, but were
unexpected by most people. For example, 1.2 x 1.2 gave 1.4, and you had
to suffix a 0 to one of the operands (easy to do in Rexx) to get an
exact result: 1.2 x 1.20 => 1.44.

By 1981, much of the e-mail and feedback I was getting was related to
various arithmetic quirks like this. My design strategy for the
language was more-or-less to 'minimise e-mail' (I was getting 350+ every
day, as there were no newsgroups or forums then) -- and it was clear
that the way to minimise e-mail was to make the language work the way
people expected (not just in arithmetic).

I therefore 'did the research' on arithmetic to find out what it was
that people expected (and it varies in some cases, around the world),
and then changed the arithmetic to match that. The result was that
e-mail on the subject dropped to almost nothing, and arithmetic in Rexx
became a non-issue: it just did what people expected.
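
For readers who want to see the early decimal-places rule in action,
here is a rough sketch in Python that mimics it. The helper name
old_rule is hypothetical, and the rounding mode is an assumption (only
the decimal-places rule itself is described above); the four examples
come out the same whether the excess digits are rounded or truncated.

```python
import operator
from decimal import Decimal, ROUND_HALF_UP

def old_rule(op, x, y, rounding=ROUND_HALF_UP):
    """Compute op(x, y), then keep as many decimal places as the
    operand with the most decimal places (the early rule described
    above). The rounding mode is an assumption, not a documented fact."""
    a, b = Decimal(x), Decimal(y)
    places = max(-a.as_tuple().exponent, -b.as_tuple().exponent, 0)
    exact = op(a, b)  # exact for short operands like these
    return exact.quantize(Decimal(1).scaleb(-places), rounding=rounding)

print(old_rule(operator.add, "1.23", "1.27"))      # 2.50
print(old_rule(operator.add, "1.230000", "1.27"))  # 2.500000
print(old_rule(operator.mul, "1.2", "1.2"))        # 1.4
print(old_rule(operator.mul, "1.2", "1.20"))       # 1.44
```
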
The strongest feature of this arithmetic is, I think, that "what you see
is what you've got" -- there are no hidden digits, for example. Indeed,
in at least one Rexx interpreter the numbers are, literally, character
strings, and arithmetic is done directly on those character strings
(with no conversions or alternative internal representation).

I therefore feel, quite strongly, that the value of a literal is, and
must be, exactly what appears on the paper. And, in a language with no
constructors (such as Rexx), and unlimited precision, this is
straightforward. The assignment

  a = 1.10000001

is just that; there's no operation involved, and I would argue that
anyone reading that and knowing the syntax of a Rexx assignment would
expect the variable a to have the exact value of the literal (that is,
"say a" would then display 1.10000001).

The Rexx arithmetic does have the concept of 'context', which mirrors
the way people do calculations on paper -- there are some implied rules
(how many digits to work to, etc.) beyond the sum that is written down.
This context, in Rexx, "is used to change the way in which arithmetic
operations are carried out", and does not affect other operations (such
as assignment).

Java
----
So what should one do in an object-oriented language, where numbers are
objects? Java is perhaps a good model, here. The Java BigDecimal class
originally had only unlimited precision arithmetic (the results of
multiplies just got longer and longer) and only division had a mechanism
to limit (round) the result in some way, as it must.

By 1997, it became obvious that the original BigDecimal, though elegant
in its simplicity, was hard to use.
We (IBM) proposed various improvements and built a prototype:

  http://www2.hursley.ibm.com/decimalj/

and this eventually became a formal Java Specification Request:

  http://jcp.org/aboutJava/communityprocess/review/jsr013/index.html

which led to the extensive enhancements in BigDecimal that were shipped
last year in Java 5:

  http://java.sun.com/j2se/1.5.0/docs/api/java/math/BigDecimal.html

In summary, for each operation (such as a.add(b)) a new method was added
which takes a context: a.add(b, context). The context supplies the
rounding precision and rounding mode. Since the arguments to an
operation can be of any length (precision), the rounding rule is simple:
the operation is carried out as though to infinite precision and is then
rounded (if necessary). This rule avoids double-rounding.

Constructors were not a point of debate. The constructors in the
original BigDecimal always gave an exact result (even when constructing
from a binary double) so those were not going to change. We did,
however, almost as an afterthought, add versions of the constructors
that took a context argument.

The model, therefore, is essentially the same as the Rexx one: what you
see is what you get. In Java, the assignment:

  BigDecimal a = new BigDecimal("1.10000001");

ends up with a having an object with the value you see in the string,
and for it to be otherwise one would have to write:

  BigDecimal a = new BigDecimal("1.10000001", context);

which gives a very nice clue that something may happen to the value.
This, to me, seems a clean design.

So why does my specification appear to say something different?
---------------------------------------------------------------
Both the languages described so far support arbitrary-length decimal
numbers. Over the past five years, however, I have been concentrating
more on fixed-length decimals, as in languages such as C# and as will be
in C and C++ and in hardware.
When the representation of a decimal number has a fixed length, then the
nice clean model of a one-to-one mapping of a literal to the internal
representation is no longer always possible. For example, the IEEE 754r
proposed decimal32 format can represent a maximum of 7 decimal digits in
the significand. Hence, the assignment:

  decimal32 d = 1.10000001;

(in some hypothetical C-like language) cannot result in d having the
value shown in the literal. This is the point where language history or
precedent comes in: some languages might quietly round at this point,
others might give a compile-time warning or error (my preference, at
least for decimal types). Similar concerns apply when the conversion to
internal form causes overflow or underflow.

The wording in the specification was intended to allow for these kinds
of behaviors, and to allow for explicit rounding, using a context, when
a string is converted to some internal representation. It was not
intended to restrict the behavior of (for example) the Java constructor:
one might consider that constructor to be working with an implied
context which has an infinite precision.

In other words, the specification does not attempt to define where the
context comes from, as this would seem to be language-dependent. In
Java, any use of a programmer-supplied context is explicit and visible,
and if none is supplied then the implied context has UNLIMITED
precision. In Rexx, the context is always implicit -- but there is no
'conversion from string to number' because numbers _are_ strings.

So what should Python do?
-------------------------
Since your Decimal class has the ability to preserve the value of a
literal exactly, my recommendation is that that should be the default
behavior. Changing the value supplied as a literal without some
explicit syntax is likely to surprise, given the knowledge that there
are no length restrictions in the class.
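
To make the distinction concrete, here is a small sketch using Python's
standard decimal module, whose Decimal constructor is exact and whose
Context.create_decimal method converts under a context. The 7-digit
precision is chosen only to echo decimal32, and the variable names are
illustrative. Trapping Inexact is one possible run-time stand-in for the
"warning or error" option mentioned above, not something the
specification mandates.

```python
from decimal import Context, Decimal, Inexact, localcontext

# An implicit 7-digit context: construction is exact, arithmetic is not.
with localcontext() as ctx:
    ctx.prec = 7
    a = Decimal("1.10000001")   # literal preserved: all nine digits kept
    b = a + 0                   # arithmetic result rounded to 7 digits

# Explicit, visible rounding on conversion (compare the Java form
# new BigDecimal("1.10000001", context)).
seven = Context(prec=7)
c = seven.create_decimal("1.10000001")

# A stricter context that refuses a lossy conversion instead of rounding.
strict = Context(prec=7, traps=[Inexact])
try:
    strict.create_decimal("1.10000001")
    lossy_conversion_rejected = False
except Inexact:
    lossy_conversion_rejected = True

print(a)                          # 1.10000001
print(b)                          # 1.100000
print(c)                          # 1.100000
print(lossy_conversion_rejected)  # True
```
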
My view is that only in the case of a fixed-length destination or an
explicit context might such a rounding be appropriate.

Given that Python has the concept of an implicit decimal context, I can
see why Tim can argue that the implicit context is the context which
applies for a constructor. However, perhaps you can define that the
implicit context 'only applies to arithmetic operations', or some such
definition, much as in Rexx?

And I should clarify my specification to make it clear that preserving
the value, if possible, is preferable to rounding -- any suggestions for
wording?

Mike Cowlishaw

[I'll try and follow this thread in the mailing list for a while, but I
am flying to the USA on Monday so my e-mail access will be erratic for
the next week or so.]