PEP: 2XX
Title: Adding a Decimal type to Python
Version: $Revision:$
Author: mclay@nist.gov (Michael McLay)
Status: Draft
Type: ??
Created: 25-Jul-2001
Python-Version: 2.2
Introduction
This PEP describes the addition of a decimal number type to Python.
Rationale
The original Python numerical model included int, float, and long. By popular request the imaginary type was added to improve support for engineering and scientific applications. The addition of a decimal number type to Python will improve support for business applications and also improve the utility of Python as a teaching language.
The number types currently used in Python are encoded as base-two binary numbers. The base-2 arithmetic used by binary numbers closely approximates the decimal number system, and for many applications the differences in the calculations are unimportant. The decimal number type encodes numbers as decimal digits and uses base-10 arithmetic. This is the number system taught to the general public, and it is the system used by businesses when making financial calculations.
For financial and accounting applications the difference between binary and decimal types is significant. Consequently the computer languages used for business application development, such as COBOL, use decimal types.
The decimal number type meets the expectations of non-computer scientists when making calculations. For these users the rounding errors that occur when using binary numbers are a source of confusion and irritation.
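The confusion is easy to demonstrate. Python's later standard decimal module (added well after this discussion, following the IBM specification mentioned below) realizes the distinction; this illustrates the arithmetic difference only, not the proposed 'd' literal syntax:

```python
from decimal import Decimal

# 0.1 has no finite base-2 representation, so binary floats carry a small
# representation error that surfaces in equality tests and running totals.
assert 0.1 + 0.1 + 0.1 != 0.3                # binary float surprise
# A decimal type stores the digits exactly, so the same sum is exact.
assert Decimal("0.1") + Decimal("0.1") + Decimal("0.1") == Decimal("0.3")
```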
Implementation
The tokenizer will be modified to recognize number literals with a 'd' suffix, and a decimal() function will be added to __builtins__. A decimal number can be used to represent integers and floating point numbers, and decimal numbers can also be displayed using scientific notation. Examples of decimal numbers include:
1234d -1234d 1234.56d -1234.56d 1234.56e2d -1234.56e-2d
The type returned by either a decimal floating point or a decimal integer is the same:
>>> type(12.2d)
<type 'decimal'>
>>> type(12d)
<type 'decimal'>
>>> type(-12d+12d)
<type 'decimal'>
>>> type(12d+12.0d)
<type 'decimal'>
This proposal will also add an optional 'b' suffix to the representation of binary float type literals and binary int type literals.
>>> float(12b)
12.0
>>> type(12.2b)
<type 'float'>
>>> type(float(12b))
<type 'float'>
>>> type(12b)
<type 'int'>
The decimal() conversion function added to __builtins__ will support conversions of strings, and binary types to decimal.
>>> type(decimal("12d"))
<type 'decimal'>
>>> type(decimal("12"))
<type 'decimal'>
>>> type(decimal(12b))
<type 'decimal'>
>>> type(decimal(12.0b))
<type 'decimal'>
>>> type(decimal(123456789123L))
<type 'decimal'>
The conversion functions int() and float() in the __builtin__ module will support conversion of decimal numbers to the binary number types.
>>> type(int(12d))
<type 'int'>
>>> type(float(12.0d))
<type 'float'>
Expressions that mix integers with decimals will automatically convert the integer to decimal and the result will be a decimal number.
>>> type(12d + 4b)
<type 'decimal'>
>>> type(12b + 4d)
<type 'decimal'>
>>> type(12d + len('abc'))
<type 'decimal'>
>>> 3d/4b
0.75
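The same promotion rule can be observed today with the decimal module, where ints coerce to Decimal in mixed expressions; this is an analogy for the proposed 12d + 4b behavior, not the literal syntax itself:

```python
from decimal import Decimal

# Mixed int/decimal arithmetic yields a decimal, mirroring the proposed rule.
result = Decimal(12) + 4
assert type(result) is Decimal

# Division of exact decimal operands gives the exact terminating result.
assert Decimal(3) / 4 == Decimal("0.75")
```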
Expressions that mix binary floats with decimals introduce the possibility of unexpected results because the two number types use different internal representations for the same numerical value. The severity of this problem depends on the application domain. For applications that normally use binary numbers the error may not be important, and the conversion should be done silently. For newbie programmers a warning should be issued so the newbie can locate the source of a discrepancy between the expected results and the results that were achieved. For financial applications the mixing of decimal and binary floating point numbers should raise an exception.
To accommodate the three possible usage models, Python interpreter command-line options will be used to set the level for warning and error messages. The three levels are:

promiscuous mode, -f or --promiscuous
safe mode, -s or --safe
pedantic mode, -p or --pedantic
The default setting will be safe mode. In safe mode, mixing decimal and binary floats in a calculation will trigger a warning message.

>>> type(12.3d + 12.2b)
Warning: the calculation mixes decimal numbers with binary floats
<type 'decimal'>
In promiscuous mode warnings will be turned off.
>>> type(12.3d + 12.2b)
<type 'decimal'>
In pedantic mode the warnings from safe mode will be turned into exceptions.

>>> type(12.3d + 12.2b)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: the calculation mixes decimal numbers with binary floats
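A sketch of the proposed three-level policy. The mode names, the add_mixed helper, and its signature are hypothetical illustrations of the behavior described above, not part of any implementation:

```python
import warnings

PROMISCUOUS, SAFE, PEDANTIC = range(3)  # would be selected by -f / -s / -p

def add_mixed(dec, flt, mode=SAFE):
    """Combine a 'decimal' operand with a binary float under the given mode."""
    msg = "the calculation mixes decimal numbers with binary floats"
    if mode == PEDANTIC:
        raise TypeError(msg)          # pedantic: mixing is an error
    if mode == SAFE:
        warnings.warn(msg)            # safe: mixing triggers a warning
    return dec + flt                  # promiscuous: silent conversion
```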
Semantics of Decimal Numbers
??
Michael's PEP touches upon the one difficult area of decimal semantics: what to do when a decimal and a binary float meet?
We discussed this briefly over lunch here and Tim pointed out that the default should probably be an error: code expecting to work with exact decimals should not be allowed to continue after contamination with an inexact binary float.
But in other contexts it would make more sense to turn mixed operands into inexact, like what currently happens when int/long meets float.
In the IBM model that Aahz is implementing, decimal numbers are not necessarily exact, but (if I understand correctly) you can set a context flag that causes an exception to be raised when the result of an operation on two exact inputs is inexact. This can happen when e.g. a multiplication result exceeds the number of significant digits specified in the context -- then truncation is applied like for binary floats.
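The context flag described here survives in Python's decimal module (built on the same Cowlishaw specification): trapping the Inexact signal turns any operation whose exact result must be rounded into an exception.

```python
from decimal import Decimal, Inexact, localcontext

with localcontext() as ctx:
    ctx.traps[Inexact] = True
    product = Decimal("1.5") * Decimal("2")   # exact result: no signal
    try:
        Decimal(1) / Decimal(3)               # non-terminating: must round
        trapped = False
    except Inexact:
        trapped = True                        # the rounding raised instead
```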
Could the numeric tower look like this?
int < long < decimal < rational < float < complex
*******************************   ***************
             exact                    inexact
A numeric context could contain a flag that decides what happens when exact and inexact are mixed.
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)
Guido van Rossum wrote:
In the IBM model that Aahz is implementing, decimal numbers are not necessarily exact, but (if I understand correctly) you can set a context flag that causes an exception to be raised when the result of an operation on two exact inputs is inexact. This can happen when e.g. a multiplication result exceeds the number of significant digits specified in the context -- then truncation is applied like for binary floats.
Rounding, actually (unless the specified context (which is not the default) requests truncation), but yes.
Michael McLay wrote:
PEP: 2XX
Title: Adding a Decimal type to Python
Version: $Revision:$
Author: mclay@nist.gov (Michael McLay)
Status: Draft
Type: ??
Created: 25-Jul-2001
Python-Version: 2.2
Introduction
This PEP describes the addition of a decimal number type to Python. ...
Implementation
The tokenizer will be modified to recognize number literals with a 'd' suffix and a decimal() function will be added to __builtins__.
How will you be able to define the precision of decimals ? Implicit by providing a decimal string with enough 0s to let the parser deduce the precision ? Explicit like so: decimal(12, 5) ?
Also, what happens to the precision of the decimal object resulting from numeric operations ?
A decimal number can be used to represent integers and floating point numbers and decimal numbers can also be displayed using scientific notation. Examples of decimal numbers include:
...
This proposal will also add an optional 'b' suffix to the representation of binary float type literals and binary int type literals.
Hmm, I don't quite grasp the need for the 'b'... numbers without any modifier will work the same way as they do now, right ?
... Expressions that mix binary floats with decimals introduce the possibility of unexpected results because the two number types use different internal representations for the same numerical value.
I'd rather have this explicit in the sense that you define which assumptions will be made and what issues arise (rounding, truncation, loss of precision, etc.).
The severity of this problem depends on the application domain. For applications that normally use binary numbers the error may not be important and the conversion should be done silently. For newbie programmers a warning should be issued so the newbie can locate the source of a discrepancy between the expected results and the results that were achieved. For financial applications the mixing of decimal and binary floating point numbers should raise an exception. To accommodate the three possible usage models, Python interpreter command-line options will be used to set the level for warning and error messages. The three levels are:

promiscuous mode, -f or --promiscuous
safe mode, -s or --safe
pedantic mode, -p or --pedantic
How about a generic option:
--numerics:[loose|safe|pedantic] or -n:[l|s|p]
The default setting will be safe mode. In safe mode mixing decimal and binary floats in a calculation will trigger a warning message.

>>> type(12.3d + 12.2b)
Warning: the calculation mixes decimal numbers with binary floats
<type 'decimal'>

In promiscuous mode warnings will be turned off.

>>> type(12.3d + 12.2b)
<type 'decimal'>

In pedantic mode warnings from safe mode will be turned into exceptions.

>>> type(12.3d + 12.2b)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: the calculation mixes decimal numbers with binary floats
Semantics of Decimal Numbers
??
On Tuesday 31 July 2001 04:30 am, M.-A. Lemburg wrote:
How will you be able to define the precision of decimals ? Implicit by providing a decimal string with enough 0s to let the parser deduce the precision ? Explicit like so: decimal(12, 5) ?
Would the following work? For literal type definitions the precision would be implicit. For values set using the decimal() function the precision would also be implicit unless an explicit precision is set. The following would all define the same value and precision.
3.40d
decimal("3.40")
decimal(3.4, 2)
Those were easy. How would the following be interpreted?
decimal(3.404, 2)
decimal(3.405, 2)
decimal(3.39999, 2)
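One plausible interpretation, borrowing the decimal module's quantize() as an analogy: the second argument fixes two fractional digits, and a context rounding mode settles the borderline cases (banker's rounding shown here; a financial type might choose differently):

```python
from decimal import Decimal, ROUND_HALF_EVEN

two_places = Decimal("0.01")
assert str(Decimal("3.404").quantize(two_places, ROUND_HALF_EVEN)) == "3.40"
# The exact tie rounds to the even neighbor under banker's rounding.
assert str(Decimal("3.405").quantize(two_places, ROUND_HALF_EVEN)) == "3.40"
assert str(Decimal("3.39999").quantize(two_places, ROUND_HALF_EVEN)) == "3.40"
```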
Also, what happens to the precision of the decimal object resulting from numeric operations ?
Good question. I'm not the right person to answer this, but here's a first stab at what I would expect.
For addition, subtraction, and multiplication the results would be exact, with no rounding of the results. For calculations that include division, the number of digits in a non-terminating result will have to be explicitly set. Would it make sense for this to be defined by the numbers used in the calculation? Could this be set in the module, or could it be global for the application?
What do you suggest?
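For comparison, the decimal module later answered this with a context precision: addition, subtraction, and multiplication of small exact operands stay exact, while division is cut off at the context's digit count, settable globally or per block.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 6                     # digit limit for non-terminating results
    assert Decimal("1.25") + Decimal("2.75") == Decimal("4.00")   # exact
    assert str(Decimal(1) / Decimal(3)) == "0.333333"  # rounded to 6 digits
```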
A decimal number can be used to represent integers and floating point numbers and decimal numbers can also be displayed using scientific notation. Examples of decimal numbers include: ... This proposal will also add an optional 'b' suffix to the representation of binary float type literals and binary int type literals.
Hmm, I don't quite grasp the need for the 'b'... numbers without any modifier will work the same way as they do now, right ?
I made a change to the parsenumber() function in compile.c so that the type of the number is determined by the suffix attached to the number. To retain backward compatibility the tokenizer automatically attaches the 'b' suffix to float and int types if they do not have a suffix in the literal definition.
My original PEP included the definition of a .dp extension and a dpython mode for the interpreter in which the default number type is decimal instead of binary. When the mode is switched the language becomes easier to use for developing applications that use decimal numbers.
Expressions that mix binary floats with decimals introduce the possibility of unexpected results because the two number types use different internal representations for the same numerical value.
I'd rather have this explicit in the sense that you define which assumptions will be made and what issues arise (rounding, truncation, loss of precision, etc.).
Can you give an example of how this might be implemented?
To accommodate the three possible usage models, Python interpreter command-line options will be used to set the level for warning and error messages. The three levels are:

promiscuous mode, -f or --promiscuous
safe mode, -s or --safe
pedantic mode, -p or --pedantic
How about a generic option:
--numerics:[loose|safe|pedantic] or -n:[l|s|p]
Thanks for the suggestion. I'll change it.
Michael McLay wrote:
On Tuesday 31 July 2001 04:30 am, M.-A. Lemburg wrote:
How will you be able to define the precision of decimals ? Implicit by providing a decimal string with enough 0s to let the parser deduce the precision ? Explicit like so: decimal(12, 5) ?
Would the following work? For literal type definitions the precision would be implicit. For values set using the decimal() function the definition would be implicit unless an explicit precision definition is set. The following would all define the same value and precision.
3.40d
decimal("3.40")
decimal(3.4, 2)
Those were easy. How would the following be interpreted?
decimal(3.404, 2)
decimal(3.405, 2)
decimal(3.39999, 2)
I'd suggest following the rules for the SQL definition of DECIMAL(precision, scale).
Also, what happens to the precision of the decimal object resulting from numeric operations ?
Good question. I'm not the right person to answer this, but here's a first stab at what I would expect.
For addition, subtraction, and multiplication the results would be exact, with no rounding of the results. For calculations that include division, the number of digits in a non-terminating result will have to be explicitly set. Would it make sense for this to be defined by the numbers used in the calculation? Could this be set in the module, or could it be global for the application?
What do you suggest?
Well, there are several options. I suspect that the IBM paper on decimal types has good hints as to what the type should do. Again, SQL is probably a good source for inspiration too, since it deals with decimals a lot.
A decimal number can be used to represent integers and floating point numbers and decimal numbers can also be displayed using scientific notation. Examples of decimal numbers include: ... This proposal will also add an optional 'b' suffix to the representation of binary float type literals and binary int type literals.
Hmm, I don't quite grasp the need for the 'b'... numbers without any modifier will work the same way as they do now, right ?
I made a change to the parsenumber() function in compile.c so that the type of the number is determined by the suffix attached to the number. To retain backward compatibility the tokenizer automatically attaches the 'b' suffix to float and int types if they do not have a suffix in the literal definition.
My original PEP included the definition of a .dp extension and a dpython mode for the interpreter in which the default number type is decimal instead of binary. When the mode is switched the language becomes easier to use for developing applications that use decimal numbers.
I see, the small 'b' still looks funny to me though. Wouldn't 1.23f and 25i be more intuitive ?
Expressions that mix binary floats with decimals introduce the possibility of unexpected results because the two number types use different internal representations for the same numerical value.
I'd rather have this explicit in the sense that you define which assumptions will be made and what issues arise (rounding, truncation, loss of precision, etc.).
Can you give an example of how this might be implemented?
You would typically first coerce the types to the "larger" type, e.g. float + decimal -> float + float -> float, so you'd only have to document how the conversion is done and which accuracy to expect.
To accommodate the three possible usage models, Python interpreter command-line options will be used to set the level for warning and error messages. The three levels are:

promiscuous mode, -f or --promiscuous
safe mode, -s or --safe
pedantic mode, -p or --pedantic
How about a generic option:
--numerics:[loose|safe|pedantic] or -n:[l|s|p]
Thanks for the suggestion. I'll change it.
Great.
On Tuesday 31 July 2001 12:36 pm, M.-A. Lemburg wrote:
I'd suggest following the rules for the SQL definition of DECIMAL(precision, scale).
Well, there are several options. I suspect that the IBM paper on decimal types has good hints as to what the type should do. Again, SQL is probably a good source for inspiration too, since it deals with decimals a lot.
Ok, I know about the IBM paper. Is there an online document on the SQL semantics that can be referenced in the PEP?
I see, the small 'b' still looks funny to me though. Wouldn't 1.23f and 25i be more intuitive ?
I originally used 'f' for both the integer and float. The use of 'b' was suggested by Guido. There were two reasons not to use 'i' for integers. The first has to do with how the tokenizer works. It doesn't distinguish between float and int when the token string is passed to parsenumber(). Both float and int are processed by the same function. I could have gotten around this problem by having the switch statement in parsenumber() recognize both 'i' and 'f', but there is another problem with using 'i'. The 25i would be confusing for someone trying to use imaginary numbers. If they accidentally typed 25i instead of 25j they would get an integer instead of an imaginary number. The error might not be detected, since 3.0 + 4i would evaluate properly.
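The suffix dispatch being described might look like the following toy sketch; parse_number here is a hypothetical stand-in for the compile.c change, not the actual code:

```python
def parse_number(token):
    """Classify a numeric literal by its suffix; 'b' (or no suffix) keeps
    the current binary behavior, 'd' selects decimal, 'j' stays imaginary."""
    if token.endswith("d"):
        return ("decimal", token[:-1])
    if token.endswith("j"):
        return ("imaginary", token[:-1])
    if token.endswith("b"):
        token = token[:-1]
    kind = "float" if ("." in token or "e" in token) else "int"
    return (kind, token)
```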
I'd rather have this explicit in the sense that you define which assumptions will be made and what issues arise (rounding, truncation, loss of precision, etc.).
Can you give an example of how this might be implemented?
You would typically first coerce the types to the "larger" type, e.g. float + decimal -> float + float -> float, so you'd only have to document how the conversion is done and which accuracy to expect.
I would be concerned about the float + decimal automatically generating a float. Would it generate an error message if the pedantic flag was set? Would it generate a warning in safe mode?
Also, why do you consider a float to be a "larger" value type than decimal? Do you mean that a float is less precise?
Also, why do you consider a float to be a "larger" value type than decimal? Do you mean that a float is less precise?
(Warning: I think the following is a sound model, but I'm still practicing how to explain it right.)
I have this ordering of the types in mind:
int/long < decimal < rational < float < complex
\--------------------------/    \-------------/
           exact                    inexact
This is different from the Scheme numeric "tower" -- I no longer agree with the Scheme model.
The ordering is only to determine what happens on mixed arithmetic: the result has the rightmost type in the diagram (or a type further on the right in some cases).
The ints are a subset of the decimal numbers, and the decimal numbers (in this view) are a subset of the rational numbers. Ints and decimals aren't closed under division -- the result of division on these (in general) is a rational. While the exact values of floats are a subset of the rationals, the inexactness property (which I give all floats) means that each float stands for an infinite set of numbers *including* the exact value. When a binary operation involves an exact and an inexact operand, the result is inexact.
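The propagation rule in the last sentence is compact enough to sketch; the Num class and its exact flag are our illustration, not a proposed API:

```python
class Num:
    """A number tagged with the exactness property described above."""
    def __init__(self, value, exact=True):
        self.value = value
        self.exact = exact

    def __mul__(self, other):
        # An operation's result is exact only when every operand is exact.
        return Num(self.value * other.value, self.exact and other.exact)

product = Num(3) * Num(0.5, exact=False)   # a float operand taints the result
```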
Tim's "numeric context" contains a bunch of flags controlling detailed behavior of numeric operations. It could specify that mixing exact and inexact numbers is illegal, and that would be Michael's pedantic mode. It could also specify warnings. (I would never call a mode that issues warnings "safe" :-)
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)
Guido van Rossum wrote:
int/long < decimal < rational < float < complex
\--------------------------/    \-------------/
           exact                    inexact
Note that in Cowlishaw's implementation of decimal numbers, decimals are *not* exact. Truncation (rounding), overflow, and underflow errors can occur under addition, subtraction, and multiplication. It's trivial to set them to be unbounded, but then Cowlishaw provides no mechanism for determining the truncation of division.
aahz@rahul.net (Aahz Maruch):
Truncation (rounding), overflow, and underflow errors can occur under addition, subtraction, and multiplication. It's trivial to set them to be unbounded, but then Cowlishaw provides no mechanism for determining the truncation of division.
If you allow for the representation of repeating parts in your unbounded decimals, they could be closed under division. (I think -- does the division of one repeating decimal by another always lead to a third repeating decimal? Yes, it must, because every rational can be expressed as a repeating decimal and vice versa, IIRC. Hmmm, that means we'd just be implementing rationals another way...)
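Greg's equivalence is easy to check with long division: detecting a repeated remainder recovers the repeating block, and the rationals are closed under division. The helper below is our sketch, not anyone's proposed implementation.

```python
from fractions import Fraction

def repeating_decimal(frac):
    """Split a positive Fraction into (integer, non-repeating, repeating)
    digit strings via long division with remainder-cycle detection."""
    n, d = frac.numerator, frac.denominator
    integer, n = divmod(n, d)
    digits, seen = [], {}
    while n and n not in seen:
        seen[n] = len(digits)        # remember where this remainder appeared
        digit, n = divmod(n * 10, d)
        digits.append(str(digit))
    if n:                            # remainder recurred: cycle starts there
        start = seen[n]
        return str(integer), "".join(digits[:start]), "".join(digits[start:])
    return str(integer), "".join(digits), ""   # terminating expansion

# 1/6 = 0.1(6); rationals stay rational under division: (1/3)/(7/9) = 3/7.
assert repeating_decimal(Fraction(1, 6)) == ("0", "1", "6")
assert Fraction(1, 3) / Fraction(7, 9) == Fraction(3, 7)
```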
Greg Ewing, Computer Science Dept,  +--------------------------------------+
University of Canterbury,           | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand           | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz          +--------------------------------------+
Greg Ewing wrote:
aahz@rahul.net (Aahz Maruch):
Truncation (rounding), overflow, and underflow errors can occur under addition, subtraction, and multiplication. It's trivial to set them to be unbounded, but then Cowlishaw provides no mechanism for determining the truncation of division.
If you allow for the representation of repeating parts in your unbounded decimals, they could be closed under division. (I think -- does the division of one repeating decimal by another always lead to a third repeating decimal? Yes, it must, because every rational can be expressed as a repeating decimal and vice versa, IIRC. Hmmm, that means we'd just be implementing rationals another way...)
<shudder> Tell ya what, *you* write the algorithm and I'll think about it sometime in the next five years. ;-)
Aahz Maruch wrote:
Greg Ewing wrote:
aahz@rahul.net (Aahz Maruch):
Truncation (rounding), overflow, and underflow errors can occur under addition, subtraction, and multiplication. It's trivial to set them to be unbounded, but then Cowlishaw provides no mechanism for determining the truncation of division.
If you allow for the representation of repeating parts in your unbounded decimals, they could be closed under division. (I think -- does the division of one repeating decimal by another always lead to a third repeating decimal? Yes, it must, because every rational can be expressed as a repeating decimal and vice versa, IIRC. Hmmm, that means we'd just be implementing rationals another way...)
<shudder> Tell ya what, *you* write the algorithm and I'll think about it sometime in the next five years. ;-)
Decimals are just a different way of displaying Rationals and depending on their preset precision show a different behaviour in numeric operations.
Decimals are just a different way of displaying Rationals and depending on their preset precision show a different behaviour in numeric operations.
Not necessarily. If decimals are used for (in-memory) representation, they can't represent all rationals.
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)
Guido van Rossum wrote:
Decimals are just a different way of displaying Rationals and depending on their preset precision show a different behaviour in numeric operations.
Not necessarily. If decimals are used for (in-memory) representation, they can't represent all rationals.
I know, that's what I wanted to say with "depending on their preset precision"; if you have a decimal with precision 2 then 1/3 will come out as "0.33" -- my point was that by implementing a Rational type instead of a decimal type we can have all semantics of decimals by simply subclassing the Rational type.
This setup will be more flexible than choosing one of the many different variants of "how to add two decimal numbers" (e.g. calculator style, with hidden extra digits, with truncation, mathematical rounding, (one of the various) financial rounding modes, etc.).
[Marc]
Decimals are just a different way of displaying Rationals and depending on their preset precision show a different behaviour in numeric operations.
[Guido]
Not necessarily. If decimals are used for (in-memory) representation, they can't represent all rationals.
Different definitions of decimals? Marc was thinking repeating decimals, if I understood him correctly.
Guido van Rossum wrote:
int/long < decimal < rational < float < complex
\--------------------------/    \-------------/
           exact                    inexact
Note that in Cowlishaw's implementation of decimal numbers, decimals are *not* exact. Truncation (rounding), overflow, and underflow errors can occur under addition, subtraction, and multiplication. It's trivial to set them to be unbounded, but then Cowlishaw provides no mechanism for determining the truncation of division.
I know all that. Cowlishaw's numbers are a replacement for float (and for the real/imag parts of complex). I was toying with a *different* idea, more appropriate perhaps for accountants, that guarantees exact results for decimal numbers, giving decimal results when possible, rational otherwise. This is based on my guess about Michael's target audience. Michael makes a big deal of not allowing "decimalness" to be contaminated by mixing with floats. Possibly this could be implemented using a subclass of your decimals that uses unbounded precision but overrides division to produce a rational number if needed.
But it's possible that Michael's audience would in fact be very happy with your/Cowlishaw's decimal numbers.
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)
Guido van Rossum wrote:
Guido van Rossum wrote:
int/long < decimal < rational < float < complex
\--------------------------/    \-------------/
           exact                    inexact
Note that in Cowlishaw's implementation of decimal numbers, decimals are *not* exact. Truncation (rounding), overflow, and underflow errors can occur under addition, subtraction, and multiplication. It's trivial to set them to be unbounded, but then Cowlishaw provides no mechanism for determining the truncation of division.
I know all that. Cowlishaw's numbers are a replacement for float (and for the real/imag parts of complex). I was toying with a *different* idea, more appropriate perhaps for accountants, that guarantees exact results for decimal numbers, giving decimal results when possible, rational otherwise. This is based on my guess about Michael's target audience. Michael makes a big deal of not allowing "decimalness" to be contaminated by mixing with floats. Possibly this could be implemented using a subclass of your decimals that uses unbounded precision but overrides division to produce a rational number if needed.
But it's possible that Michael's audience would in fact be very happy with your/Cowlishaw's decimal numbers.
I believe that Rationals should form the basis for Decimal types -- the different possible semantics can then be added either by subclassing Rationals or by providing some context or specialized methods for different operations, e.g. for special rounding requirements like you have in financial applications.
Note that I implemented mxNumber for just this reason (the float and long integer support is merely a side effect of GMP providing it ;-).
[Guido]
.... I was toying with a *different* idea, more appropriate perhaps for accountants, that guarantees exact results for decimal numbers, giving decimal results when possible, rational otherwise.
My experience with accountants (~5 yrs in another life) is that they're perfectly happy with fixed point decimal. They may make you do things to the 1/100 of a penny & then round in funny ways that favor their bosses, but they want every calculation to come out the same as on their calculator.
- Gordon
On Tuesday 31 July 2001 07:26 pm, Guido van Rossum wrote:
Also, why do you consider a float to be a "larger" value type than decimal? Do you mean that a float is less precise?
(Warning: I think the following is a sound model, but I'm still practicing how to explain it right.)
I have this ordering of the types in mind:
int/long < decimal < rational < float < complex
\--------------------------/    \-------------/
           exact                    inexact
This is different from the Scheme numeric "tower" -- I no longer agree with the Scheme model.
The ordering is only to determine what happens on mixed arithmetic: the result has the rightmost type in the diagram (or a type further on the right in some cases).
The ints are a subset of the decimal numbers, and the decimal numbers (in this view) are a subset of the rational numbers. Ints and decimals aren't closed under division -- the result of division on these (in general) is a rational. While the exact values of floats are a subset of the rationals, the inexactness property (which I give all floats) means that each float stands for an infinite set of numbers *including* the exact value. When a binary operation involves an exact and an inexact operand, the result is inexact.
Hmm, am I understanding your explanation?
Here is a rational expression:
9/4 * 4/3 = 3
With floats this ends up being close, but with rounding errors:

>>> 2.25*1.333333
2.9999992500000001
If this is expressed as a product of 2.25b * 1.333333d the result would be an inexact value. A binary number would be returned, instead of the decimal number 3.
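With exact rational arithmetic the product is exactly 3, while the float version above already truncates 4/3 before multiplying; fractions is used here only to make the exact/inexact contrast concrete:

```python
from fractions import Fraction

assert Fraction(9, 4) * Fraction(4, 3) == 3      # exact: 36/12 reduces to 3
assert 2.25 * 1.333333 != 3                      # 1.333333 truncates 4/3
```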
Tim's "numeric context" contains a bunch of flags controlling detailed behavior of numeric operations. It could specify that mixing exact and inexact numbers is illegal, and that would be Michael's pedantic mode. It could also specify warnings. (I would never call a mode that issues warnings "safe" :-)
Where is Tim's "numeric context" located?
M.-A. Lemburg suggested looking at the SQL specification for Decimal datatypes. A decimal type is also defined as a type in XML Schema. Since this is an XML datatype there isn't a definition for how these numbers are created.
NOTE: All ·minimally conforming· processors ·must· support decimal numbers with a minimum of 18 decimal digits (i.e., with a ·totalDigits· of 18). However, ·minimally conforming· processors ·may· set an application-defined limit on the maximum number of decimal digits they are prepared to support, in which case that application-defined maximum number ·must· be clearly documented. - http://www.w3.org/TR/xmlschema-2/#decimal
Hmm, am I understanding your explanation?
Here is a rational expression:
9/4 * 4/3 = 3
With floats this ends up being close, but with rounding errors:

>>> 2.25*1.333333
2.9999992500000001
If this is expressed as a product of 2.25b * 1.333333d the result would be an inexact value. A binary number would be returned, instead of the decimal number 3.
Correct.
Where is Tim's "numeric context" located?
In his mind. :-)
I believe it is typically global per thread, but that's up to the language binding. A Java binding for Cowlishaw's decimals apparently requires passing in a context as a third argument on each operation.
M.-A. Lemburg suggested looking at the SQL specification for Decimal datatypes. A decimal type is also defined as a type in XML Schema. Since this is an XML datatype there isn't a definition for how these numbers are created.
Do these say anything about semantics under numeric operations? That would seem to be outside the realm of XML and possibly even outside SQL. So I'm not sure how these help.
NOTE: All ·minimally conforming· processors ·must· support decimal numbers with a minimum of 18 decimal digits (i.e., with a ·totalDigits· of 18). However, ·minimally conforming· processors ·may· set an application-defined limit on the maximum number of decimal digits they are prepared to support, in which case that application-defined maximum number ·must· be clearly documented.
I followed the URL and found only external representation issues, nothing that can help us.
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)
On Wednesday 01 August 2001 10:55 am, Guido van Rossum wrote:
I believe it is typically global per thread, but that's up to the language binding. A Java binding for Cowlishaw's decimals apparently requires passing in a context as a third argument on each operation.
I'm trying to understand how a decimal number context would work. Is the context a variable and/or flag that defines the rounding rules and precision of a number when it is used in a calculation? How is it associated with a number or a calculation? The "global per thread" description seems to associate the context with threads. Can the context be altered inside the thread? Is it possible to change the context at different levels in a stackframe?
I would assume there is a default context that will be used until the context is changed. If this is the case, I would expect a default context to be defined at startup.
Would it make sense to have a simple decimal type with no features that can be modified (a fixed context)? This simple type could be extended by deriving a new numerical type from the base decimal type. This base decimal type would be targeted at the newbie user. It would have no surprises. It would have a default precision of 18 and the rules for rounding would emulate the typical hand held calculator. Accountants who need special rounding rules would use a derived type that allowed the default rules to be overridden.
It would be possible to round numbers of the simple base type, but it would be an explicit step to remove insignificant digits. An accounting decimal type might automatically round calculations to the smallest denomination. For instance, an accounting context might have automatically managed the final rounding in the following calculation:
>>> quantity = 6
>>> tax = .06
>>> price = 2.99
>>> total = price * quantity * (1 + tax)
>>> total
19.0164
>>> round(total, 2)
19.02
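Using the decimal module that later entered the standard library as a stand-in for the proposed type, the calculation stays exact and the rounding to the smallest denomination is one explicit step:

```python
from decimal import Decimal, ROUND_HALF_UP

quantity = 6
tax = Decimal('0.06')
price = Decimal('2.99')

total = price * quantity * (1 + tax)
print(total)    # 19.0164 -- exact, no binary representation error

# An "accounting" type might do this automatically; here the
# rounding to cents is explicit.
cents = total.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
print(cents)    # 19.02
```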
M.-A. Lemburg suggested looking at the SQL specification for Decimal datatypes. A decimal type is also defined as a type in XML Schema. Since this is an XML datatype there isn't a definition for how these numbers are created.
Do these say anything about semantics under numeric operations? That would seem to be outside the realm of XML and possibly even outside SQL. So I'm not sure how these help.
You are correct that it doesn't deal with numeric operations. It does define a minimum precision requirement. I am only referencing it here because it is another instance where having a decimal type in Python would be useful and because they have set a minimum requirement. Setting this minimum as a default behavior would probably make newbies comfortable with the language.
NOTE: All ·minimally conforming· processors ·must· support decimal numbers with a minimum of 18 decimal digits (i.e., with a ·totalDigits· of 18). However, ·minimally conforming· processors ·may· set an application-defined limit on the maximum number of decimal digits they are prepared to support, in which case that application-defined maximum number ·must· be clearly documented.
I'm trying to understand how a decimal number context would work. Is the context a variable and/or flag that defines the rounding rules and precision of a number when it is used in a calculation?
Yes -- I believe advanced HP calculators have such functionality. So does IEEE 754 for binary floating point.
How is it associated with a number or a calculation?
As I wrote, this depends on the language binding. I believe it's associated with a calculation, not with a number.
The "global per thread" description seems to associate the context with threads. Can the context be altered inside the thread? Is it possible to change the context at different levels in a stackframe?
Again, this would depend on the language. I believe typically you can change it but it's not stacked (you'd have to do that yourself).
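The decimal module that eventually grew out of these discussions behaves much as described here: the context is global per thread, and changes are not stacked automatically, though localcontext() provides an explicit one-level save/restore. A minimal sketch:

```python
from decimal import Decimal, getcontext, localcontext

# The context is global per thread: changing it affects all
# subsequent operations in this thread.
getcontext().prec = 6
print(Decimal(1) / Decimal(7))      # 0.142857

# Changes are not stacked for you, but localcontext() lets you
# save and restore the context explicitly.
with localcontext() as ctx:
    ctx.prec = 3
    print(Decimal(1) / Decimal(7))  # 0.143

print(Decimal(1) / Decimal(7))      # back to 0.142857
```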
I would assume there is a default context that will be used until the context is changed. If this is the case, I would expect a default context to be defined at startup.
Me too.
Would it make sense to have a simple decimal type with no features that can be modified (a fixed context)?
That's like providing IEEE 754 floating point without the controls. That's what C does, but at times it's painful. For MOST users this would be fine, but for advanced use you need the control, and claiming "IEEE 754 std" is unfair without the controls.
This simple type could be extended by deriving a new numerical type from the base decimal type. This base decimal type would be targeted at the newbie user. It would have no surprises.
It's hard to avoid surprises of the kind (1/3)*3 != 1. My calculator gives 0.99999999, but it's still a surprise. On the other hand for someone who thinks they know how a calculator does it, returning 1 would be the surprise!
What kind of surprises do you specifically want to avoid?
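The calculator surprise described above can be reproduced with the decimal module that later entered the standard library, standing in for the proposed type:

```python
from decimal import Decimal, getcontext

getcontext().prec = 8           # emulate an 8-digit calculator
third = Decimal(1) / Decimal(3)
print(third)                    # 0.33333333
print(third * 3)                # 0.99999999 -- not 1
```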
It would have a default precision of 18 and the rules for rounding would emulate the typical hand held calculator. Accountants who need special rounding rules would use a derived type that allowed the default rules to be overridden.
It would be possible to round numbers of the simple base type, but it would be an explicit step to remove insignificant digits. An accounting decimal type might automatically round calculations to the smallest denomination. For instance, an accounting context might have automatically managed the final rounding in the following calculation:
>>> quantity = 6
>>> tax = .06
>>> price = 2.99
>>> total = price * quantity * (1 + tax)
>>> total
19.0164
>>> round(total, 2)
19.02
Looks good to me. This would be a nice goal to strive for.
M.-A. Lemburg suggested looking at the SQL specification for Decimal datatypes. A decimal type is also defined as a type in XML Schema. Since this is an XML datatype there isn't a definition for how these numbers are created.
Do these say anything about semantics under numeric operations? That would seem to be outside the realm of XML and possibly even outside SQL. So I'm not sure how these help.
You are correct that it doesn't deal with numeric operations. It does define a minimum precision requirement. I am only referencing it here because it is another instance where having a decimal type in Python would be useful and because they have set a minimum requirement. Setting this minimum as a default behavior would probably make newbies comfortable with the language.
Good point.
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)
I'm trying to understand how a decimal number context would work. Is the context a variable and/or flag that defines the rounding rules and precision of a number when it is used in a calculation?
Yes -- I believe advanced HP calculators have such functionality. So does IEEE 754 for binary floating point.
I forgot: also which exceptional conditions raise exceptions, and a bunch of (resettable?) state flags that tell you whether the last calculation did certain things like lose precision or overflow etc.
I know nothing about this stuff, but Tim talks about it a lot and I seem to learn by osmosis. :-)
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)
Guido van Rossum wrote:
Also, why do you consider a float to be a "larger" value type than decimal? Do you mean that a float is less precise?
(Warning: I think the following is a sound model, but I'm still practicing how to explain it right.)
I have this ordering of the types in mind:
    int/long < decimal < rational < float < complex
    ----------------------------/  --------------/
               exact                   inexact
This is different from the Scheme numeric "tower" -- I no longer agree with the Scheme model.
The ordering is only to determine what happens on mixed arithmetic: the result has the rightmost type in the diagram (or a type further on the right in some cases).
Interesting. Here's what I use in mxNumber:
                 mx.Number.Float
                        ^
                        |        --------> Python float
                        |        |    ^
                        |        |    |
                 mx.Number.Rational
                        ^
                        |
    Python long --> mx.Number.Integer
                        ^    ^
                        |    |
                        -------- Python integer
The ints are a subset of the decimal numbers, and the decimal numbers (in this view) are a subset of the rational numbers. Ints and decimals aren't closed under division -- the result of division on these (in general) is a rational. While the exact values of floats are a subset of the rationals, the inexactness property (which I give all floats) means that each float stands for an infinite set of numbers *including* the exact value. When a binary operation involves an exact and an inexact operand, the result is inexact.
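This ordering can be sketched as a small coercion model. Everything here is hypothetical illustration: result_type and is_exact are made-up names, today's Decimal and Fraction classes stand in for the proposed types, and (unlike this sketch) real Decimal and Fraction values do not actually mix in arithmetic.

```python
from decimal import Decimal
from fractions import Fraction

# Hypothetical sketch of the ordering
#     int < decimal < rational < float < complex
# Mixed arithmetic yields the rightmost ("largest") type,
# and exactness survives only if both operands are exact.
TOWER = [int, Decimal, Fraction, float, complex]
EXACT = {int, Decimal, Fraction}

def result_type(a, b):
    """Type a mixed operation on a and b would have in this model."""
    rank = max(TOWER.index(type(a)), TOWER.index(type(b)))
    return TOWER[rank]

def is_exact(a, b):
    """True when the result of a mixed operation would be exact."""
    return type(a) in EXACT and type(b) in EXACT

print(result_type(1, Fraction(1, 3)))  # <class 'fractions.Fraction'>
print(result_type(Decimal('1'), 2.0))  # <class 'float'>
print(is_exact(1, Decimal('1')))       # True
print(is_exact(Fraction(1, 2), 0.5))   # False
```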
Tim's "numeric context" contains a bunch of flags controlling detailed behavior of numeric operations. It could specify that mixing exact and inexact numbers is illegal, and that would be Michael's pedantic mode. It could also specify warnings. (I would never call a mode that issues warnings "safe" :-)
Could you perhaps write this coercion scheme up as informational PEP ? I think it would help a lot as reference to what Python should do and serve well for numeric extension writers as basis for their coercion decisions.
I'm not sure which coercion scheme you are referring to. I'm also overwhelmed with other stuff so I think I'll pass, but I'll gladly help steer a draft by someone else in the right direction.
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)
Michael McLay wrote:
On Tuesday 31 July 2001 12:36 pm, M.-A. Lemburg wrote:
I'd suggest following the rules for the SQL definition of DECIMAL(<precision>, <scale>).
Well, there are several options. I suspect that the IBM paper on decimal types has good hints as to what the type should do. Again, SQL is probably a good source of inspiration too, since it deals with decimals a lot.
Ok, I know about the IBM paper. Is there an online document on the SQL semantics that can be referenced in the PEP?
The spec is not available online since it is an ISO paper (you pay per page), but I did find a review draft dated July 1992 for SQL-92:
http://www.contrib.andrew.cmu.edu/%7Eshadow/sql/sql1992.txt
which has all the relevant information in text format (section 4.4.).
I see, the small 'b' still looks funny to me though. Wouldn't 1.23f and 25i be more intuitive?
I originally used 'f' for both the integer and float. The use of 'b' was suggested by Guido. There were two reasons not to use 'i' for integers. The first has to do with how the tokenizer works. It doesn't distinguish between float and int when the token string is passed to parsenumber(); both float and int are processed by the same function. I could have gotten around this problem by having the switch statement in parsenumber() recognize both 'i' and 'f', but there is another problem with using 'i': 25i would be confusing for someone trying to use imaginary numbers. If they accidentally typed 25i instead of 25j they would get an integer instead of an imaginary number, and the error might not be detected since 3.0 + 4i would evaluate properly.
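The suffix-dispatch idea can be sketched in pure Python. This is not the actual tokenizer patch; parse_literal is a hypothetical helper, and today's Decimal class stands in for the proposed type:

```python
from decimal import Decimal

def parse_literal(text):
    """Hypothetical suffix dispatch for number literals.

    'd' marks a decimal literal; 'j' (imaginary) is already taken,
    which is why 'i' would invite confusion with 25j.
    """
    suffix = text[-1].lower()
    if suffix == 'd':
        return Decimal(text[:-1])
    if suffix == 'j':
        return complex(text)
    # float and int share one path, as in parsenumber()
    try:
        return int(text)
    except ValueError:
        return float(text)

print(parse_literal('1234.56d'))  # 1234.56 (a Decimal)
print(parse_literal('25j'))       # 25j
print(parse_literal('25'))        # 25
```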
Well, if we manage to have a registry of some sort for these number literal modifiers, I think that the issues in which code to choose are secondary.
I'd rather have this explicit in the sense that you define which assumptions will be made and what issues arise (rounding, truncation, loss of precision, etc.).
Can you give an example of how this might be implemented?
You would typically first coerce the types to the "larger" type, e.g. float + decimal -> float + float -> float, so you'd only have to document how the conversion is done and which accuracy to expect.
I would be concerned about the float + decimal automatically generating a float. Would it generate an error message if the pedantic flag was set? Would it generate a warning in safe mode?
Also, why do you consider a float to be a "larger" value type than decimal? Do you mean that a float is less precise?
See Guido's post on this.
"Larger" usually means two things: - more precise - larger range
This does not necessarily make the decision any easier, though, since the two terms don't discriminate numeric values too well, e.g. how do complex and longs fit into the picture? As a result, this becomes a language design question.
Michael McLay wrote:
Those were easy. How would the following be interpreted?
decimal(3.404, 2), decimal(3.405, 2), decimal(3.39999, 2)
[...]
For addition, subtraction, and multiplication the results would be exact, with no rounding of the results. For calculations that include division, the number of digits in a non-terminating result will have to be explicitly set. Would it make sense for this to be defined by the numbers used in the calculation? Could this be set in the module, or could it be global for the application?
This is why Cowlishaw et al require a full context for all operations. At one point I tried implementing things with the context being contained in the number rather than "global" (which actually means thread-global, but I'm probably punting on *that* bit for the moment), but Tim Peters persuaded me that sticking with the spec was the Right Thing until *after* the spec was fully implemented.
After seeing the mess generated by PEP-238, I'm fervently in favor of sticking with external specs whenever possible.
On Tuesday 31 July 2001 12:37 pm, Aahz Maruch wrote:
Michael McLay wrote:
For addition, subtraction, and multiplication the results would be exact, with no rounding of the results. For calculations that include division, the number of digits in a non-terminating result will have to be explicitly set. Would it make sense for this to be defined by the numbers used in the calculation? Could this be set in the module, or could it be global for the application?
This is why Cowlishaw et al require a full context for all operations. At one point I tried implementing things with the context being contained in the number rather than "global" (which actually means thread-global, but I'm probably punting on *that* bit for the moment), but Tim Peters persuaded me that sticking with the spec was the Right Thing until *after* the spec was fully implemented.
After seeing the mess generated by PEP-238, I'm fervently in favor of sticking with external specs whenever possible.
I had originally expected the context for decimal calculations to be the module in which a statement is defined. If a function defined in another module is called, the rules of that other module would be applied to that part of the calculation. My expectations of how Python would work with decimal numbers don't seem to match what Guido said about his conversation with Tim, and what you said in this message.
How can the rules for using decimals be stated so that a newbie can understand what they should expect to happen? We could set a default precision of 17 digits and round all calculations that were not exact to 17 digits. This would match how their calculator works. I would think this would be the model with the least surprises. For someone needing to be more precise, or less precise, how would this rule be modified?
Michael McLay wrote:
I had originally expected the context for decimal calculations to be the module in which a statement is defined. If a function defined in another module is called, the rules of that other module would be applied to that part of the calculation. My expectations of how Python would work with decimal numbers don't seem to match what Guido said about his conversation with Tim, and what you said in this message.
How can the rules for using decimals be stated so that a newbie can understand what they should expect to happen? We could set a default precision of 17 digits and round all calculations that were not exact to 17 digits. This would match how their calculator works. I would think this would be the model with the least surprises. For someone needing to be more precise, or less precise, how would this rule be modified?
I intend to have more discussions with Cowlishaw once I finish implementing his spec, but I suspect his answer will be that whoever calls the module should set the precision.
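"Whoever calls the module should set the precision" maps naturally onto the localcontext() pattern that the decimal module later adopted. A sketch under that assumption, where unit_price is a hypothetical library function that sets no precision of its own:

```python
from decimal import Decimal, localcontext

def unit_price(total, quantity):
    """Library code: does arithmetic, sets no precision itself."""
    return Decimal(total) / quantity

# Caller A wants 4 significant digits...
with localcontext() as ctx:
    ctx.prec = 4
    print(unit_price('10.00', 3))   # 3.333

# ...caller B wants 10, with no change to the library.
with localcontext() as ctx:
    ctx.prec = 10
    print(unit_price('10.00', 3))   # 3.333333333
```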