[Python-Dev] PEP for adding a decimal type to Python

Michael McLay mclay@nist.gov
Thu, 26 Jul 2001 03:40:36 -0400


PEP: XXX
Title: Adding a Decimal type to Python
Version: $Revision:$
Author: mclay@nist.gov <mclay@nist.gov>
Status: Draft
Type: ??
Created: 25-Jul-2001
Python-Version: 2.2


Abstract

    Several PEPs have been written about fixing Python's numerical
    types.  The proposed changes raise issues about breaking backwards
    compatibility in the process. Changing the existing numerical types
    can be avoided by introducing a decimal number type. This change
    will also enhance the utility of Python for several key markets.

    A decimal type is also a natural super-type of both integers and
    floating point numbers.  This makes it an important root type for an
    inheritance tree of numerical types.

    This PEP suggests adding the decimal number type to Python in such
    a way that the existing number types will be the default type for
    .py files and the python command and the new decimal number type
    will be used for .dp files and the dpython command.


Rationale

    Conflicts surface in the discussion of language design when
    programming goals differ.  One example of this is found when
    selecting the best method for interpreting numerical values.  The
    correct answer is dependent on the application domain of the
    software. While Python is very good at providing a simple
    generalized language, it is not an ideal language in all
    cases. 

    For developers of scientific applications, the use of binary
    numbers is often important for performance reasons.  Developers
    of financial applications need to use decimal numbers in order to
    control roundoff errors.  Decimal numbers are also best for newbie
    users because decimal numbers have simpler rules and fewer
    surprises.
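
    The roundoff concern can be illustrated in present-day Python.
    The sketch below uses the decimal module that was later added to
    the standard library as a stand-in for the type proposed here;
    that substitution is an assumption of the example and not part of
    this proposal.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly,
# so the sum of the first two does not compare equal to the third.
float_equal = (0.1 + 0.2 == 0.3)

# Stand-in for the proposed decimal type (assumption: the stdlib
# decimal module, which is not the prototype described in this PEP).
decimal_equal = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(float_equal)    # False
print(decimal_equal)  # True
```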

    The current implementation of numbers in Python is limited to
    binary floating point types (both real and complex) and two types
    of integers.  This makes the language suitable for scientific
    programming.  Python is also suitable for domains which do not
    make use of numerical types. 

    Changing the existing python implementation to use decimal numbers
    as the default type for literals is likely to irritate scientific
    programmers.  Having to use special notation for decimal
    literals will make financial application developers second-class
    citizens.  Both groups can coexist and share compiled modules by
    making the parser of Python sensitive to the context of the
    syntax.  This can be done by adding a new decimal type and then
    selectively changing the definition of default literals (that is,
    literals without a type suffix). In the proposed implementation the
    .py files and the python command would continue to parse numerical
    literals as they currently are interpreted.  The new decimal
    type would be used for number literals for .dp files and the
    dpython command. 


Proposal

    A new decimal type will be added to Python.  The new type
    will be based on the ANSI standard for decimal numbers [1].  The
    proposal will also add two new literal suffixes for representing
    numbers.  A decimal literal will have a 'd' appended to the number
    and a float literal or an integer literal will have an 'f' appended
    to the number. The current '.py' file and the use of the python
    command will continue to use the existing float and integer types
    for the number literals without a suffix.

    The proposed change will add support for a second file type
    with a '.dp' suffix. There will also be an alternative command
    name, 'dpython', for the Python executable.  The decimal number
    type will be used for the interpretation of numerical literals in
    a '.dp' file and when using the 'dpython' command. The following
    examples illustrate the two commands.

    $ ./dpython
    Python 2.2a1 (#87, Jul 26 2001, 11:07:58)
    [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> type(21.2)
    <type 'decimal'>
    >>> type(21.2f)
    <type 'float'>
    >>> type(21f)
    <type 'int'>
    >>> 21.2f
    21.199999999999999
    >>> 21.2
    21.2
    >>> 1f/2f
    0
    >>> 1/2
    0.5
    >>>
    $ ./python
    Python 2.2a1 (#87, Jul 26 2001, 11:07:58)
    [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> type(21.2)
    <type 'float'>
    >>> type(21.2f)
    <type 'float'>
    >>> type(21.2d)
    <type 'decimal'>
    >>> 1/2
    0
    >>> 21.2
    21.199999999999999
    >>> 21.2d
    21.2

    The new decimal type is a "super-type" of float, integer, and
    long, so when decimal math is used there are only decimal
    numbers, regardless of whether a value is an integer or a floating
    point number. Newbies and developers of financial applications
    would use the dpython command and the '.dp' suffix for modules.  
    The language will remain unchanged for existing programs.

    The addition of a decimal type that can be sub-classed may
    eliminate the need to add inheritance to the float or integer
    types.  Inheritance from the float and integer types is likely to
    be challenging. How will the inheritance from the float or integer
    type work?  The definition and implementation of these types are
    dependent on the C compiler used to compile the interpreter.

    By contrast, a new decimal type could be designed to be highly
    customizable.  The type could be implemented like class
    instances with a dictionary that starts out with three members, a
    sign, a coefficient, and an exponent. This basic type could be
    extended with flags that set the type of rounding to be used, or 
    by adding a member that sets the precision of the numbers, or
    perhaps a minimum and maximum value member could be added. 
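
    A minimal sketch of this idea follows, written in Python rather
    than in C.  The class names, the 0/1 sign convention, and the
    truncating behaviour of the precision member are all hypothetical
    choices made for illustration; the prototype does not define them.

```python
class SketchDecimal:
    """Hypothetical: value = (-1)**sign * coefficient * 10**exponent."""

    def __init__(self, sign, coefficient, exponent):
        # The three members the instance dictionary starts out with.
        self.sign = sign                # 0 for positive, 1 for negative
        self.coefficient = coefficient  # non-negative integer
        self.exponent = exponent        # power of ten

    def __repr__(self):
        mark = "-" if self.sign else ""
        return "%s%de%d" % (mark, self.coefficient, self.exponent)


class FixedPrecisionDecimal(SketchDecimal):
    """Hypothetical extension that adds a precision member."""

    def __init__(self, sign, coefficient, exponent, precision=4):
        super().__init__(sign, coefficient, exponent)
        self.precision = precision  # maximum number of coefficient digits
        # Assumption: extra digits are truncated.  A rounding flag
        # could select a different rule here.
        while len(str(self.coefficient)) > self.precision:
            self.coefficient //= 10
            self.exponent += 1


price = SketchDecimal(0, 212, -1)             # represents 21.2
print(sorted(vars(price)))   # ['coefficient', 'exponent', 'sign']
print(FixedPrecisionDecimal(0, 123456, -3))   # 1234e-1
```

    Because the members live in an ordinary instance dictionary, a
    subclass can add a precision, rounding flags, or minimum and
    maximum values without changing the base type.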

    Adding the new file type is also an opportunity to fix some other
    ugliness in Python.  The tab character could be eliminated
    from block indentation.  The default character type could be set to
    Unicode. (In dpython a 'b' would be added to the front of strings
    that are sequences of bytes.)  Using Unicode as the default has
    one important downside.  The change would limit the viewing of
    the '.dp' files to display devices that are Unicode enabled. This
    may have been a problem five years ago. Would it be today? 

    --- need to add other improvement that could be done in dpython ---

Backwards Compatibility

    The proposed change is backward compatible with the existing
    syntax when the python command is used. The new dpython command
    would be used to take advantage of the new language syntax.  The
    python command will have access to the decimal number type and the
    dpython command will have access to the traditional float and
    integer types. Both versions of the language could be used to
    write exactly the same programs that generate exactly the same
    byte code output.  The only difference will be a few syntax
    improvements in the dpython language.


Prototype Implementation

    An implementation of this PEP has been started, but has not been
    completed.  The parsing works as described, and a partial
    implementation of a decimal type has been started.  The prototype
    implementation of the decimal object is sufficient for testing the
    approach of mingling dpython and python.  The design of the
    current implementation does not support sub-classing. This minimal
    implementation of a decimal type could be completed with a day's
    work. The development of an extendable type, as was described
    above, could take place in a later release.

    The interpretation of a number literal that does not have a suffix
    is determined in the parsetok() function.  The function adds a
    'd' or 'f' flag to any numerical literal that does not already
    have a number type suffix. The suffix attached to the numerical
    literal is based on the command used to invoke the parser or the
    suffix of the filename.  The parsenumber() function in the
    compile.c file was modified to key off the number type suffix.
    This type indicator is used in a switch statement for compiling
    the text of the literal into the correct type of number.
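
    The two steps can be sketched in Python as follows.  This is not
    the C code of the prototype: the helper names are hypothetical,
    the standard library Decimal class stands in for the new type,
    and the rule that a filename takes precedence over the command
    name is an assumption, since the text does not say which one
    wins.  Only plain decimal-notation literals are handled; hex
    literals ending in 'd' or 'f' would need extra care.

```python
import os
from decimal import Decimal  # stand-in for the proposed type


def default_suffix(program_name, filename=None):
    """Pick 'd' or 'f' for literals without a suffix (hypothetical)."""
    if filename is not None:
        # Assumption: when a file is parsed, its suffix decides.
        return "d" if filename.endswith(".dp") else "f"
    return "d" if os.path.basename(program_name) == "dpython" else "f"


def tag_literal(text, suffix):
    """Append the default suffix unless the literal already has one."""
    return text if text[-1] in "df" else text + suffix


def parse_number(tagged):
    """Dispatch on the type suffix, as the switch statement would."""
    body, kind = tagged[:-1], tagged[-1]
    if kind == "d":
        return Decimal(body)
    if kind == "f":
        # An 'f' literal is a float or an integer, depending on its form.
        is_float = any(ch in body for ch in ".eE")
        return float(body) if is_float else int(body)
    raise ValueError("unknown number type suffix: %r" % kind)


suffix = default_suffix("./dpython")
for literal in ("21.2", "21.2f", "21f"):
    value = parse_number(tag_literal(literal, suffix))
    print(literal, type(value).__name__)
```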

    The implementation of the decimal type was created by copying the
    complexobject.[hc] files and then doing a global replace of the
    word complex with the word decimal.  The PyDecimal_FromString
    method in decimalobject.c interprets the string encoding of a
    decimal number correctly and populates the data structure that
    contains the sign, coefficient, and exponent values of a decimal
    number. A minimal printing of the decimal number has been enabled.
    It is hard-coded to just print out a scientific notation of the
    number.  The only operator that works properly at this time is the
    negation operator defined in decimal_neg(). The d_sum() and d_prod()
    functions have been started, but they are very broken.  No work has
    been done on implementing the d_quot() function.  The example that
    shows integer division working properly above was done by editing
    the output.  The format of the echoed decimal number was also edited.
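
    The parts that are described as working (string parsing,
    scientific-notation output, and negation) can be sketched in
    Python.  The function names and the tuple layout below are
    hypothetical and do not mirror the C source; the sketch only
    shows one plausible way to fill a sign, coefficient, and exponent
    from the text of a literal.

```python
def decimal_from_string(text):
    """Parse '-21.2' or '5e3' into (sign, coefficient, exponent)."""
    text = text.strip().lower()
    sign = 0
    if text[0] in "+-":
        sign = 1 if text[0] == "-" else 0
        text = text[1:]
    mantissa, _, exp_text = text.partition("e")
    exponent = int(exp_text) if exp_text else 0
    whole, _, fraction = mantissa.partition(".")
    # Dropping the decimal point scales the value up by
    # 10**len(fraction), so the exponent is lowered to compensate.
    coefficient = int(whole + fraction)
    exponent -= len(fraction)
    return (sign, coefficient, exponent)


def negate(value):
    """Flip the sign and leave coefficient and exponent untouched."""
    sign, coefficient, exponent = value
    return (1 - sign, coefficient, exponent)


def to_scientific(value):
    """Format with one digit before the decimal point."""
    sign, coefficient, exponent = value
    digits = str(coefficient)
    adjusted = exponent + len(digits) - 1
    mantissa = digits[0] + ("." + digits[1:] if len(digits) > 1 else "")
    return "%s%se%d" % ("-" if sign else "", mantissa, adjusted)


number = decimal_from_string("21.2")
print(number)                         # (0, 212, -1)
print(to_scientific(number))          # 2.12e1
print(to_scientific(negate(number)))  # -2.12e1
```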

    When a directory in the path contains a '.dp' module and a '.py'
    module with the same module name, the '.dp' module is used.
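
    The lookup rule can be sketched as follows.  The function name
    is hypothetical and the sketch ignores compiled files and
    packages; it only shows the stated preference for '.dp' over
    '.py' within a single directory.

```python
import os
import tempfile


def find_module_file(name, search_path):
    """Return the first matching source file, preferring '.dp'."""
    for directory in search_path:
        for suffix in (".dp", ".py"):  # '.dp' is checked first
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None


# Demonstration in a throwaway directory holding both file types.
with tempfile.TemporaryDirectory() as directory:
    for suffix in (".py", ".dp"):
        open(os.path.join(directory, "ledger" + suffix), "w").close()
    found = find_module_file("ledger", [directory])

print(os.path.basename(found))  # ledger.dp
```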

    The prototype implementation is available at http://www.gencam.org/python
    The implementation has only been tested on Mandrake Linux 8.0.

Known Problems and Questions

    The parsetok.c file was duplicated and renamed to parsetok2.c
    because the pgen program could not resolve the Py_GetProgramName()
    function.  

    The dpython repr() function should probably return a number with a
    suffix of 'd' for decimal types if the module is a '.py' module or
    if the python command is used. Should the repr() function add the
    'f' suffix to float and integer values when accessed from a '.dp'
    module or when the dpython command is used?

Common Objections
 
    Adding a new type results in more rules to remember regarding the
    use of numbers in Python. 

    Response:  

    In general the rules for using the decimal number type will
    be simpler than the rules governing the current set of numerical
    types.  This should make it easier for newbies to learn the
    dpython language.

    The benefits to the users who need a decimal type are significant
    and the added rules will primarily impact these users.  Decimal
    numbers represent decimal fractions such as 0.1 exactly, which is
    essential for some application domains. The decimal number rules
    will tend to simplify the use of python for these applications.

    The types used in an application will most likely be selected to
    match the user's requirements. Crossover between the new decimal
    types and the classic types will be infrequent. For cases where
    types must be mixed the language will be explicit. There will be 
    no automatic coercion between the types.  Exceptions will be
    raised if an explicit conversion isn't used. 
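
    The intended behaviour can be illustrated with the decimal module
    that was later added to the standard library, which happens to
    follow the same rule for arithmetic with floats.  Using it as a
    stand-in is an assumption of this example; the exception type and
    the conversion spelling of the proposed type are not specified
    here.

```python
from decimal import Decimal  # stand-in for the proposed type

price = Decimal("21.2")
rate = 1.5  # binary float

try:
    price * rate  # mixing the types without a conversion
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

# An explicit conversion makes the intent visible in the source.
total = price * Decimal(str(rate))

print(mixed_allowed)  # False
print(total)          # 31.80
```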

    Having two languages will confuse users.

    Response:

    This is unlikely to be a problem because there will rarely be a
    python module that requires both types of numbers.  If number
    types must be mixed in a module the proposed syntax provides an
    easy method to visually distinguish between the different number
    types. When types are mixed the choice between python and dpython
    will probably be dictated by the domain of the application developer.

    The distinction between python and dpython disappears once the
    language syntax has been compiled.  The only problem that might
    occur is in recognizing which language version is being used when
    editing a module.  An IDE can minimize the chances of confusion by
    using different background colors or highlighting schemes to
    distinguish between the versions of the language.  Anyone still
    using vi on a black and white monitor will just have to remember
    the name of the file being edited. (Which is probably how they
    think it should be:-)

    Shouldn't the root numerical type be a rational type?

    Response:

    ???


References

    [1] ANSI standard X3.274-1996.  
        (See http://www2.hursley.ibm.com/decimal/deccode.html)

Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: