[Python-Dev] Decimal issues - Conclusion

Batista, Facundo FBatista at uniFON.com.ar
Fri Jun 4 12:08:26 EDT 2004


People:

Sorry for the delay, but between the moment the discussion on
python-dev settled down and now, my car got stolen, my grandmother
was hospitalized (a stroke), and I was appointed Deployment Manager
of the new SMS Center that the company I work for bought.

Anyway, what follows is a digest of the discussion itself.  The
first three items correspond to the subjects the discussion was
started for.  The other items are subjects that came up on their
own.

A lot of new functionality was requested for inclusion in the PEP,
but we're trying to keep it pure (i.e., sticking to the Spec) at
this stage.  From Tim Peters:

    Piles of "convenience features" should wait until people actually
    use this in real life, so can judge what's truly clumsy based on
    experience instead of speculation.

Unless there are strong feelings against any of these points, my
plan is to include the results in PEP 327 and send it again to Tim
for a re-review.  The only tricky part is that I propose something
new at the end of the first point.

Thank you all for the time and knowledge you invested in this.  Even
the points that didn't make it into this summary added a lot of
value.

.	Facundo



Exponent maximum
----------------

The original post asked whether to keep these maximum exponent
values::

    DEFAULT_MAX_EXPONENT = 999999999
    DEFAULT_MIN_EXPONENT = -999999999
    ABSOLUTE_MAX_EXP = 999999999
    ABSOLUTE_MIN_EXP = -999999999

The general consensus is to keep the artificial limits only if there
are important reasons to do so.  Tim Peters gives us three:

    ...eliminating bounds on exponents effectively means overflow
    (and underflow) can never happen.  But overflow *is* a valuable
    safety net in real life fp use, like a canary in a coal mine,
    giving danger signs early when a program goes insane.

    Virtually all implementations of 854 use (and as IBM's standard
    even suggests) "forbidden" exponent values to encode non-finite
    numbers (infinities and NaNs).  A bounded exponent can do this at
    virtually no extra storage cost.  If the exponent is unbounded,
    then additional bits have to be used instead.  This cost remains
    hidden until more time- and space- efficient implementations are
    attempted.

    Big as it is, the IBM standard is a tiny start at supplying a
    complete numeric facility.  Having no bound on exponent size will
    enormously complicate the implementations of, e.g., decimal sin()
    and cos() (there's then no a priori limit on how many digits of
    pi effectively need to be known in order to perform argument
    reduction).

Edward Loper gives us an example of a case where the limits would be
crossed: probabilities (the product of many probabilities can get
small enough to fall below any fixed exponent minimum).
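
As a rough sketch of that scenario (this is only an illustration: it
uses getcontext() as in the examples below, and the precision and
Emin values are artificially small so the underflow shows up after a
few hundred multiplications)::

    from decimal import Decimal, getcontext

    ctx = getcontext()
    ctx.prec = 9    # small precision, matching the examples below
    ctx.Emin = -99  # an artificially tight lower exponent bound

    p = Decimal(1)
    for _ in range(500):
        p *= Decimal("0.5")  # each factor is a probability of one half
    print(p)                 # prints 0E-107 with these settings: the
                             #   product has underflowed to zero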

That said, Robert Brewer and Andrew Lentvorski want the limits to be
easily modifiable by users.  This is actually quite possible::

    >>> d1 = Decimal("1e999999999")     # at the exponent limit
    >>> d1
    Decimal( (0, (1,), 999999999L) )
    >>> d1 * 10                         # exceeds the limit: infinity
    Decimal( (0, (0,), 'F') )
    >>> getcontext().Emax = 1000000000  # increase the limit
    >>> d1 * 10                         # no longer exceeds it
    Decimal( (0, (1, 0), 999999999L) )
    >>> d1 * 100                        # exceeds it again
    Decimal( (0, (0,), 'F') )

However, note that the absolute maximum and minimum are sometimes
actually used, and those cannot be changed by the user.  This is
what I want to change:

    As long as all the good effects of a maximum can be achieved
    with a modifiable maximum and minimum, I propose not to have
    absolute ones (in the Spec this is optional).


Hash behaviour
--------------

This point was about the behaviour of hash() on Decimals.

The community agrees that if two values are equal, their hashes
should be equal too.  So, since Decimal(25) == 25 is True,
hash(Decimal(25)) should be equal to hash(25).

The detail is that you can NOT compare Decimals to floats or
strings, so we don't need to worry about giving those the same
hashes.

In short::

    hash(n) == hash(Decimal(n))   # Only if n is int, long, or Decimal
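
One practical consequence of that invariant is that a Decimal key
and the equal int key find the same dictionary entry.  A small
sketch, assuming the behaviour agreed above::

    >>> from decimal import Decimal
    >>> hash(Decimal(25)) == hash(25)    # equal values, equal hashes
    True
    >>> prices = {Decimal(25): "gadget"}
    >>> prices[25]                       # the int key finds the Decimal entry
    'gadget'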


New operations
--------------

There were three operations that I failed to include in the PEP:

- ``to-scientific-string``: Converts a number to a string using
  scientific notation if an exponent is needed.

- ``to-engineering-string``: Converts a number to a string using
  engineering notation if an exponent is needed.

- ``to-number``: Converts a string to a number.

The discussion prompted a community wish to change the behaviour of
repr() and str().  Ka-Ping Yee proposes that repr() have the same
behaviour as str(), and Tim Peters proposes that str() behave like
the to-scientific-string operation specified in the Spec.

This is possible, because (from Aahz): "The string form already
contains all the necessary information to reconstruct a Decimal
object".

It also complies with the Spec; Tim Peters:

    There's no requirement to have a method *named* "to_sci_string",
    the only requirement is that *some* way to spell to-sci-string's
    functionality be supplied.  The meaning of to-sci-string is
    precisely specified by the standard, and is a good choice for
    both str(Decimal) and repr(Decimal).

Tim also proposes a method as_tuple() to return the internal triple
tuple, the way repr() behaved until now.

So, in short:

- str() and repr() will behave like the to-scientific-string
  operation.  Because of that, we won't add a specific method for it.

- ``to_eng_string`` will be the name of the method for the
  to-engineering-string operation.

- As long as we can pass a string to create a Decimal using the
  context, we won't add a specific method for the to-number
  operation.

- We'll add a new method to return the internal triple tuple:
  as_tuple().
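
As a sketch of how these decisions look in practice (illustrative
only; the outputs shown are from the decimal module as it exists
today, where repr() wraps the scientific string in Decimal('...')
and as_tuple() returns a named tuple, while the prototype quoted in
this message prints the raw triple instead)::

    >>> from decimal import Decimal
    >>> d = Decimal("123.45e5")
    >>> str(d)                  # to-scientific-string
    '1.2345E+7'
    >>> repr(d)
    "Decimal('1.2345E+7')"
    >>> d.to_eng_string()       # to-engineering-string: exponent a multiple of 3
    '12.345E+6'
    >>> d.as_tuple()            # the internal (sign, digits, exponent) triple
    DecimalTuple(sign=0, digits=(1, 2, 3, 4, 5), exponent=3)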


Create with other Context
-------------------------

This item arose from two sources in parallel.  Ka-Ping Yee proposes
passing the context as an argument at construction time (he wants
the context he passes to be used only at creation time: "It would
not be persistent").  Tony Meyer asks that from_string honor the
context if it receives a parameter "honour_context" set to True (I
don't like that, because the Spec says to honor the context, and I
don't want the method's compliance with the specification to depend
on the value of an argument).

Tim Peters gives us a reason to have a creation method that uses the
context:

    In general number-crunching, literals may be given to high
    precision, but that precision isn't free and *usually* isn't
    needed

Casey Duncan wants to use another method, not a bool arg:

    I find boolean arguments a general anti-pattern, especially given
    we have class methods. Why not use an alternate constructor like
    Decimal.rounded_to_context("3.14159265").

In the process of deciding on the syntax for that, Tim came up with
a better idea: he proposes not having a method in Decimal to create
with a different context, but instead having a method in Context
that creates a Decimal instance.  Basically, instead of::

    D.using_context(number, context)

it will be::

    context.create_decimal(number)

From Tim:

    While all operations in the spec except for the two to-string
    operations use context, no operations in the spec support an
    optional local context.  That the Decimal() constructor ignores
    context by default is an extension to the spec.  We must supply a
    context-honoring from-string operation to meet the spec.  I
    recommend against any concept of "local context" in any operation
    -- it complicates the model and isn't necessary.  

So, we decided to use a Context method to create a Decimal that will
use that particular context only at creation time (for further
operations it will use the thread's context).  But what should that
method be named?

I proposed the name create_decimal().  Tim Peters proposes three
methods to create from diverse sources::

    context.from_string(a_string)
    context.from_int(an_int_or_long)
    context.from_float(a_float)

Michael Chermside says to use a single method without caring about
the data type: "The name just fits my brain.  The fact that it uses
the context is obvious from the fact that it's a Context method".  I
think that's fine, because a newbie will not be using the creation
method of Context (the separate method in Decimal to construct from
float is there just to protect newbies from binary floating point
issues).

So, in short, if you want to create a Decimal instance using a
particular context (which will be used only at creation time and not
afterwards), you'll have to use a method of that context::

    mycontext.create_decimal(n)  # where n is any data type accepted
                                 #   by Decimal(n), plus float.

Example::

    >>> # create a standard decimal instance
    >>> Decimal("11.2233445566778899")  
    Decimal( (0, (1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9), -16) )
    >>>
    >>> # create a decimal instance using the thread context
    >>> thread_context = getcontext()
    >>> thread_context.prec
    9
    >>> thread_context.create_decimal("11.2233445566778899")
    Decimal( (0, (1, 1, 2, 2, 3, 3, 4, 4, 6), -7L) )
    >>>
    >>> # create a decimal instance using other context
    >>> other_context = thread_context.copy()
    >>> other_context.prec = 4
    >>> other_context.create_decimal("11.2233445566778899")
    Decimal( (0, (1, 1, 2, 2), -2L) )
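
To make the "only at creation time" point concrete, here is a small
additional sketch (illustrative only; it uses today's decimal
module, so the repr style differs from the prototype output above)::

    from decimal import Decimal, getcontext

    getcontext().prec = 9
    other_context = getcontext().copy()
    other_context.prec = 4

    d = other_context.create_decimal("11.2233445566778899")
    print(d)                        # 11.22 -- rounded by other_context
                                    #   at creation time
    print(d + Decimal("1.000001"))  # 12.220001 -- this addition used
                                    #   the thread context (prec 9),
                                    #   not other_context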


New method in Context
---------------------

Paul Moore wants a copy method in Context, so I'll add copy() and
__copy__ methods to Context (but not to Decimal: it's immutable, so
it doesn't need one).
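
A quick sketch of what that gives us (illustrative only; the printed
precision is just whatever the thread context happens to have)::

    from decimal import getcontext

    workctx = getcontext().copy()  # an independent copy of the thread context
    workctx.prec = 4               # tweaking the copy...
    print(getcontext().prec)       # ...leaves the thread context untouched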


Extracting a method from Decimal
--------------------------------

Tim Peters doesn't want round() because it's not in the Spec and you
can achieve the same result with quantize()::

    >>> d = Decimal("12345.6789")
    >>> dimes = Decimal('0.1')
    >>> print d.quantize(dimes)
    12345.7
    >>> print d.quantize(Decimal('1e1'))
    1.235E+4
    >>>

So, as long as the PEP is about implementing the Spec, I'll drop
round().
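
For those who still want round()-like behaviour, a tiny helper on
top of quantize() is easy to write.  This is only an illustration
(decimal_round is a made-up name, not something proposed for the PEP
or the module)::

    from decimal import Decimal

    def decimal_round(d, ndigits):
        """Round a Decimal to ndigits places after the decimal point."""
        return d.quantize(Decimal("1E%d" % -ndigits))

    print(decimal_round(Decimal("12345.6789"), 2))   # 12345.68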


