[Patches] [ python-Patches-1020845 ] Decimal performance enhancements

Thu Sep 9 01:45:40 CEST 2004

Patches item #1020845, was opened at 2004-09-02 10:08
Message generated for change (Comment added) made by ncoghlan
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1020845&group_id=5470

Category: Library (Lib)
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Nick Coghlan (ncoghlan)
Assigned to: Facundo Batista (facundobatista)
Summary: Decimal performance enhancements

Initial Comment:
The attached patch implements a collection of
performance enhancements for decimal.py. They are
currently focused on the performance of the telco
benchmark in the sandbox (i.e. they aim to speed up
instantiation, addition, multiplication and quantization).

The main strategy used (a subclass for NaN and infinite
values) can be extended to encompass more methods in
order to improve performance in other situations.

There is one non-performance related change included -
the '_sign' attribute is renamed to '_isneg' to better
reflect the boolean nature of the property.

----------------------------------------------------------------------

>Comment By: Nick Coghlan (ncoghlan)
Date: 2004-09-09 09:45

Message:
Logged In: YES 
user_id=1038590

2nd version of patch attached. It does NOT use a subclassing
strategy for optimisation due to the following charming
little problem I found with that approach:

>>> import decimal
>>> class foo(decimal.Decimal): pass
>>> x = foo("NaN")
>>> isinstance(x, foo)
False

I couldn't find any sane way to get around this, and it
seemed to be a showstopper given the amount of work that has
gone into making all the builtin types easily subclassable.
The new patch instead relies mainly on the following techniques:
  - defer invocation of getcontext() as late as possible, so
that we only call it if the context is going to be referenced
  - use issubclass(x, basestring) to identify if there are
special cases that need handling, then use _is_nan and
_is_infinity to determine exactly which special case applies
(this slows the special cases down slightly, but speeds up
the normal cases a lot)
  - make _convert_other a helper function instead of a method

There are also some other minor changes that were present in
the original patch (use sum() in __nonzero__, reduce the
number of calls to list.reversed() in __add__, extract
digits directly from integer inputs with divmod, rearrange
_fixexponents to minimise context method calls, cache
results of method calls in various operations)

Facundo, two particular changes I'd like you to OK are:
 - _fix now indents the second call to _fixexponents to be
inside the if block (in CVS, _fix_exponents could get called
again directly on the result of the previous call to
_fix_exponents, which didn't seem to make any sense)
  - _fixexponents returns immediately for NaN as well as
infinite values

This version of patch appears to give around a 40% speedup
on direct execution of test_decimal.py, and all the tests
still pass for me.

----------------------------------------------------------------------

Comment By: Facundo Batista (facundobatista)
Date: 2004-09-08 09:47

Message:
Logged In: YES 
user_id=752496

Nick:

Submit the second version of the patch and I'll review it.
Also feel free to contact me by mail anytime.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-08 07:59

Message:
Logged In: YES 
user_id=80475

Glad to hear you've wrapped your head around the sign bit. 
FWIW, it bugged me too.  But it is so closely tied to the
spec that it is better to adjust our mental model than
change the code.

Do extend your optimizations to the other functions.  Don't
just optimize the benchmark -- that defeats the purpose.  

Can you work together with Facundo on this one?

----------------------------------------------------------------------

Comment By: Nick Coghlan (ncoghlan)
Date: 2004-09-08 07:06

Message:
Logged In: YES 
user_id=1038590

With the _sign/_isneg thing, I figured out the reason it was
hurting my brain - I was failing to think of the value as
representing a two's complement sign bit. Once I thought of
it that way, it made sense with the original terminology.

I suspect the lesser speedup on test_decimal.py may be due
to the fact that the current version of the patch only
optimises the operations used by the telco benchmark
(instantiation, addition, multiplication, quantisation). As
we speed up other operations, the performance on the unit
tests should improve.

As far as the API itself goes, the major differences I am
aware of are the existence of the new subclass, and the use
of __new__ instead of __init__. That will presumably be a
bit more work to implement in C since another class is
needed. Do we need to document somewhere that Decimal(x) may
return an instance of a subclass of Decimal() rather than an
instance of Decimal itself?

My current plan is to produce a second version of this patch
that:

1. Starts again from current CVS
2. Leaves '_sign' alone
3. Leaves existing doc strings alone (I broke a couple of
them in this version of patch - I hadn't really thought
about how to handle them properly)
4. For overridden methods in the _Special_Value class,
duplicate the docstring from the baseclass with "f.__doc__ =
Decimal.f.__doc__". This should avoid giving different
answers for "help(Decimal(1).__add__)" and
"help(Decimal('NaN').__add__)".
5. Optimises all operations, not just the four used by the
telco benchmark (with the new strategy in place, this is
mostly legwork - cutting and pasting the special-value
related stuff to the new subclass, and rearranging
appropriately. Unfortunately, it's legwork that I can't see
a sensible way to automate, so it's a matter of taking the
time to do it, and then checking that the tests all still pass).

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-07 15:05

Message:
Logged In: YES 
user_id=80475

Facundo, do you have time to look at this one?

We need to:

* revert the _isneg variable name change

* find out why it gives a 2:1 gain on telco but only 10% on
test_decimal.py

* make sure the API has not changed in a way that will make
it more difficult someday to convert this into C and make it
a built-in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-02 15:18

Message:
Logged In: YES 
user_id=80475

Nice job.  The timings do improve and the approach is
reasonable.

Please go ahead to the next step and polish it up.

Also, please change _isneg back to _sign.  That matches the
terminology in the spec, the machine centric sign bit
viewpoint of 754R, and the components of the user visible
as_tuple() method.

The goal is to make the patch as minimal as possible, to
keep the API unchanged, and improve performance.

Using the __new__ method was smart.  That more closely
corresponds to a potential C implementation of an immutable
type.

----------------------------------------------------------------------

Comment By: Nick Coghlan (ncoghlan)
Date: 2004-09-02 10:11

Message:
Logged In: YES 
user_id=1038590

I forgot to add that this is NOT yet a production quality
patch. The doc strings, in particular, need work. I'm mainly
posting it so Raymond can have a look at it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1020845&group_id=5470