
Hi!

I've been working on the C decimal project during GSoC 2006. After a year of idling (I had an extremely busy first year at university, but well, most of us are extremely busy), I decided that I will handle further development (there is still much development needed, and updating to the most recent standard is just the beginning). I understand that the chances of merging C Decimal with the main tree are much lower than a year ago, so I would like to know if there is still interest in a C version of Decimal. If so, should I write a PEP, or just code and 'we'll see later'?

I've made a little benchmark - given a loan amount, interest rate and the number of months it's meant to be paid off in, find the minimal monthly loan cost (It's just the first idea that came to me for using Decimal in a financial 'application' :>) [This solution has complexity O(log(amount) * months), which is far from optimal, but it's meant to benchmark Decimal, not Python itself].

Code:

from _decimal import *
import sys

gc = getcontext()

def check(loan, percent, monthly):
    ret = 0
    mult = 1 + (percent / 1200)
    if (loan - monthly) * mult >= loan:
        return -1   # you cannot pay off the loan ;(
    while loan > 0:
        loan = loan - monthly
        loan = loan * mult
        ret += 1
    return ret

def minimize_monthly(loan, percent, months):
    lower = Decimal(0)
    upper = Decimal(loan)
    while upper > lower + Decimal("1e-3"):
        mid = (upper + lower) / 2
        ret = check(loan, percent, mid)
        if ret > months or ret == -1:
            lower = mid
        else:
            upper = mid
    return lower

gc.prec = int(sys.argv[4])
gc.rounding = ROUND_UP
print minimize_monthly(Decimal(sys.argv[1]), Decimal(sys.argv[2]), int(sys.argv[3]))

and timings (a 1 million loan, over 15 years, 2% yearly interest, and precision = 10 :>):

mateusz@MatLaps:~/programy/python/decimal/decimal-c$ time ../../pyth/python/python loan.py 1000000 2 180 10
6424.37955

real    0m0.068s
user    0m0.064s
sys     0m0.004s

mateusz@MatLaps:~/programy/python/decimal/decimal-c$ time ../../pyth/python/python loan2.py 1000000 2 180 10
6424.37955

real    0m2.168s
user    0m2.148s
sys     0m0.016s

Please don't misunderstand me - I don't want to show that the Python Decimal is slow, I want to show that the C Decimal is worth the effort. I am also aware of the simplicity of this benchmark. (This Python has of course been compiled with -O3.)

Best regards,
Mateusz Rukowicz.

+1 from me. If you update it to the most recent Decimal standard I think it's worth it. Anyone else agree?

On 10/15/07, Mateusz Rukowicz <mateusz.rukowicz@vp.pl> wrote:
[...] I understand that the chances of merging C Decimal with the main tree are much lower than a year ago, so I would like to know if there is still interest in a C version of Decimal. If so, should I write a PEP, or just code and 'we'll see later'? [...]

On 10/15/07, Mateusz Rukowicz <mateusz.rukowicz@vp.pl> wrote:
[...] I would like to know if there is still interest in a C version of Decimal. If so, should I write a PEP, or just code and 'we'll see later'?
I'd be happy to see decimal.py replaced by a C version giving essentially the same functionality. I think the current decimal would certainly benefit from a speedup: it's probably `fast enough' for many applications, but it suffers horribly at high precisions: addition of two n-digit decimals takes time quadratic in n, for example. Mark

Mark Dickinson wrote:
On 10/15/07, Mateusz Rukowicz <mateusz.rukowicz@vp.pl> wrote:
[...] I would like to know if there is still interest in a C version of Decimal. If so, should I write a PEP, or just code and 'we'll see later'?
I'd be happy to see decimal.py replaced by a C version giving essentially the same functionality. I think the current decimal would certainly benefit from a speedup: it's probably `fast enough' for many applications, but it suffers horribly at high precisions: addition of two n-digit decimals takes time quadratic in n, for example.
Well, I am pretty sure that addition works in linear time in the Python version :>.

Speaking of performance in high-precision computation - a year ago that was my aim, to make high-precision computation fast, but now I see it differently. That is, I am not really convinced that adding 2k+ lines just to make multiplication fast (some FFT-like multiplication) is worth it - it depends on how many people would like to perform computations with precision around 100k+ digits ;). So at the moment, pretty much everything will work in asymptotically the same time as in the Python version, but faster by some constant (which is quite big anyway). That will make the C version suitable for medium-precision computation (let's say 1000-10k digits ;>) - of course, if there is demand for high-speed high precision, I would love to implement that (that's what I wanted at the beginning), but in that case the project would really overgrow (it is already really big for one C file ;).

All I am saying is that I want the case of many low-precision arithmetic operations to work fast (just like my tiny and pretty useless benchmark ;P - but I was surprised by the result ;>) - and because I write in C, it is not really hard ;).

PS I think the Py and C versions of Decimal may live together and cooperate ;>.
PPS Py Decimal uses Python multiplication, so that one is asymptotically better than mine.

Thanks for the opinion :)

On 10/16/07, Mateusz Rukowicz <mateusz.rukowicz@vp.pl> wrote:
Well, I am pretty sure that addition works in linear time in the Python version :>.
Unfortunately not. Here are some timings from my laptop:
>>> from timeit import Timer
>>> Timer("x+x", "from decimal import Decimal; x = Decimal('1'*5000)").timeit(100)
8.1198890209197998
>>> Timer("x+x", "from decimal import Decimal; x = Decimal('1'*10000)").timeit(100)
29.203926086425781
>>> Timer("x+x", "from decimal import Decimal; x = Decimal('1'*20000)").timeit(100)
109.60141491889954
>>> Timer("x+x", "from decimal import Decimal; x = Decimal('1'*40000)").timeit(100)
492.15129995346069
I'm almost sure that adding 40000 digit numbers together is not what Decimal was intended to be used for, but it still seems unreasonable that it takes almost 5 seconds to do such an addition. The reason for the quadratic behaviour is that almost all the arithmetic routines in decimal.py, at some point, convert the coefficient of their argument(s) from a tuple of digits to a Python integer, and then do the reverse conversion to get a Decimal result; both of these conversions (tuple of digits <-> integer) take time quadratic in the size of the tuple/integer. This means that multiplication of Decimals is also quadratic time, even though it makes use of Python's subquadratic Karatsuba multiplication.

The alternative would be to implement addition digit-by-digit in decimal.py; this would be asymptotically linear but would be much slower for the low precision ( < 50 digits, say) decimals that almost everybody is going to be using in practice---clearly not a good tradeoff.

So one attraction of the C version of decimal is that with little effort it gets the best of both worlds: addition *is* just digit-by-digit (well, okay, limb-by-limb) but since it's coded in C it's fast enough for regular users. So you get fast addition in the usual case, with good asymptotics.
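A minimal sketch of the tuple <-> integer round-trip described above - this is not the actual decimal.py source, just an illustration of why those conversions dominate the cost of an addition:

def add_coefficients(a_digits, b_digits):
    # decimal.py-style coefficients: tuples of digits, most significant first
    a = int(''.join(map(str, a_digits)))   # tuple of digits -> int: quadratic in the digit count
    b = int(''.join(map(str, b_digits)))
    total = a + b                          # the big-integer addition itself is linear
    return tuple(map(int, str(total)))     # int -> tuple of digits: quadratic again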
Speaking of performance in high-precision computation - a year ago that was my aim, to make high-precision computation fast, but now I see it differently. That is, I am not really convinced that adding 2k+ lines just to make multiplication fast (some FFT-like multiplication) is worth it - it depends on how many people would like to perform computations with precision around 100k+ digits ;).
I quite agree: people who want high-speed high-precision arithmetic should probably be using some binary floating-point package like GMP or MPFR. It's difficult to believe that there's much of a market for high-precision decimal floating point. Mark

Mark Dickinson wrote:
I'm almost sure that adding 40000 digit numbers together is not what Decimal was intended to be used for, but it still seems unreasonable that it takes almost 5 seconds to do such an addition. The reason for the quadratic behaviour is that almost all the arithmetic routines in decimal.py, at some point, convert the coefficient of their argument(s) from a tuple of digits to a Python integer, and then do the reverse conversion to get a Decimal result; both of these conversions (tuple of digits <-> integer) take time quadratic in the size of the tuple/integer. This means that multiplication of Decimals is also quadratic time, even though it makes use of Python's subquadratic Karatsuba multiplication.
Oh right, my mistake :> -- I looked at the Python code, but I forgot about the conversion ;).

On 10/16/07, Mark Dickinson <dickinsm@gmail.com> wrote:
The alternative would be to implement addition digit-by-digit in decimal.py; this would be asymptotically linear but would be much slower for the low precision ( < 50 digits, say) decimals that almost everybody is going to be using in practice---clearly not a good tradeoff.
There is another alternative, which is to use integers exclusively for both representation and arithmetic, and only compute an explicit digit tuple or string in special cases. I'm doing this in mpmath (http://code.google.com/p/mpmath/), which is about 10x faster than decimal.py at low precision (and of course asymptotically much faster). However, a significant part of that improvement may not be possible to achieve when the rounding has to be done in decimal instead of binary.

A more radical proposal would be to change Python's long type to use a radix-10**n representation (Python 3000 or beyond?). An implementation of decimal floating-point arithmetic on top of it, whether written in C or pure Python (if some utility C functions, such as one for counting the number of digits in an integer, were available), would be both light-weight and efficient at high precision.

Fredrik
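For concreteness, a rough sketch (my own, not mpmath's actual code) of the integer-only idea: a decimal value is held as a (coefficient, exponent) pair with the coefficient a plain Python int, and digit strings are only produced on demand:

def dec_add(a, b, prec):
    # a and b are (coefficient, exponent) pairs; the result is truncated
    # (ROUND_DOWN) to at most `prec` significant digits, using only ints
    (ca, ea), (cb, eb) = a, b
    if ea > eb:                        # align both coefficients to the smaller exponent
        ca *= 10 ** (ea - eb)
        ea = eb
    elif eb > ea:
        cb *= 10 ** (eb - ea)
    c = ca + cb
    ndigits = len(str(abs(c)))         # digit count; a small C helper would avoid this str()
    if ndigits > prec:
        shift = ndigits - prec
        c //= 10 ** shift              # the rounding step is where the big division shows up
        ea += shift
    return (c, ea)

def dec_str(x):
    c, e = x
    return '%dE%+d' % (c, e)           # only build a digit string when one is asked for

# 1.5 + 0.25 truncated to 28 digits: (15, -1) + (25, -2) -> (175, -2), i.e. 1.75
print dec_str(dec_add((15, -1), (25, -2), 28))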

On 10/16/07, Fredrik Johansson <fredrik.johansson@gmail.com> wrote:
There is another alternative, which is to use integers exclusively for both representation and arithmetic, and only compute an explicit digit tuple or string in special cases. I'm doing this in mpmath (http://code.google.com/p/mpmath/), which is about 10x faster than decimal.py at low precision (and of course asymptotically much faster). However, a significant part of that improvement may not be possible to achieve when the rounding has to be done in decimal instead of binary.
This is something that I've also thought about on a number of occasions. Rounding would be a major concern here: a simple linear operation (e.g. rounding a 2n-digit number to n digits) would become quadratic, or at best subquadratic with fancy division algorithms. There are also some esoteric functions in the decimal library that require easy access to decimal digits, for example the logical operations logical_and, logical_xor, etc., that would be slowed by this approach. But it may well be that for normal use this would be the best way to go for decimal.py. Perhaps some testing is in order...
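To make the rounding concern concrete, a tiny illustration (my own, with arbitrary numbers): with the coefficient stored as one binary integer, dropping the trailing decimal digits needs a full-width division, while a digit-tuple coefficient only needs slicing:

coeff = 314159265358979323846264338327   # a 30-digit coefficient held as a binary int
k = 15                                   # digits to drop when rounding to 15 digits
q, r = divmod(coeff, 10 ** k)            # the division's cost grows with the whole coefficient

digits = tuple(map(int, str(coeff)))     # the same coefficient as a tuple of digits
kept, dropped = digits[:-k], digits[-k:] # slicing off the tail is linear and cheap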
A more radical proposal would be to change Python's long type to use a radix-10**n representation (Python 3000 or beyond?).
Mightn't this produce significant (constant factor) slowdowns for long performance? As I understand it, the main problem with a base 10**n representation is that, for example, extracting the high and low limbs of the 2-limb result of a 1-limb by 1-limb multiplication requires a division, whereas for integers stored in binary that division can be replaced by bit operations.
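A toy illustration of the limb-splitting point (the limb sizes here are made up for the example, not CPython's actual digit size):

# binary limbs of 15 bits: splitting a two-limb product is a shift and a mask
a, b = 29000, 31000
prod = a * b
hi, lo = prod >> 15, prod & ((1 << 15) - 1)

# decimal limbs of 4 digits: the same split needs an integer division
a, b = 9000, 8000
prod = a * b
hi, lo = divmod(prod, 10 ** 4)

# Both splits are constant time for single limbs; the concern is that the divmod
# is a noticeably more expensive constant inside the inner loops of long
# multiplication and division.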
An implementation of decimal floating-point arithmetic on top of it, whether written in C or pure Python (if some utility C functions, such as one for counting the number of digits in an integer, were available), would be both light-weight and efficient at high precision.
Agreed. But it might be hard to sell a more efficient decimal module at the expense of slower integer arithmetic in core Python. Mark

Mark Dickinson wrote:
On 10/16/07, Fredrik Johansson <fredrik.johansson@gmail.com> wrote:
A more radical proposal would be to change Python's long type to use a radix-10**n representation (Python 3000 or beyond?).
Mightn't this produce significant (constant factor) slowdowns for long performance? As I understand it, the main problem with a base 10**n representation is that, for example, extracting the high and low limbs of the 2-limb result of a 1-limb by 1-limb multiplication requires a division, whereas for integers stored in binary that division can be replaced by bit operations.
An implementation of decimal floating-point arithmetic on top of it, whether written in C or pure Python (if some utility C functions, such as one for counting the number of digits in an integer, were available), would be both light-weight and efficient at high precision.
Agreed. But it might be hard to sell a more efficient decimal module at the expense of slower integer arithmetic in core Python.
My 2 cents -- integer division and modulo (needed for a 10**n radix) are so slow that in some extreme cases an algorithm may slow down by a factor of 4-5, so I guess it's not an option. Not to mention binary logical operations.

On 10/16/07, Mark Dickinson <dickinsm@gmail.com> wrote:
I'm almost sure that adding 40000 digit numbers together is not what Decimal was intended to be used for, but it still seems unreasonable that it takes almost 5 seconds to do such an addition. The reason for the quadratic behaviour is that almost all the arithmetic routines in decimal.py, at some point, convert the coefficient of their argument(s) from a tuple of digits to a Python integer, and then do the reverse conversion to get a Decimal result; both of these conversions (tuple of digits <-> integer) take time quadratic in the size of the tuple/integer.
Instead of (or in addition to) porting to C, wouldn't it be better to improve the conversion algorithm? Radix conversion can be done in O(n log**2 n) time using a divide and conquer algorithm. Such an algorithm can be found at the link below (search for "Radix conversion"): http://people.cis.ksu.edu/~rhowell/calculator/comparison.html -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises LLC

On 10/16/07, Daniel Stutzbach <daniel.stutzbach@gmail.com> wrote:
On 10/16/07, Mark Dickinson <dickinsm@gmail.com> wrote:
the reverse conversion to get a Decimal result; both of these conversions (tuple of digits <-> integer) take time quadratic in the size of the tuple/integer.
Instead of (or in addition to) porting to C, wouldn't it be better to improve the conversion algorithm?
I certainly wouldn't mind seeing subquadratic integer -> string and string -> integer conversions in core Python. But I guess somebody's got to write them, and then persuade the powers-that-be that the extra code and complexity are worth it... I'm not volunteering right now :). Though I do have Python string <-> integer code that's faster than the builtins from around 8000 decimal digits onwards, if anyone wants to use it as a starting point... The desire for fast integer -> string has also come up on the bug tracker recently: see http://bugs.python.org/issue1746088
Radix conversion can be done in O(n log**2 n) time using a divide and conquer algorithm.
Isn't there a log(log n) missing here? In any case, it seems to me that achieving this sort of complexity is probably best left to GMP and the like. But a basic subquadratic division based on Burnikel and Ziegler's 'Fast Recursive Division' paper seems within reach---this would give division and radix conversion essentially the same complexity as Karatsuba multiplication. Mark

On 10/16/07, Mark Dickinson <dickinsm@gmail.com> wrote:
Radix conversion can be done in O(n log**2 n) time using a divide and conquer algorithm.
Isn't there a log(log n) missing here?
Possibly, but who's counting? :-)
In any case, it seems to me that achieving this sort of complexity is probably best left to GMP and the like. But a basic subquadratic division based on Burnikel and Ziegler's 'Fast Recursive Division' paper seems within reach---this would give division and radix conversion essentially the same complexity as Karatsuba multiplication.
I agree. A basic subquadratic radix conversion algorithm isn't much more complex than the existing quadratic code. I just whipped together a Python prototype and it's only 15 lines. The quadratic algorithm is basically a divide-and-conquer algorithm, too, except it's the bad kind that breaks the input into pieces of size O(1) and size O(n) instead of pieces of size O(n/2). :-) (where n is number of digits) -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises LLC
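This is not Daniel's actual prototype, but a sketch of how such a short divide-and-conquer routine might look for the int -> decimal-string direction (the base-case threshold and the power-of-ten bookkeeping are arbitrary choices here):

def int_to_dec_str(n):
    # Convert a non-negative int to its decimal string by splitting it around
    # 10**d with 10**d <= n < 10**(2*d), then recursing on both halves.
    if n < 10 ** 18:                   # small enough: the builtin conversion is fine
        return str(n)
    d = 1
    while (10 ** d) ** 2 <= n:         # a real version would cache these powers of ten
        d *= 2
    hi, lo = divmod(n, 10 ** d)
    return int_to_dec_str(hi) + int_to_dec_str(lo).rjust(d, '0')

print int_to_dec_str(7 ** 100) == str(7 ** 100)   # True

As Mark notes above, the big divmod in the middle is the expensive step, so the asymptotic win depends on a subquadratic division underneath; with a plain quadratic division the gain is mainly in the constant factor.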

2007/10/16, Daniel Stutzbach <daniel.stutzbach@gmail.com>:
I agree. A basic subquadratic radix conversion algorithm isn't much more complex than the existing quadratic code. I just whipped together a Python prototype and it's only 15 lines.
Do you have a patch for decimal.py on current trunk? Don't be afraid of submitting it to the tracker and assigning it to me... Thank you! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On 10/19/07, Facundo Batista <facundobatista@gmail.com> wrote:
2007/10/16, Daniel Stutzbach <daniel.stutzbach@gmail.com>:
I agree. A basic subquadratic radix conversion algorithm isn't much more complex than the existing quadratic code. I just whipped together a Python prototype and it's only 15 lines.
Do you have a patch for decimal.py on current trunk?
I don't, though I could make one. However, currently the int -> Decimal conversion takes place in C via str(). Here's the relevant line from decimal.py:

self._int = tuple(map(int, str(abs(value))))

Doing some simple experiments, a Python subquadratic routine doesn't pay off much until around 25,000 decimal digits. On the plus side, the extra overhead of the additional routine is small and I didn't observe a noticeable performance penalty for small inputs (my routine calls str() as a base case for values less than 2**31-1).

So... would the community rather have a Python patch for decimal.py, or a patch for subquadratic versions of long_format and PyLong_FromString in longobject.c? The latter will be faster and will speed up some non-Decimal operations as well. -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises LLC

2007/10/15, Mateusz Rukowicz <mateusz.rukowicz@vp.pl>:
I've been working on the C decimal project during GSoC 2006. After a year of idling (I had an extremely busy first year at university, but well, most of us are extremely busy), I decided that I will handle further development [...]
Welcome back, :)
[...] the chances of merging C Decimal with the main tree are much lower than a year ago, so I would like to know if there is still interest in a C version of Decimal. If so, should I write a PEP, or just code and 'we'll see later'?
First of all you need to address some issues raised by people here after you finished your work last year. I remember Raymond Hettinger's comments, but maybe there were others. After that, you should update the C version to comply with the latest version of the spec. Now that we have more prefixes in Py3k (0b110101, for example), we can push to have something like 0d1.34. Or even make 1.34 decimal by default, and have 0f1.34 if you want binary floating point. Or make that behaviour selectable system-wide somehow. What do people think is the future here?
I've made a little benchmark - given a loan amount, interest rate and the number of months it's meant to be paid off in, find the minimal monthly loan cost [...]
I will prepare, just for decimal.py, a benchmark that is a mix of all operations and usage (common operations like add, less-used ones like log10, context switching, exceptions raised, etc.). You can use it, if you want, to also measure the difference in speed from Py to C. Note, however, that you first need to update it for the latest spec. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Facundo Batista wrote:
2007/10/15, Mateusz Rukowicz <mateusz.rukowicz@vp.pl>:
I've been working on the C decimal project during GSoC 2006. After a year of idling (I had an extremely busy first year at university, but well, most of us are extremely busy), I decided that I will handle further development [...]
Welcome back, :)
[...] the chances of merging C Decimal with the main tree are much lower than a year ago, so I would like to know if there is still interest in a C version of Decimal. If so, should I write a PEP, or just code and 'we'll see later'?
First of all you need to address some issues raised by people here after you finished your work last year. I remember Raymond Hettinger's comments, but maybe there were others.
After that, you should update the C version to comply with the latest version of the spec.
Now that we have more prefixes in Py3k (0b110101, for example), we can push to have something like 0d1.34. Or even make 1.34 decimal by default, and have 0f1.34 if you want binary floating point. Or make that behaviour selectable system-wide somehow. What do people think is the future here?
I think you should forget any idea of making decimal the default numeric literal type (and anyway would you do that only for literals containing decimal points?). Using a radix notation for literals would, IMHO, be acceptable if you can get the idea accepted (it is, after all, only syntactic sugar with a little memory saving and the transfer of some computation to compile time).
I've made a little benchmark - given a loan amount, interest rate and the number of months it's meant to be paid off in, find the minimal monthly loan cost [...]
I will prepare, just for decimal.py, a benchmark that is a mix of all operations and usage (common operations like add, less-used ones like log10, context switching, exceptions raised, etc.). You can use it, if you want, to also measure the difference in speed from Py to C. Note, however, that you first need to update it for the latest spec.
regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden Sorry, the dog ate my .sigline so I couldn't cat it

Steve Holden wrote:
Using a radix notation for literals would, IMHO, be acceptable if you can get the idea accepted
This would make decimal (or at least a part of it) a required part of the Python core rather than an optional module. There might be some resistance to that. -- Greg

Facundo Batista wrote:
2007/10/15, Mateusz Rukowicz <mateusz.rukowicz@vp.pl>:
Welcome back, :)
Hi :>
[...] the chances of merging C Decimal with the main tree are much lower than a year ago, so I would like to know if there is still interest in a C version of Decimal. If so, should I write a PEP, or just code and 'we'll see later'?
First of all you need to address some issues raised by people here after you finished your work last year. I remember Raymond Hettinger's comments, but maybe there were others.
Yes, I remember Raymond's points about C Decimal, and I keep them in mind. Mainly they were about the implementation being too dependent on Python internals - and, because of that, about code complexity and performance. Fortunately, the code is written in such a way (mainly thanks to Georg Brandl and Jack Diederich, who started the project) that, in my opinion, modifications of the context and other highly Python-dependent parts are pretty easy - these changes are open to discussion (but not yet, though). Complexity - I tried to do my best, but I am aware that there are parts which are hard to understand and/or poorly commented. I will gradually improve that. I searched for other opinions a few weeks ago, but I haven't found anything.
After that, you should update the C version to comply with the latest version of the spec.
That's what I am doing now.
I will prepare, just for decimal.py, a benchmark that is a mix of all operations and usage (common operations like add, less-used ones like log10, context switching, exceptions raised, etc.). You can use it, if you want, to also measure the difference in speed from Py to C. Note, however, that you first need to update it for the latest spec.
Great :>.
participants (8)
- Daniel Stutzbach
- Facundo Batista
- Fredrik Johansson
- Greg Ewing
- Gregory P. Smith
- Mark Dickinson
- Mateusz Rukowicz
- Steve Holden