
Hi,
as some of you know, recently I've released an arbitrary precision C library for decimal arithmetic together with a Python module:
http://www.bytereef.org/libmpdec.html http://www.bytereef.org/fastdec.html
Both the library and the module have been tested extensively. Fastdec currently differs from decimal.py in a couple of ways that could be fixed. The license is AGPL, but if there is interest in integrating it into Python I'd release it under a Python-compatible license.
There have been several approaches towards getting C decimal arithmetic into Python:
http://bugs.python.org/issue2486
Fastdec follows Raymond Hettinger's suggestion to provide wrappers for an independent C implementation. Arguments in favour of fastdec are:
* Complete implementation of Mike Cowlishaw's specification
* C library can be tested independently
* Redundant arithmetic module for tests against decimal.py
* Faster than Java BigDecimal
* Compares relatively well in speed against gmpy
To be clear, I would not want to _replace_ decimal.py. Rather I'd like to see a cdecimal module alongside decimal.py.
I know that ultimately there should be a PEP for module inclusions. The purpose of this post is to gauge interest. Specifically:
1. Are you generally in favour of a C decimal module in Python?
2. Would fastdec - after achieving full decimal.py compatibility - be a serious candidate?
3. Could I use this list to settle a couple of questions, or would perhaps a Python developer be willing to work with me to make it compatible? I'm asking this to avoid doing work that would not find acceptance afterwards.
Thanks,
Stefan Krah

Shouldn't this be on python-ideas?
S
On Oct 20, 2009, at 9:15 AM, Stefan Krah wrote:
[...]

ssteinerX@gmail.com wrote:
Shouldn't this be on python-ideas?
I found previous discussions about "Decimal in C" on python-dev, that's why I used this list.
Stefan Krah

On Oct 20, 2009, at 9:43 AM, Stefan Krah wrote:
ssteinerX@gmail.com wrote:
Shouldn't this be on python-ideas?
I found previous discussions about "Decimal in C" on python-dev, that's why I used this list.
python-ideas:
This list is to contain discussion of speculative language ideas for Python for possible inclusion into the language. If an idea gains traction it can then be discussed and honed to the point of becoming a solid proposal to put to either python-dev or python-3000 as appropriate.
I guess it's a fine line...and matter of opinion. No worries.
S

On Tue, 20 Oct 2009 at 09:55, ssteinerX@gmail.com wrote:
On Oct 20, 2009, at 9:43 AM, Stefan Krah wrote:
ssteinerX@gmail.com wrote:
Shouldn't this be on python-ideas?
I found previous discussions about "Decimal in C" on python-dev, that's why I used this list.
python-ideas:
This list is to contain discussion of speculative language ideas for Python for possible inclusion into the language. If an idea gains traction it can then be discussed and honed to the point of becoming a solid proposal to put to either python-dev or python-3000 as appropriate.
I guess it's a fine line...and matter of opinion. No worries.
In this case it isn't a language idea under discussion, but the possible inclusion of an implementation of an idea (and moreover an idea that is already included in Python in another, less efficient, form).
(I'm expecting that both Mark Dickinson and Raymond Hettinger will comment on this thread eventually...)
--David (RDM)

2009/10/20 Stefan Krah stefan-usenet@bytereef.org:
[...]
To be clear, I would not want to _replace_ decimal.py. Rather I'd like to see a cdecimal module alongside decimal.py.
Why? If it's 100% compatible with decimal.py, just replace it. All the user should see is improved speed. Let's not do another pickle/cpickle.
I know that ultimately there should be a PEP for module inclusions. The purpose of this post is to gauge interest. Specifically:
- Are you generally in favour of a C decimal module in Python?
Yes, although I have to admit my interest is fairly theoretical.
- Would fastdec - after achieving full decimal.py compatibility - be a serious candidate?
I don't see why not, if it was 100% compatible with decimal.py
- Could I use this list to settle a couple of questions, or would perhaps a Python developer be willing to work with me to make it compatible? I'm asking this to avoid doing work that would not find acceptance afterwards.
I can't help much here, but I'd prefer to see discussions on python-dev so I'm +1 on keeping discussions here.
Waiting eagerly to hear the experts' comments...
Paul.

On Tue, Oct 20, 2009 at 3:00 PM, Paul Moore p.f.moore@gmail.com wrote:
2009/10/20 Stefan Krah stefan-usenet@bytereef.org:
[...]
To be clear, I would not want to _replace_ decimal.py. Rather I'd like to see a cdecimal module alongside decimal.py.
Why? If it's 100% compatible with decimal.py, just replace it. All the user should see is improved speed. Let's not do another pickle/cpickle.
For example, other Python implementations might decide to use the pure-Python version for as long as they have no built-in version of their own. Pure-Python versions are usually also better targets for a JIT than mixed versions. C-level versions also usually have more bugs (just statistics), so some people might prefer the pure-Python version.
In general - some people have some reasons.
Cheers, fijal

Maciej Fijalkowski wrote:
For example, other Python implementations might decide to use the pure-Python version for as long as they have no built-in version of their own. Pure-Python versions are usually also better targets for a JIT than mixed versions. C-level versions also usually have more bugs (just statistics), so some people might prefer the pure-Python version.
In general - some people have some reasons.
Although nobody has broken "sys.modules['_decimal'] = 0", so deliberately turning off optimisations is pretty easy if you really don't want them.
There's a reason we moved to implicit import of optimised versions in Py3k - we're unlikely to revert to the old way of doing things.
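The mechanism discussed here can be sketched roughly as follows. This is a minimal illustration, not code from decimal.py or fastdec: the helper `load_with_fallback` and the module name `_hypothetical_fastdec` are invented for the example, and the standard `decimal` module merely stands in for the pure-Python fallback. The sketch stores `None` in `sys.modules`, which the import system treats as a blocked import.

```python
import importlib
import sys


def load_with_fallback(fast_name, pure_name):
    """Return the accelerated module if it can be imported, else the fallback."""
    try:
        return importlib.import_module(fast_name)
    except ImportError:
        return importlib.import_module(pure_name)


# Hypothetical accelerator name; no such module is assumed to exist.
FAST = "_hypothetical_fastdec"

# A None entry in sys.modules makes any later import of that name raise
# ImportError, which deliberately switches the optimisation off.
sys.modules[FAST] = None

impl = load_with_fallback(FAST, "decimal")
print(impl.__name__)
```

With this layout the user always imports one public name, and the choice between the optimised and the pure-Python code is made once, at import time.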
As far as decimal.py in particular goes, there are significant maintenance gains in keeping a lot of the non-performance critical context management code in pure Python. So we're likely to wait and see how much speed Mark can wring out of a simple C decimal coefficient object (that other implementations can also fairly easily provide natively) before looking seriously at a wholesale replacement of the module.
Cheers, Nick.

On Tue, Oct 20, 2009 at 2:15 PM, Stefan Krah stefan-usenet@bytereef.org wrote:
- Are you generally in favour of a C decimal module in Python?
I'm certainly interested in the general idea; whether I'd be in favour of replacing decimal.py with a particular C version would depend on a lot of factors, with code quality, interchangeability with the current decimal module, and maintainability by Python core developers high on the list. There's also the question of what IronPython and Jython would do.
- Would fastdec - after achieving full decimal.py compatibility - be a serious candidate?
Definitely. As far as I know it's the only real candidate for a full C version of decimal right now. Other possibilities that I'm aware of:
* I think Raymond Hettinger is working on a C version of decimal. I'm not sure what its current status is. Raymond?
* There's a partially complete rewrite of decimal in C in the sandbox, dating from the Need for Speed sprint in 2006:
http://svn.python.org/view/sandbox/trunk/decimal-c/
Last time I looked at this it wasn't up to date with the decimal specification: I'm not sure that functions like exp, log and log10 are currently working. Georg Brandl might know better than I do.
* Wrapping the decNumber library directly would make some sense, but I think licensing issues rule this out.
- Could I use this list to settle a couple of questions, or would perhaps a Python developer be willing to work with me to make it compatible? I'm asking this to avoid doing work that would not find acceptance afterwards.
I don't see why not. Working closely with one of the core developers on this sounds like a good idea, as well: if the code were to go into Python core, at least one (and preferably more than one) core dev should be familiar enough with it for maintenance. (BTW, I think that *if* fastdec went in it's likely you'd be granted commit privileges at some point.)
It's difficult to gauge the likelihood of eventual acceptance in advance, though. Maybe writing a PEP would be an efficient use of time in this regard? There are certainly some open issues (e.g., what to do with the existing Python module; what should other Python implementations do).
I think my biggest concern is maintenance: we'd be replacing 8500 lines of Python code in a single file, that several of the current core developers understand well, with 30000 (Stefan, is that about accurate?) lines of C in several files, that presumably none of the current core devs is familiar with right now. What happens when (e.g.,) the number-theoretic transform needs updating for one reason or another? Stefan, do you see yourself having time to spend on maintenance of this code for the foreseeable future?
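For readers unfamiliar with the term, the following is a textbook sketch of multiplication via a number-theoretic transform. It is not taken from libmpdec and makes no claim about how that library implements its transform: the single prime `MOD`, the root `ROOT`, the one-decimal-digit-per-coefficient layout and the function names `ntt` and `multiply` are all assumptions chosen to keep the example short.

```python
MOD = 998244353  # prime with MOD - 1 divisible by 2**23
ROOT = 3         # a primitive root modulo MOD


def ntt(a, invert=False):
    """Transform list a in place; len(a) must be a power of two."""
    n = len(a)
    # Reorder the entries into bit-reversed index order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Iterative butterflies over blocks of growing length.
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % MOD
                a[k] = (u + v) % MOD
                a[k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        for i in range(n):
            a[i] = a[i] * n_inv % MOD


def multiply(x, y):
    """Multiply two non-negative ints by convolving their decimal digits."""
    xs = [int(d) for d in reversed(str(x))]
    ys = [int(d) for d in reversed(str(y))]
    size = 1
    while size < len(xs) + len(ys):
        size <<= 1
    fa = xs + [0] * (size - len(xs))
    fb = ys + [0] * (size - len(ys))
    ntt(fa)
    ntt(fb)
    prod = [p * q % MOD for p, q in zip(fa, fb)]
    ntt(prod, invert=True)
    # Summing with place values also resolves the carries.
    return sum(c * 10**i for i, c in enumerate(prod))
```

A production implementation would pack many digits into each machine word and guard against coefficient overflow, which is where the maintenance concern about such code comes from.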
BTW, does anyone know the current SLOC count for py3k?
Mark

On Wed, Oct 21, 2009 at 11:37 AM, Nick Coghlan ncoghlan@gmail.com wrote:
As far as decimal.py in particular goes, there are significant maintenance gains in keeping a lot of the non-performance critical context management code in pure Python. So we're likely to wait and see how much speed Mark can wring out of a simple C decimal coefficient object
No need to wait for this. :-) I don't really expect to get much speed gain at all in normal, low-precision (= precisions less than 100 digits, say) use. Even doubling the speed would be way too much to hope for here.
The only real gain from a decimal integer coefficient would be fixing the asymptotics for high-precision calculations.
Mark

Mark Dickinson <dickinsm <at> gmail.com> writes:
There are certainly some open issues (e.g., what to do with the existing Python module; what should other Python implementations do).
The existing module could be kept as a fallback. Also, the test suite should be careful to test both implementations (like what is currently done for the io module).
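One way to run the same tests against both implementations is a shared mixin, sketched below. This is an illustration under assumptions, not the actual test suite: the class names and `run_both` are invented, and `_pydecimal` is the private pure-Python copy that current CPython releases happen to ship, with a fallback to `decimal` if it is missing.

```python
import decimal
import unittest

try:
    # Assumption: private pure-Python copy shipped by recent CPython.
    import _pydecimal as pure_decimal
except ImportError:
    pure_decimal = decimal  # fall back so the sketch still runs


class ArithmeticTests:
    """Tests written once; subclasses choose the implementation."""

    impl = None

    def test_add_is_exact(self):
        D = self.impl.Decimal
        self.assertEqual(D("0.1") + D("0.2"), D("0.3"))

    def test_division_respects_precision(self):
        D = self.impl.Decimal
        with self.impl.localcontext() as ctx:
            ctx.prec = 5
            self.assertEqual(str(D(1) / D(3)), "0.33333")


class DefaultDecimalTests(ArithmeticTests, unittest.TestCase):
    impl = decimal


class PureDecimalTests(ArithmeticTests, unittest.TestCase):
    impl = pure_decimal


def run_both():
    """Run every shared test once per implementation."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(DefaultDecimalTests))
    suite.addTests(loader.loadTestsFromTestCase(PureDecimalTests))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_both()` reports a failure separately for each implementation, so a divergence between the two shows up as a failing test in only one of the classes.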
BTW, does anyone know the current SLOC count for py3k?
Here you are, generated using David A. Wheeler's 'SLOCCount':
SLOC    Directory  SLOC-by-Language (Sorted)
261496  Lib        python=261451,sh=45
186769  Modules    ansic=173279,asm=9561,sh=3929
53258   Objects    ansic=53258
40257   Python     ansic=40229,python=28
27220   Tools      python=27117,ansic=67,sh=36
18892   Demo       python=18511,ansic=377,sh=4
9168    PC         ansic=8465,python=703
5840    Include    ansic=5840
5799    Parser     ansic=3914,python=1885
3485    Misc       lisp=2948,python=242,sh=185,ansic=110
3101    Doc        python=2306,ansic=795
3030    Mac        python=2138,objc=775,sh=109,ansic=8
1666    top_dir    python=1140,ansic=286,sh=240
349     PCbuild    python=279,ansic=70
337     build      ansic=295,python=42
0       Grammar    (none)
Totals grouped by language (dominant language first):
python:  315842 (50.89%)
ansic:   286993 (46.24%)
asm:       9561 (1.54%)
sh:        4548 (0.73%)
lisp:      2948 (0.47%)
objc:       775 (0.12%)
Total Physical Source Lines of Code (SLOC) = 620,667

On Wed, Oct 21, 2009 at 4:05 PM, Antoine Pitrou solipsis@pitrou.net wrote:
Mark Dickinson <dickinsm <at> gmail.com> writes:
BTW, does anyone know the current SLOC count for py3k?
Here you are, generated using David A. Wheeler's 'SLOCCount': [...]
Thanks, Antoine! With SLOCCount I can revise my earlier numbers, as well: Here's Stefan Krah's mpdecimal, version 0.80:
SLOC    Directory   SLOC-by-Language (Sorted)
21445   top_dir     ansic=21267,sh=105,python=55,asm=18
6238    python      python=6177,java=43,sh=18
1403    tests       ansic=1356,sh=47
476     literature  lisp=476
274     cmd         ansic=274
11      tools       sh=11
0       doc         (none)
Totals grouped by language (dominant language first):
ansic:   22897 (76.71%)
python:   6232 (20.88%)
lisp:      476 (1.59%)
sh:        181 (0.61%)
java:       43 (0.14%)
asm:        18 (0.06%)
Lib/decimal.py:
SLOC    Directory   SLOC-by-Language (Sorted)
2636    tmp         python=2636
Totals grouped by language (dominant language first):
python:   2636 (100.00%)
So it looks like 2636 lines of Python versus 21000-ish lines of C.
Mark

Mark Dickinson schrieb:
- There's a partially complete rewrite of decimal in C in the sandbox, dating from the Need for Speed sprint in 2006:
http://svn.python.org/view/sandbox/trunk/decimal-c/
Last time I looked at this it wasn't up to date with the decimal specification: I'm not sure that functions like exp, log and log10 are currently working. Georg Brandl might know better than I do.
I haven't touched that code since the sprint, but a student named Mateusz Rukowicz worked on it for a past Summer of Code, I think. I never heard of the outcome of that particular project.
Georg

Mark Dickinson dickinsm@gmail.com wrote:
Thanks, Antoine! With SLOCCount I can revise my earlier numbers, as well: Here's Stefan Krah's mpdecimal, version 0.80:
SLOC    Directory   SLOC-by-Language (Sorted)
21445   top_dir     ansic=21267,sh=105,python=55,asm=18
6238    python      python=6177,java=43,sh=18
1403    tests       ansic=1356,sh=47
476     literature  lisp=476
274     cmd         ansic=274
11      tools       sh=11
0       doc         (none)
I would say that the relevant code is less than that: The module code is counted twice (fastdec2.c, fastdec3.c), and sloccount counts the inline functions in the mpdecimal*.h header files. So, after removing fastdec3.c, mpdecimal32.h, mpdecimal32vc.h, mpdecimal64vc.h, I get:
SLOC    Directory   SLOC-by-Language (Sorted)
13702   top_dir     ansic=13524,sh=105,python=55,asm=18
6238    python      python=6177,java=43,sh=18
1403    tests       ansic=1356,sh=47
476     literature  lisp=476
274     cmd         ansic=274
11      tools       sh=11
0       doc         (none)
Totals grouped by language (dominant language first):
ansic:   15154 (68.56%)
python:   6232 (28.19%)
lisp:      476 (2.15%)
sh:        181 (0.82%)
java:       43 (0.19%)
asm:        18 (0.08%)
Therefore, my estimate is between 12660 and 13702 lines, depending on whether the remaining mpdecimal64.h is counted (most inline functions in this header file are signaling wrappers around the quiet functions and are not necessary for the module).
If one takes out all library functions that the module does not use, I'm sure it could be condensed to ~11500 lines.
This is comparable to the expat directory in Modules/:
SLOC    Directory   SLOC-by-Language (Sorted)
11406   expat       ansic=11406
Stefan Krah

Mark Dickinson dickinsm@gmail.com wrote:
- Would fastdec - after achieving full decimal.py compatibility - be a serious candidate?
Definitely. As far as I know it's the only real candidate for a full C version of decimal right now. Other possibilities that I'm aware of:
- I think Raymond Hettinger is working on a C version of decimal. I'm not sure what its current status is. Raymond?
- There's a partially complete rewrite of decimal in C in the sandbox, dating from the Need for Speed sprint in 2006:
- Wrapping the decNumber library directly would make some sense, but I think licensing issues rule this out.
I hope I'm not spreading misinformation, but I see two problems with the use of decNumber as an arbitrary precision library. First, the algorithms are not optimized for large numbers. Then, you have to define the maximum length of the coefficient *at compile time* by setting DECNUMDIGITS.
It's difficult to gauge the likelihood of eventual acceptance in advance, though. Maybe writing a PEP would be an efficient use of time in this regard? There are certainly some open issues (e.g., what to do with the existing Python module; what should other Python implementations do).
Good, I'll do that. I mainly wanted to find out if there is fundamental opposition against C-decimal in the standard library, but this does not seem to be the case.
I think my biggest concern is maintenance: we'd be replacing 8500 lines of Python code in a single file, that several of the current core developers understand well, with 30000 (Stefan, is that about accurate?) lines of C in several files, that presumably none of the current core devs is familiar with right now.
Ok, I think we could say that the sloc-counts are around 2600 lines of Python vs. around 11500-12500 lines of C.
What happens when (e.g.,) the number-theoretic transform needs updating for one reason or another? Stefan, do you see yourself having time to spend on maintenance of this code for the foreseeable future?
Yes. Generally speaking, I expect the code to be low-maintenance. It would be even easier if all major compilers supported exact width C99 types and __uint128_t, but this is wishful thinking.
Stefan Krah

Stefan Krah wrote:
Mark Dickinson wrote:
I think my biggest concern is maintenance: we'd be replacing 8500 lines of Python code in a single file, that several of the current core developers understand well, with 30000 (Stefan, is that about accurate?) lines of C in several files, that presumably none of the current core devs is familiar with right now.
Ok, I think we could say that the sloc-counts are around 2600 lines of Python vs. around 11500-12500 lines of C.
If maintenance is an issue, I would actually propose to compile the existing decimal.py using Cython (works with a few trivial modifications), add a few type decorators at the hot spots to make them really fast, and then use that as both the Python implementation and fast binary module.
Stefan
participants (10)
- Antoine Pitrou
- Georg Brandl
- Maciej Fijalkowski
- Mark Dickinson
- Nick Coghlan
- Paul Moore
- R. David Murray
- ssteinerX@gmail.com
- Stefan Behnel
- Stefan Krah