[issue7652] Merge C version of decimal into py3k.

Jim Jewett report at bugs.python.org
Wed Mar 7 05:04:03 CET 2012


Jim Jewett <jimjjewett at gmail.com> added the comment:

On Tue, Mar 6, 2012 at 3:07 PM, Stefan Krah

> Jim Jewett <report at bugs.python.org> wrote:
>> (1)  I think this module would benefit greatly from a map explaining
>>      what each file does, and perhaps from some reorganization.

> Just MAP.txt in the top level directory?

That should work.  There may be better names, but I can't think of any just now.

> Amaury suggested moving
> the library into a subdirectory. I'm not sure about that. The
> library would be out of sight, but is that a good thing?

cdecimal certainly needs a subdirectory similar to those for _io,
_ctypes, _multiprocessing, and _sqlite.

It *may* make sense to move some of the subdirectories around.  (On
the other hand, it may not; if the tests in Lib/test/ end up
delegating back, that is probably OK.)

I believe it would be helpful to move non-code (project files, etc) to
separate directories.

Whether you need *additional* subdirectories within _cdecimal to
subcategorize the .c and .h files, I'm not sure -- because I didn't
get in deep enough to know what they should be.  If the categorization
let people focus on the core, that would be helpful, but it wasn't
clear to me which files were part of the exported API and which were
implementation details.  Are there are clear distinctions (type
info/python bindings/basic arithmetic/advanced
algorithms/internal-use-only/???)

>> As best I can yet tell, there are about ~130 files, over a dozen directories,
>> but the only ones that directly affect the implementation are a subset (~33)
>> of the *.c and *h files in Modules/_decimal/ (and not subdirectories).

> Indeed the top level directory contains _decimal.c and all files
> from libmpdec. Almost all files are required.

Would it make sense to integrate only cdecimal, and to treat libmpdec
as an external dependency that (usually) gets updated with each Python
feature release, the way that sqlite is?

> The three subdirectories contain:

>  tests/       ->  library tests
>  python/      ->  extended module tests

I would really expect that to still be under tests, and I would expect
a directory called python to contain code written in python, or at
least python bindings.

>  literature/  ->  pointers to papers and explanations for trickier algorithms.

>> Even files that do affect the implementation, such as mpdecimal.c,
>> also seem to have functions thrown in just for testing small pieces
>> of functionality, such as Newton Division.

> That is correct. They were deliberately added in that place because they rely
> on a couple of inline functions and I have a policy of testing the exact code
> that the original function relies on.

How important is it that these functions be inline?

Would it at least be OK to wrap them in stubs for exporting, so that
the test logic could be places with the others tests?  (I worry that
some tests may stop getting run if someone else modifies the build
process and doesn't notice the unusual location.)

> The alternative is to extract all functions needed, move them to the test
> directory and hope that the code doesn't get stale.

I agree that copying is bad.

I'll trust your judgement on the need for inline.  But given:

    ALWAYS_INLINE int
    mpd_word_digits(mpd_uint_t word)

I don't see anything wrong with exporting:

    int
    _testhelp_mpd_word_digits(mpd_uint_t word) {
        return mpd_word_digits(word);
    }

>> Turkish un/dotted-i problem when lowercasing -- but why is a decimal
>> library even worried about casing?

> "Infinity", "InFinItY", "iNF" are all allowed by the specification.

OK; so is io.c part of the library, or part of the python binding?

Given that this is targeted at 3.3 or later, would it make sense to
either use casefolding, or check the kind?  (If it isn't ASCII, it
can't be the word "INF".)

Are there only a certain number of strings that will ever matter, such
as INF, NAN, and INFINITY, so that a case statement would work?
tolower() with an extra check for the turkish undotted lower case i?
What you have may well be the best compromise, but it bothers me to
see text processing tools redone in a numeric type -- particularly
without knowing why they are needed.

>> (2)  Is assembly allowed?

> I was under the assumption that it is allowed:

I'm honestly not sure, but I think that was one of the reasons
stackless was never integrated.

>> If not, please make it clear that vcdiv64.asm is just an optional speedup,
>> and that the code doesn't rely upon it.

> No code relies on asm. Assembly is only used for the double word mul/divmod
> primitives in typearith.h and the Pentium PRO modular multiplication in
> umodarith.h, and there are ANSI versions for everything.

Good enough, though I would rather see that as a comment near the assembly.

>> (3) Are there parts of this library that provide functionality NOT
>>     in the current decimal library?  If so, this should be at least
>>     documented, and perhaps either removed or exposed.

> Apart from mpsignal.c (see above), there are probably a couple of things
> in the header files like mpd_invroot(). _mpd_qinvroot() from mpdecimal.c
> *is* needed because the square root is calculated in terms of the
> inverse square root.

> Are these (probably) minor instances of additional functionality a
> big problem for you? Because for me it would be a hassle to maintain
> diverging versions of libmpdec and the Python version of libmpdec.

I'm not worried about the header files.  I am worried about what is
exposed to python, but just documenting it (docstrings and the module
.rst) may be OK.

But I'm also worried that there may be fair amounts of code that are
effectively dead after the "remove any names not in decimal.py"
importing trick.  If so, I would at least like that in some sort of
#ifdef, so that people don't spend too much time trying to make sense
of it.

That said, if you plan to maintain an external libmpdec regardless of
what happens, then it makes even more sense to integrate (at most) the
bindings, and to treat libmpdec as an external dependency.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7652>
_______________________________________


More information about the Python-bugs-list mailing list