[Python-Dev] 64 bit units in PyLong

Wed Jul 5 16:18:05 EDT 2017

On Wed, Jul 5, 2017 at 12:05 PM Mark Dickinson <dickinsm at gmail.com> wrote:

> On Mon, Jul 3, 2017 at 5:52 AM, Siyuan Ren <netheril96 at gmail.com> wrote:
> > The current PyLong implementation represents arbitrary precision
> integers in
> > units of 15 or 30 bits. I presume the purpose is to avoid overflow in
> > addition , subtraction and multiplication. But compilers these days offer
> > intrinsics that allow one to access the overflow flag, and to obtain the
> > result of 64 bit multiplication as a 128 bit number. Or at least on
> x86-64,
> > which is the dominant platform.  Any reason why it is not done?
>
> Portability matters, so any use of these intrinsics would likely also
> have to be accompanied by fallback code that doesn't depend on them,
> as well as some buildsystem complexity to figure out whether those
> intrinsics are supported or not. And then the Objects/longobject.c
> would suffer in terms of simplicity and readability, so there would
> have to be some clear gains to offset that. Note that the typical
> Python workload does not involve thousand-digit integers: what would
> matter would be performance of smaller integers, and it seems
> conceivable that 64-bit limbs would speed up those operations simply
> because so many more integers would become single-limb and so there
> would be more opportunities to take fast paths, but there would need
> to be benchmarks demonstrating that.
>
> Oh, and you'd have to rewrite the power algorithm, which currently
> depends on the size of a limb in bytes being a multiple of 5. :-)
>
> --
> Mark
>

When I pushed to get us to adopt 30-bit digits instead of 15-bit digits, I
hoped it could also happen on 32-bit x86 builds as the hardware has fast
32bit multiply -> 64 result support. But it made the configuration more
difficult and IIRC would've increase the memory use of the PyLong type for
single digit numbers on 32-bit platforms so we settled on moving to 30 bits
only for 64-bit platforms (which were obviously going to become the norm).

A reasonable digit size for hardware that supports 128bit results of 64bit
multiplies would be 60 bits (to keep our multiple of 5 bits logic
working).  But I doubt you will see any notable performance gains in
practical applications by doing so.  Sure, numbers in the 1-4 billion range
are slightly less efficient today, but those are not likely to appear in
hot loops in a typical application.  Microbenchmarks alone should not be
used to make this decision.

remember: We had native integer support in Python 1 & 2 via the old PyInt
type. In Python 3 we ditched that in favor of PyLong everywhere. This was a
performance hit for the sake of proper high level language simplicity.
We've already regained all of that since then in other areas of the
interpreter.

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20170705/efefda72/attachment.html>