How to get decimal form of largest known prime?

David Fraser davidf at sjsoft.com
Sun Jun 13 08:46:35 CEST 2004


Tim Peters wrote:
> [Grégoire Dooms]
> 
>>If speed is the matter, go for C:
>>
>>with the GNU MP library use
>>void mpz_pow_ui (mpz_t rop, mpz_t base, unsigned long int exp) and
>>char * mpz_get_str (char *str, int base, mpz_t op)
> 
> 
> There are some GMP wrappers for Python.  Using Marc-Andre Lemburg's mxNumber
> wrapper (which is maximally convenient for me on Windows, since it includes
> a precompiled-for-Windows GMP), the OP's task is spelled like so:
> 
> 
>>>>from mx.Number import *
>>>>Integer(2)**24036583 - 1
> 
> 
> That does call mpz_pow_ui() and mpz_get_str() (with base=10) under the
> covers.
> 
> Since I typed that in, the process has accumulated 59 CPU minutes, with no
> output yet.
> 
> Curiously,
> 
> 
>>>>x = Integer(2)**24036583 - 1
> 
> 
> consumes 112 CPU seconds on its own, while the straight Python
> 
> 
>>>>x = 2**24036583 - 1
> 
> 
> consumes only 2 CPU seconds -- so Python is 50+ times faster than this GMP
> for the computational part.  Python's Karatsuba multiplication gets enormous
> benefit out of special-casing zeroes in this particular case (all the
> squarings when computing the power see an input whose least-significant half
> is entirely zero bits, so huge mounds of needless work get skipped).
> 
> The fastest pure-Python solution I've found consumed about 20 CPU minutes to
> do the whole job, so is at least 3X as fast as the GMP here (which is still
> running as I type this, over an hour of CPU time and climbing).  For that, I
> wrote my own multiplication and exponentiation routines in Python, using
> digits in base 10**5000.  Computing in a form of decimal from the start
> makes "conversion to decimal" a speedy operation.
> 
> 
>>I did a small contest about the decimal repr of 2**1000000 a a couple
>>years ago. A few languages were compared on a time-to-result basis.
>>Python won with 7 minutes (10 sec devel, 6:50 comput.) and C + GMP was
>>second: 15 min devel(including reading the doc) + a few sec of comput. I
>>bet you can expect a 30-100 fold speedup using C + GMP compared to python
>>-c 'print 2**24036583 - 1'
> 
> 
> On the box I'm using here, the straight Python 2.3.4:
> 
> 
>>>>2**1000000
> 
> 
> consumed 80 CPU seconds (which includes decimal display), and the mxNumber
> version:
> 
> 
>>>>from mx.Number import *
>>>>Integer(2)**1000000
> 
> 
> consumed 60 CPU seconds.  I expect Python 2.3.4 is faster at this stuff than
> the version you used (in particular, recent Pythons use base 10000
> internally when converting to decimal, instead of base 10; it's still
> quadratic-time, but the constant factor was slashed).  The box is a 3.2GHz
> P4, which is probably also faster than the box you used.  It could also be
> that the version of GMP shipped with mxNumber isn't optimally compiled for
> this box, and/or getting old.
> 
> Lessons to take home <wink>:
> 
> + The speed of bigint operations can be data-dependent in implementation-
>   specific ways (Python's native pow() ran circles around GMP's for this
>   particular data).
> 
> + That changes over time (recent Pythons are much faster at some of these
>   tasks than the one you used).
> 
> + A better algorithm is always the better answer (computing in decimal
>   from the start allows pure Python to run circles around GMP computing
>   in binary then forced to do a non-trivial 24-million bit base
>   conversion).

Is it possibile to have a better algorithm for binary to base 10 
conversion, or is that limited to quadratic time? Any links to 
papers/discussions on this?

David



More information about the Python-list mailing list