# strange exponentiation performance

Jeff Davis nospam at nospam.com
Mon Nov 24 07:35:23 CET 2003

```
> I'm taking the liberty of rewriting your functions to stop trying to squash
> as much as they can on each line.  It's important to do that if you want to
> modify the code quickly when experimenting.

Fair enough.

>
> First you rediscovered the "left to right" binary exponentiation algorithm:
>

Interesting.
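
For my own notes, here's the left-to-right idea as I understand it now, as a
minimal recursive sketch (my own names, not the exact function from the
thread): compute b**(e//2) recursively, square that partial result, and
multiply in one more b when the exponent is odd.

def lr_pow(b, e):
    # left-to-right binary method (recursive sketch): square the
    # partial result for b**(e//2), then multiply in one more b
    # when the low bit of the exponent is set
    if e == 0:
        return 1
    half = lr_pow(b, e // 2)
    result = half * half
    if e & 1:
        result *= b
    return result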

<snip>

> When you tried to remove recursion from it, you rediscovered the "right to
> left" binary exponentiation algorithm:

Interesting. I suspected it wasn't a perfect unwrapping of the recursion,
but I didn't take the time to examine it carefully. Classic case of a
wonderful idea I had at 3:30 in the morning :)
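
And the right-to-left version I stumbled into, written out as a sketch
(again my own reconstruction, not the code from the thread): here the base
itself gets squared on every trip, and gets folded into the result whenever
the current low bit of e is set.

def rl_pow(b, e):
    # right-to-left binary method (sketch): square the base every
    # trip through the loop; fold it into the result whenever the
    # low bit of the remaining exponent is set
    result = 1
    while e:
        if e & 1:
            result *= b
        b *= b
        e >>= 1
    return result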

<snip>

> That's close to what Python's builtin ** does, but is missing an important
> speed trick (explained later).  Note that h and f aren't really the *same*
> algorithm!  f squares the current value of b on every trip through the loop.
> h does not.  If you print out all the inputs to all the multiplications,
> you'll see that they're not at all the same.
>

Also interesting.
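
A quick way to convince myself is to log the operands of every multiply in
each method; a sketch with hypothetical instrumented versions (not the
actual f and h from the thread):

def logged_rl(b, e, log):
    # right-to-left: the base b gets squared every trip
    result = 1
    while e:
        if e & 1:
            log.append(('result*b', result, b))
            result *= b
        log.append(('b*b', b, b))
        b *= b
        e >>= 1
    return result

def logged_lr(b, e, log):
    # left-to-right: the running result gets squared, scanning the
    # exponent's bits from the most significant end
    result = 1
    for bit in bin(e)[2:]:
        log.append(('result^2', result, result))
        result *= result
        if bit == '1':
            log.append(('result*b', result, b))
            result *= b
    return result

Both return the same answer, but the two logs record quite different
multiplications along the way.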

<snip>

>
> Your box is faster than mine.  Here under 2.3.2 on Windows:
>

That means I win, right ;)

<snip>

> exponentiation is a very hard problem; Knuth Volume 2 has an extensive
> discussion of this ("Evaluation of Powers"); another algorithm based on
> first factoring the exponent (expressing it as a product of primes) is
> better "on average" than either binary method, but sometimes loses to
> them.

Ahh... I'm on vol. 1. I saw that it would be a while before I made it to
page 100, so I actually loaned out vol. 2. I'll be getting that back now,
I guess :)
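
My naive reading of the factor method, as a sketch (certainly cruder than
what Knuth actually describes): express e as p * q for the smallest prime
factor p, compute (b**p)**q recursively, and for a prime e fall back to
b * b**(e - 1).

def factor_pow(b, e):
    # naive factor method (sketch): if e = p * q, compute (b**p)**q;
    # when e is prime, peel off one factor of b
    if e == 0:
        return 1
    if e == 1:
        return b
    p = 2
    while p * p <= e:
        if e % p == 0:
            return factor_pow(factor_pow(b, p), e // p)
        p += 1
    # e is prime
    return b * factor_pow(b, e - 1)

For e = 15 this computes (b**3)**5, for example, rather than following the
binary pattern of the exponent's bits.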

<snip>

>
> def q(b, e):
>     if e == 0:
>         return 1
>     if e == 1:
>         return b
>     e2, numbits = e, 0
>     while e2:
>         e2 >>= 1
>         numbits += 1
>     assert numbits >= 2
>     result = b
>     mask = 1L << (numbits - 2)
>     for dummy in range(numbits - 1):
>         result *= result
>         if e & mask:
>             result *= b
>         mask >>= 1
>     return result
>

Now I understand what you mean about the right-left and left-right
versions. I only had a vague understanding before.

> That's bound to be a little faster than h on most inputs, because it
> also optimizes away multiplications by 1 (well, unless b happens to be
> 1).  Let's see:

Might as well eliminate the multiplications by 1; for some reason I had
assumed the computer could somehow optimize those out.

<snip>
> Yup, those matter, but not a whale of a lot.  The savings in skipping
> multiplications by 1 is proportional to the number of 0 bits in the
> exponent.  What *didn't* matter is whether it's recursive or iterative.

Yeah, I guess it's always the case: algorithm first, then optimize. I know
the rule, and I follow it when being paid (so as not to waste anyone's
money). When doing something more academic I always feel like I need to
break it down to the simplest operations so I understand what's going
on. To me, recursion hides what's going on to an extent, and I felt like I
didn't entirely understand the algorithm.
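
To make "proportional to the number of 0 bits" concrete for myself, a
little accounting sketch (my own tally, assuming the left-to-right method
that starts from result = b):

def mult_counts(e):
    # left-to-right binary method on exponent e >= 1:
    # (numbits - 1) squarings happen regardless; multiplying by b
    # happens once per 1 bit after the leading one, so skipping the
    # multiplies-by-1 saves exactly one multiply per 0 bit
    numbits = e.bit_length()
    squarings = numbits - 1
    ones = bin(e).count('1')
    with_skip = squarings + (ones - 1)
    without_skip = squarings + (numbits - 1)
    return with_skip, without_skip

For e = 16 (binary 10000) the skip saves four multiplies; for e = 13
(binary 1101) it saves only one.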

<snip>

>> If my algorithm h() is better, why can't ** use a quick test to change
>> algorithms based on inputs? Or is mine better in all cases?
>
> See Knuth <wink>.
>

Uh-oh, another problem that's more difficult than meets the eye. I'll be
back with more questions in a few years I guess ;)

Thanks,
Jeff

```