Challenge: optimizing isqrt

Chris Angelico rosuav at gmail.com
Sat Nov 1 03:12:54 CET 2014


On Sat, Nov 1, 2014 at 12:29 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Q1: What is the largest value of M, such that
>
> all(isqrt(i) == int(math.sqrt(n)) for n in range(M))
>
> returns True?
>
> I have done a brute force test, and in nine hours confirmed that M is at
> least 7627926244. That took nine hours, and I expect that a brute force
> test of every int representable as a float would take *months* of
> processing time.

Here's something that should be faster than pure brute-force:

import math
for i in range(2**34): # 2**34 should take about nine hours
    sqrt = int(math.sqrt(i))
    if sqrt * sqrt > i:
        print("Too high at %d" % i)
        break
    if (sqrt+1) * (sqrt+1) <= i:
        print("Too low at %d" % i)
        break

There are two ways int(math.sqrt(i)) can be wrong: it can return a
number that's too high, or one that's too low. If it's too high,
squaring it (btw, int*int seems to be faster than int**2) will produce
a number higher than the original. If the result was too low, then the
next number up would have been within the correct bounds. There's no
need to actually calculate isqrt here.

I was able to probe up to 2**29 in seven minutes on my computer
(allocating one core, CPython 3.5 from trunk). If it's linear, going
as far as 2**34 should take no more than the nine hours you spent. I
expect time will be a bit worse than linear (working with small
numbers is marginally faster than working with big numbers), but if
it's at least reasonably close, probing as far as 2**53 would take
81555 days. So, uhh, a brute force by your method is going to take a
lot more than months. (And 2**53 isn't technically where ints stop
being representable, but I'd use that as the final cut-off, as it's
where *all* ints stop being representable. In any case, there's a
failure at 2**54-1, so there's no need to go past there.)

Hmm. The speed difference between your brute force and mine isn't all
that much. I'm estimating maybe doing at best twice as much in the
same time. So you're probably already using something similar to the
above, and this isn't adding much to the discussion. :(

ChrisA



More information about the Python-list mailing list