Working with the set of real numbers
Oscar Benjamin
oscar.j.benjamin at gmail.com
Thu Mar 6 13:27:16 CET 2014
On 5 March 2014 12:57, Dave Angel <davea at davea.name> wrote:
> Oscar Benjamin <oscar.j.benjamin at gmail.com> Wrote in message:
>> On 4 March 2014 23:20, Dave Angel <davea at davea.name> wrote:
>>>
>>> On the assumption that division by 2 is very fast, and that a
>>> general multiply isn't too bad, you could improve on Newton by
>>> observing that the slope is 2.
>>>
>>> err = n - guess * guess
>>> guess += err/2
>>
>> I gues you mean like this:
>>
>> def sqrt(n):
>> err = guess = 1
>> while err > 1e-10:
>> err = n - guess * guess
>> guess += err/2
>> return guess
>>
>> This requires log2(10)*N iterations to get N digits.
>
> No idea how you came up with that, but I see an error in my
> stated algorithm, which does surely penalize it. The slope isn't
> 2, but 2x. So the line should have been
> guess += err/(2*guess)
Ah, now I understand. This is the same algorithm that Chris posted
which in turn is the same as the one that I had previously posted: all
three are just using Newton's method.
Newton's method uses the slope argument to find a root of f(x) i.e. an
x0 such that f(x0) = 0. We start with a guess x[n]. We then find the
value of the function f(x[n]) and slope fp(x[n]) and extrapolating to
the point where f(x) hits the x-axis we find an improved estimate with
x[n+1] = x[n] - f(x[n]) / fp(x[n])
In our case f(x) = x**2 - y and so fp(x) = 2*x which gives
x[n+1] = x[n] - (x[n]**2 - y) / (2*x[n])
and after rearranging we can express this as
x[n+1] = (x[n] + y/x[n]) / 2.
So my loop
while x ** 2 - y > x * eps:
x = (x + y/x) / 2
and Chris' loop:
while abs(guess1-guess2) > epsilon:
guess1 = n/guess2
guess2 = (guess1 + guess2)/2
and now your loop
while err > 1e-10:
err = n - guess * guess
guess += err/(2 * guess)
are all performing the same basic algorithm. The only significant
difference is that mine tests for a relative error condition where as
the other two test for absolute error. This means that it will still
converge to an answer with the correct precision even when the root is
large e.g. sqrt(1e100). The other two are susceptible to infinite
loops in this case.
> Now if you stop the loop after 3 iterations (or use some other
> approach to get a low-precision estimate, then you can calculate
> scale = 1/(2*estimate)
>
> and then for remaining iterations,
> guess += err *scale
Fair enough. That's a reasonable suggestion for improvement. This is
often known as the "modified Newton's method". For lack of a better
reference see here:
http://math.stackexchange.com/questions/645088/modified-newtons-method
Using an approximate slope to avoid division can be a significant
optimisation. This is especially true when using the multidimensional
generalisation of Newton's method since in this case the division is
replaced with solving a system of linear equations (vastly more
expensive). However as noted in that stackexchange answer the use of
an approximate slope prevents quadratic convergence so in the
asymptotic case where we want large numbers of digits it can be much
slower.
Actually it's worse than that since we are guaranteed to converge to
an incorrect answer. The update step x = (x + y/x) / 2 will converge
when x^2 = y but we replace it with x = (x + y/x0)/2 where x0 is near
to the true root. This will converge when x == (x + y/x0)/2 i.e. when
x = y/x0 which is not the correct answer (and can be computed without
a loop in any case). It's possible to fix that by alternating between
steps where we improve our estimate of the slope and steps where we
use that estimate for refinement but I don't think that the advantage
of not using division counts against the loss of quadratic convergence
once we get to more than 10s of digits.
You can compare them yourself with this code:
def sqrt1(y, prec=1000):
'''Solve x**2 = y'''
assert y > 0
eps = D(10) ** -(prec + 5)
x = D(y)
with localcontext() as ctx:
ctx.prec = prec + 10
while abs(x ** 2 - y) > x * eps:
x = (x + y/x) / 2
return x
def sqrt2(y, prec=1000):
'''Solve x**2 = y'''
assert y > 0
eps = D(10) ** -(prec + 5)
x = D(y)
with localcontext() as ctx:
ctx.prec = prec + 10
for n in range(7):
x = (x + y/x) / 2
x0r = 1 / x
while abs(x ** 2 - y) > x * eps:
err = x**2 - y
err2 = x - x0r * y
x = (x + x0r * y) / 2
return x
sqrt2(2) goes into an infinite loop at an error level of ~1e-100 (the
exact point where this happens depends on how many warmup iterations
it uses).
[snip]
>>
>> This method works out much slower than Newton with division at 10000
>> digits: 40s (based on a single trial) vs 80ms (timeit result).
>
> Well considering you did not special-case the divide by 2, I'm not
> surprised it's slower.
Multiplying by D('0.5') instead of dividing by 2 makes no measurable
difference to the timings. I don't know whether that's because it's
already special-cased by Decimal or just that the performance costs
don't match up in the same way for the multiprecision algorithms (when
the divisor is a single digit number).
Oscar
More information about the Python-list
mailing list