Challenge: optimizing isqrt
stephen_tucker at sil.org
Wed Nov 26 08:57:48 CET 2014
Another _possible_ performance improvement that is staring us in the face
is that 2*b could be replaced with b<<1. Given that b+b (an earlier
suggestion of mine) involves two table look-ups for b, whereas b<<1 only
involves one, it seems that the scope here for improvement is significant.
By the way, I hope this post is not "top-posted" as my previous one was.
Apologies for that - I am new to this kind of thing.
On Wed, Nov 26, 2014 at 12:46 AM, Dave Angel <davea at davea.name> wrote:
> On 11/25/2014 02:31 PM, Serhiy Storchaka wrote:
>> п'ятниця, 21-лис-2014 08:15:57 ви написали:
>>> This looks very good indeed. As a matter of interest, is there any
>>> particular reason you have used 2*b instead of b+b? Might b+b be faster
>>> than b*2?
>> Yes, it is slightly faster, but the effect is indiscernible in total
>> time. But
>> there is not harm to use b+b.
>> Also, in various lines, you use //2. Would >>1 be quicker? On reflection,
>>> perhaps you have had to use //2 because >>1 cannot be used in those
>> I thought this effect would be insignificant too. But actually it is
>> (about 10% for some input). Thanks, this optimization is worth to be
> Unfortunately, for many values, the version of the function with >>1 is
> slower. It's only when the argument is bigger than 10**40 that it's as
> much as 1% faster. But it's true that for really large values, it's
> A quick test shows that 3.3 is about 20% faster for both these functions
> than 2.7.
> My oversight earlier was assuming a native type. Once you get into
> "longs" which aren't supported by the processor, the shift will likely
> become much faster than divide.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Python-list