How to make Python run as fast (or faster) than Julia

Thu Feb 22 14:55:08 EST 2018

On Fri, 23 Feb 2018 06:07:44 +1100, Chris Angelico wrote:

> On Fri, Feb 23, 2018 at 3:00 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Thu, 22 Feb 2018 12:03:09 +0000, bartc wrote:
>>> Here's another speed-up I found myself, although it was only 50 times
>>> faster, not 17,000: just write the code in C, and call it via
>>> os.system("fib.exe").
>>
>> Did you include the time taken to capture the text output from stdout,
>> parse it, and convert to an int?
> 
> Relatively trivial compared to the overhead of invoking a subprocess
> THROUGH A SHELL. On Windows, that has to invoke cmd.exe; on Unix-like
> systems, most likely /bin/sh. And then that has to search the system
> path (maybe that doesn't happen on his Windows system, as presumably
> it's finding the executable on the cwd check). That's file system
> operations, and a lot of them.
> 
>> Did your C code support numbers greater than 2**64? My fibonacci
>> function can calculate the 10,000th Fibonacci number in a blink of the
>> eye:
>>
>> py> fibi(10000)
>> 336447648764317832666216120051075...976171121233066073310059947366875
>>
>> Output truncated for brevity, it is a 2090-digit number.
>>
>> Admittedly it did take about a minute and a half to generate all 208988
>> digits of the one millionth Fibonacci number, but the performance is
>> fast enough for my needs.
> 
> Yeah, but if you stick that into a variable, you can separately time the
> calculation from the stringification. Using the exact code from the
> page:
> 
> def fib_seq(n):
>     if n < 2:
>         return n
>     a,b = 1,0 for i in range(n-1):
>         a,b = a+b,a
>     return a
> 
>>>> time.time(); x = fib_seq(1000000); time.time()
> 1519325946.7880309 1519325953.773382
> 
> That's seven seconds to calculate the number. How big a number is it?
> 
>>>> x.bit_length()
> 694241
>>>> math.log10(x)
> 208987.29076497658
> 
> Both take effectively zero time. Dividing that number by 10 that many
> times is what takes all the processing. (Stringifying a number basically
> consists of div-mod to trim off the last digit, keep going until you run
> out of number. Lots of division.)
> 
> So actually calculating the millionth Fibonacci number can be done
> fairly quickly; figuring out its first digits would be hard, but we
> don't actually need that. Though we could probably get a decent
> estimate:
> 
>>>> log = math.log10(x)
>>>> print(10**(log - int(log)), "e +", int(log))
> 1.9532821287892859 e + 208987
> 
> I wouldn't trust all of those digits, but that's a fast way to get an
> idea of the scale of a number. Doing that rather than printing the whole
> number means you actually see the performance of *calculation*.

I realize that this thread is about benchmarking and not really about 
generating fibonacci numbers, but I hope nobody is using this code to 
generate them on a 'production' basis,

Fibonacci numbers, any linearly recursive sequence for that matter, can 
be generated in log time.

GP/Pari on my Intel I7 computes fibonacci(100000) in less than 1 ms, 
fibonacci(1000000) in 5ms, fibonacci(10000000) in 59 ms and finally 
fibonacci(100000000) in 733 ms.

This would obviously be slower in Python but by a constant factor.

Best regards,
             Jack Fearnley

> 
> ChrisA