How to make Python run as fast (or faster) than Julia
Jack Fearnley
jack at alcor.concordia.ca
Thu Feb 22 14:55:08 EST 2018
On Fri, 23 Feb 2018 06:07:44 +1100, Chris Angelico wrote:
> On Fri, Feb 23, 2018 at 3:00 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Thu, 22 Feb 2018 12:03:09 +0000, bartc wrote:
>>> Here's another speed-up I found myself, although it was only 50 times
>>> faster, not 17,000: just write the code in C, and call it via
>>> os.system("fib.exe").
>>
>> Did you include the time taken to capture the text output from stdout,
>> parse it, and convert to an int?
>
> Relatively trivial compared to the overhead of invoking a subprocess
> THROUGH A SHELL. On Windows, that has to invoke cmd.exe; on Unix-like
> systems, most likely /bin/sh. And then that has to search the system
> path (maybe that doesn't happen on his Windows system, as presumably
> it's finding the executable on the cwd check). That's file system
> operations, and a lot of them.
>
>> Did your C code support numbers greater than 2**64? My fibonacci
>> function can calculate the 10,000th Fibonacci number in a blink of the
>> eye:
>>
>> py> fibi(10000)
>> 336447648764317832666216120051075...976171121233066073310059947366875
>>
>> Output truncated for brevity, it is a 2090-digit number.
>>
>> Admittedly it did take about a minute and a half to generate all 208988
>> digits of the one millionth Fibonacci number, but the performance is
>> fast enough for my needs.
>
> Yeah, but if you stick that into a variable, you can separately time the
> calculation from the stringification. Using the exact code from the
> page:
>
> def fib_seq(n):
> if n < 2:
> return n
> a,b = 1,0 for i in range(n-1):
> a,b = a+b,a
> return a
>
>>>> time.time(); x = fib_seq(1000000); time.time()
> 1519325946.7880309 1519325953.773382
>
> That's seven seconds to calculate the number. How big a number is it?
>
>>>> x.bit_length()
> 694241
>>>> math.log10(x)
> 208987.29076497658
>
> Both take effectively zero time. Dividing that number by 10 that many
> times is what takes all the processing. (Stringifying a number basically
> consists of div-mod to trim off the last digit, keep going until you run
> out of number. Lots of division.)
>
> So actually calculating the millionth Fibonacci number can be done
> fairly quickly; figuring out its first digits would be hard, but we
> don't actually need that. Though we could probably get a decent
> estimate:
>
>>>> log = math.log10(x)
>>>> print(10**(log - int(log)), "e +", int(log))
> 1.9532821287892859 e + 208987
>
> I wouldn't trust all of those digits, but that's a fast way to get an
> idea of the scale of a number. Doing that rather than printing the whole
> number means you actually see the performance of *calculation*.
I realize that this thread is about benchmarking and not really about
generating fibonacci numbers, but I hope nobody is using this code to
generate them on a 'production' basis,
Fibonacci numbers, any linearly recursive sequence for that matter, can
be generated in log time.
GP/Pari on my Intel I7 computes fibonacci(100000) in less than 1 ms,
fibonacci(1000000) in 5ms, fibonacci(10000000) in 59 ms and finally
fibonacci(100000000) in 733 ms.
This would obviously be slower in Python but by a constant factor.
Best regards,
Jack Fearnley
>
> ChrisA
More information about the Python-list
mailing list