
(Sent to Carl only by mistake; I'm still getting the hang of this newfangled email thing...)

Carl said divmod can in principle be done in O(multiplication log* n), which for large numbers can be O(n log n). I don't know whether py*'s implementation does this. How do you determine where the bottlenecks are?
For what it's worth, the ratio (time for str(long)) / (time for recursive algorithm) does decrease steadily with increasing input length, but not like n^2 / (n log n). Perhaps we need to implement a faster divmod? Can I assume that bit_length() is O(1)?

I checked Knuth last night. He doesn't have anything to say in the main text, but in the exercises (II.4.4) he has:

  14. [M27] (A. Schönhage.) The text's method of converting multiple-precision integers requires an execution time of order n^2 to convert an n-place integer, when n is large. Show that it is possible to convert n-digit decimal integers into binary notation in O(M(n) log n) steps, where M(n) is an upper bound on the number of steps needed to multiply n-bit binary numbers that satisfies the “smoothness condition” M(2n) >= 2M(n).

  15. [M47] Can the upper bound on the time to convert large integers, given in exercise 14, be substantially lowered?

which fits my intuition. I'm fairly sure that my (and bokr's) algorithm is O(M(n) log n); a sketch is included after the forwarded message below. It also suggests I'm wrong about a linear-time algorithm (the question doesn't state either way, but given its M47 hardness I'm probably not going to whip it up tonight :). Lloyd Allison observed that changing any one bit can affect every digit, which makes beating n log n less likely.

One of the main reasons I use python is that it lets me concentrate on the higher-level algorithms (like haskell), but it is pragmatic about the way we tend to write programs (unlike haskell :). I doubt I could have written that algorithm in C before breakfast (as I did for the python version).

But to the main point: is it fair for people to compare code which doesn't get the benefit of pypy? Yes it is, because the majority of code out there today is going to involve C calls. Sure, pypy will lose on those, but that provides an incentive to fix the problem - for example, by implementing a better long-to-string conversion. People are going to write silly benchmarks and they are going to solve problems in silly ways. We should be honest about this in the benchmarks. Don't worry, pypy will do just fine.

On Fri, May 31, 2013 at 12:01:27PM +0200, Carl Friedrich Bolz wrote:
Hi Armin,
I have only glanced at the code, but isn't the right argument of the divmod always a power of two? So it can be replaced by a shift and a mask, giving the right complexity.
No, it's a power of 10, 10^(2^i) in fact. njh
Cheers,
Carl Friedrich
----- End forwarded message -----
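
For concreteness, here is a minimal sketch of the divide-and-conquer conversion I mean (to_decimal is just a name made up for this note; it is illustrative, not pypy's or CPython's actual code). It splits the number around 10^(2^i) with divmod and recurses on the two halves; the O(M(n) log n) bound only shows up if the underlying divmod is itself subquadratic.

    # Sketch only: divide-and-conquer int -> decimal string conversion.
    # Split points are 10^(2^i); each divmod roughly halves the digit count.

    def to_decimal(n):
        """Return str(n) for a non-negative int, built recursively."""
        if n == 0:
            return "0"

        # Precompute 10, 10^2, 10^4, ... by repeated squaring.
        powers = [10]
        while powers[-1] * powers[-1] <= n:
            powers.append(powers[-1] * powers[-1])

        def rec(x, i):
            if i < 0:
                return str(x)                  # single digit left
            hi, lo = divmod(x, powers[i])      # split around 10^(2^i)
            if hi == 0:
                return rec(lo, i - 1)          # suppress leading zeros
            # The low half must be padded to exactly 2^i digits.
            return rec(hi, i - 1) + rec(lo, i - 1).rjust(2 ** i, "0")

        return rec(n, len(powers) - 1)

    # Sanity check against the builtin conversion.
    for k in (0, 9, 12345, 10**50, 2**1000):
        assert to_decimal(k) == str(k)

With a schoolbook (quadratic) division, both this and the builtin str() stay roughly quadratic overall, just with different constants, which is consistent with the ratio above decreasing steadily without following n^2 / (n log n).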

Hi Nathan, On Fri, May 31, 2013 at 5:48 PM, Nathan Hurst <njh@njhurst.com> wrote:
Actually I found an old but still-open bug report for CPython: http://bugs.python.org/issue3451 It contains all of these questions and more. CPython seems mainly blocked by a lack of man-power to review all the reference-counting mess. That seems like a good reason to write the algorithms in PyPy first :-)
Yes, I believe you are right. We should really add these kinds of benchmarks too. It's a trap that is natural to fall into, and we should keep documenting the reasons why PyPy is not 5 times faster than CPython here. A bientôt, Armin.
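
As a concrete (and entirely hypothetical) illustration of such a benchmark, something as simple as timing str() on ever-larger integers already shows the effect; the sizes and repeat counts below are arbitrary:

    import timeit

    for digits in (1000, 10000, 100000):
        n = 10 ** digits - 1    # a number with 'digits' decimal digits
        t = timeit.timeit(lambda: str(n), number=5) / 5
        print("%7d digits: %.4f s per str() call" % (digits, t))

With a quadratic conversion, each tenfold increase in digits costs roughly a hundredfold in time; a subquadratic implementation would grow much more slowly.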
