[pypy-dev] PyPy doesn't make code written in C faster
Bengt Richter
bokr at oz.net
Fri May 31 15:01:03 CEST 2013
On 05/30/2013 06:41 PM Nathan Hurst wrote:
> On Thu, May 30, 2013 at 10:23:17AM +0200, Armin Rigo wrote:
>> Hi all,
>>
>> Some people learn about PyPy, and the first program they try to
>> measure speed with is something like this:
>>
>> def factorial(n):
>> res = 1
>> for i in range(1, n + 1):
>> res *= i
>> return res
>> print factorial(25000)
>>
>> It may not be completely obvious a priori, but this is as bogus as it
>> gets. This is by now only 50% slower in PyPy than in CPython thanks
>> to efforts from various people. The issue is of course that it's an
>> algo which, in CPython or in PyPy, spends most of its time in C code
>> computing with rather large "long" objects. (No, PyPy doesn't contain
>> magic to speed up C code 10 times.) In fact, this program spends more
>> than 2/3rd of its time in the final repr() of the result! Converting
>> a long to base 10 is a quadratic operation.
>
> It doesn't have to be quadratic, it's easy to come up with a splitting
> algorithm:
> def reclongtostr(x):
> if x< 0: return "-"+reclongtostr(-x)
> x = long(x) # expect a long
> min_digits = 9 # fits in 32 bits, there may be a better choice for
> this
> pts = [10**min_digits]
> while pts[-1]< x:
> pts.append(pts[-1]**2)
> pts.pop() # remove first 10**2**i greater than x
> output = []
> def spl(x,i):
> if i< 0: # bottomed out with max_digit sized pieces
> if output or x> 0:
> s = str(x)
> output.append("0"*(min_digits - len(s)) + s) # note
> that this appends in inorder
> else:
> top,bot = divmod(x, pts[i]) # split the number
> spl(top,i-1)
> spl(bot,i-1)
> spl(x,len(pts)-1)
> # strip leading zeros, we can probably do this more elegantly
> while output[0][0] == "0":
> output[0] = output[0][1:]
> return ''.join(output)
>
> which benchmarks factorial(25000) like this:
>
> import time
> s = time.time()
> x = factorial(25000)
> print "factorial", time.time() - s
> sx = str(x) # give pypy a chance to compile
> s = time.time()
> sx = str(x)
> print "Str time", time.time() - s
> rsx = reclongtostr(x) # give pypy a chance to compile
> s = time.time()
> rsx = reclongtostr(x)
> print "my rec time", time.time() - s
> print "equal string:", sx == rsx
>
> factorial 0.182402133942
> Str time 0.505062818527
> my rec time 0.0678248405457
> equal string: True
>
>
> I'm sure a better programmer than I could make this faster by avoiding
> saving intermediate results and various micro optimisations. But
> beating the builtin C implementation by a factor of 7.5 seems a
> reasonable outcome for pypy.
>
> I think I could come up with a linear time two pass algorithm working
> on intdigits if this were important to pypy.
>
>> Does it still make sense to add programs like this to our benchmarks?
>> So far, our benchmarks are "real-life" examples. The benchmarks like
>> above are completely missing the point of PyPy, as they don't stress
>> at all the Python interpreter part. There are also other cases where
>> PyPy's performance is very bad, like cpyext on an extension module
>> with lots of small C API calls. I believe that it would still make
>> sense to list such cases in the official benchmark, and have the
>> descriptions of the benchmarks explain what's wrong with them.
>
> I agree that you should include them, I disagree that they are
> 'wrong'. They measure the overhead of a C call. Why should a C call
> be slower in pypy than cpython? Presumably it could be compiled down
> to the appropriate instructions and then out-perform cpy.
>
> Now that the topic of benchmarks has come up, I came across this
> benchmark recently:
> http://dalkescientific.com/writings/diary/archive/2009/11/15/100000_tasklets.html
>
> The same benchmark took 8.5s on pypy 2beta2 and takes 7.5s on pypy
> 2.0.1. Is there any obvious reasons why pypy's tasklets are so
> slow to switch? (Is it the scheduler?) This is important for my
> adoption of pypy at work.
>
> njh
I remember doing something similar to convert long to string,
way back in .. let's see .. googling google comp.lang.python ..
oy, 2001, it's been a while ;-)
https://groups.google.com/forum/?hl=sv&fromgroups#!searchin/comp.lang.python/richter$20strL.py/comp.lang.python/6HYHojX7ZlA/Wizytwby71QJ
(from top, search to first instance of strL.py)
I guess it's not too long to post a copy here:
.. (Hm, I see I should have imported time from time
instead of clock, since the latter has just scheduler
tick resolution, and time is high resolution.)
____________________________________________________
A couple of lines wrapped...
_______________________________________________________________
# strL.py -- recursive long to decimal string conversion
# 2001-07-10 bokr
#
p10d={}
def p10(n):
if not p10d.has_key(n):
p = p10d[n] = 10L**n
return p
return p10d[n]
def strLR(n,w=0): # w>0 is known width, w<0 is searching guess
if w == 0: return []
if w > 0:
if w <=9: return [('%0'+chr(w+48)+'d') % n]
wr = w/2
nl,nr = divmod(n,p10(wr))
return strLR(nl,w-wr)+strLR(nr,wr)
else:
nl,nr = divmod(n,p10(-w))
if nl:
return strLR(nl, 2*w) + strLR(nr,-w)
else:
if w >= -9:
return ['%d' % n]
else:
return strLR(nr,w/2)
def strL(n):
if n<0:
return ''.join(['-']+strLR(-n,-9))
else:
return ''.join(strLR(n,-9))
from time import clock
import sys
def main():
def pr(x):
print x
def psl(x):
s = strL(x)
sys.stdout.write(s)
sys.stdout.write('\n')
dt={'str':str,'strL':strL,'repr':repr, 'print':pr, 'printStrL':psl
}
try:
x=long( eval(sys.argv[2]) )
fn=sys.argv[1]
fcn=dt[fn]
except:
sys.stderr.write("usage: %s [str strL repr print printStrL]
<const expr>\n" % sys.argv[0])
sys.exit(2)
t0=clock()
fcn(x)
t1=clock()
print "%s took %9.6f seconds" % (fn,t1-t0)
if __name__ == "__main__":
main()
____________________________________________________
Got curious, so I added to your benchmark:
from strL import strL ## bokr mod
and
## bokr mod: add timing of strL, like above
rsb = strL(x) # give pypy a chance to compile
s = time.time()
rsx = strL(x)
print "strL rec time", time.time() - s
print "equal string:", sx == rsb
then ran it (with old python and pypy on old laptop:
Version info:
[14:49 ~/wk/py/clp]$ python
Python 2.7.2 (default, Jul 8 2011, 23:38:53)
[GCC 4.1.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
[14:49 ~/wk/py/clp]$ pypy
pypy: /usr/lib/libcrypto.so.0.9.8: no version information available (required by pypy)
pypy: /usr/lib/libssl.so.0.9.8: no version information available (required by pypy)
Python 2.7.1 (b590cf6de419, Apr 30 2011, 02:00:38)
[PyPy 1.5.0-alpha0 with GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``nothing is true''
>>>>
The bench run:
[14:49 ~/wk/py/clp]$ python NathanHurst.py
factorial 1.36516094208
Str time 2.93479013443
my rec time 1.45956683159
equal string: True
strL rec time 1.34501504898
equal string: True
[14:50 ~/wk/py/clp]$ pypy NathanHurst.py
pypy: /usr/lib/libcrypto.so.0.9.8: no version information available (required by pypy)
pypy: /usr/lib/libssl.so.0.9.8: no version information available (required by pypy)
factorial 2.29024791718
Str time 3.14243102074
my rec time 1.25054502487
equal string: True
strL rec time 1.12671113014
equal string: True
[14:50 ~/wk/py/clp]$
... Hm, wonder why your factorial was slower on pypy .. prob due to old version?
Regards,
Bengt Richter
More information about the pypy-dev
mailing list