[pypy-dev] math calls and errno

Douglas McNeil d.mcneil at qmul.ac.uk
Wed Jul 30 00:51:59 CEST 2008



</lurk>

A while back I noticed that some math-related rpython-produced C was much 
slower than it should have been.  After I figured out what was going on, I 
set it aside, but I see someone mentioned doing some numpy stuff on IRC 
today so I dug up the tests.

Note that the following slowdown only applies to translated code, so 
ctypes interfaces to numpy itself are immune, and the slowdown doesn't 
have much effect on pypy-c, so the audience for this is probably limited.

Summary: in typical cases the way pypy treats errno causes math calls in 
rpython to take over a third longer than they should.  This can be 
repaired by (carefully) inlining the errno functions.

Details follow.

--

A simple rpython target that does nothing but loop and add up the results 
of sin() is much slower than the same code translated by shedskin.  The 
shedskin version ran in the same time as the equivalent cython code, 
which in turn matched the equivalent handwritten C.

from math import sin

def f():
     z = 0.0
     for i in xrange(100000):
         for j in xrange(500):
             z += sin(float(i+j))
     return z
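
For anyone who wants to reproduce the shape of the test outside a 
translated build, here is a minimal sketch of a timing harness.  The names 
bench and time_once are mine, not part of the original target; the loop 
bounds are made parameters so a quick run is possible, and it uses range 
so that it also runs on Python 3.  Untranslated CPython timings are not 
comparable to the numbers in this post.

```python
import time
from math import sin


def bench(outer=100000, inner=500):
    """Sum sin(i + j) over the same double loop as f() above."""
    z = 0.0
    for i in range(outer):
        for j in range(inner):
            z += sin(float(i + j))
    return z


def time_once(func, *args):
    """Return (result, elapsed seconds) for a single call to func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Small bounds keep this quick; the post used 100000 x 500.
    total, seconds = time_once(bench, 1000, 500)
    print("sum = %r, %.3f s" % (total, seconds))
```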

gcc 4.2.2 (-O3 -fomit-frame-pointer):
rpython:		4.160 s
shedskin 0.0.28:	3.043 s
cython:			3.041 s
C:			3.034 s

The rpython/C gap persisted with the Intel compiler:

icc 10.1 (same flags):
rpython:		3.296 s
C:			2.118 s

So clearly the rpython code was doing something the others weren't, 
something the compilers particularly disliked, and the obvious suspect 
was the error handling.  But error handling should cost on the order of 
a few percent, not an extra third or half.

Turns out the errno calls aren't being inlined -- which makes sense, 
they're external to the implement_*.c files -- and if you force inlining 
they're often optimized away.  It's quite fragile, thanks to volatility.  
But if you replace the start of pypy_g_ll_math_ll_math_sin with something 
like

       volatile int *errno_loc = &errno;

     block0:
         l_v591 = (long)(0L);
         *(errno_loc) = l_v591;
         l_v590 = sin(l_x_1);
         l_v589 = *(errno_loc);
         OP_INT_IS_TRUE(l_v589, l_v593);
         if (l_v593) {
                 goto block2;
         }
         l_v597 = l_v590;
         goto block1;

you get

pypy_g_ll_math_ll_math_sin:
.L370:
         pushl   %ebx    #
         subl    $8, %esp        #,
         call    __errno_location        #
         fldl    16(%esp)        # l_x_1
         movl    %eax, %ebx      #, D.11955
         movl    $0, (%eax)      #,* D.11955
         fstpl   (%esp)  #
         call    sin     #
         movl    (%ebx), %eax    #* D.11955, l_v589
         testl   %eax, %eax      # l_v589
         je      .L372   #,

which is at least much improved over the original, and unlike my first few 
attempts I think this one correctly survives being optimized under both 
gcc and icc.  (Of course what's actually executed in the tests is the 
version of this which gets inlined into entry_point, but ll_math_sin gets 
inlined into entry_point in all versions, so that's not causing the 
difference.)  And we're still calling __errno_location like we should.
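
For readers without the generated C in front of them, the sequence the 
block above performs is: clear errno, call the libm function, read errno 
back, and branch if it is nonzero.  Here is a rough sketch of that same 
sequence from plain Python via ctypes.  It assumes a POSIX system where 
sin can be resolved from libm (or from the running process); checked_sin 
is a made-up name, raising OSError is my choice for illustration, and 
which inputs actually set errno depends on the C library.

```python
import ctypes
import ctypes.util
import os

# use_errno=True makes ctypes keep a private per-thread copy of errno
# that it swaps in around each foreign call.
# Assumption: POSIX.  If find_library returns None, CDLL(None) opens the
# running process, which normally already has libm's sin available.
_libm = ctypes.CDLL(ctypes.util.find_library("m"), use_errno=True)
_libm.sin.argtypes = [ctypes.c_double]
_libm.sin.restype = ctypes.c_double


def checked_sin(x):
    """Clear errno, call sin, then raise OSError if errno was set."""
    ctypes.set_errno(0)            # clear before the call
    result = _libm.sin(x)          # the call that may set errno
    err = ctypes.get_errno()       # read it back afterwards
    if err:
        raise OSError(err, os.strerror(err))
    return result
```

This only mirrors the error-checking protocol; it says nothing about the 
inlining question, which is purely a matter of the generated C.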

Anyway, now we have:

gcc 4.2.2	std rpython		4.160 s
gcc 4.2.2	pure C			3.034 s
gcc 4.2.2	errno-inlined rpython	3.068 s

icc 10.1	std rpython		3.296 s
icc 10.1	pure C			2.118 s
icc 10.1	errno-inlined rpython	2.244 s

and that's much better, especially with gcc.
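
To put numbers on "much better", the overhead relative to pure C can be 
worked out directly from the table.  The helper name overhead_pct is just 
for illustration; the timings are the ones listed above.

```python
# Timings in seconds, copied from the table above.
timings = {
    "gcc 4.2.2": {"std": 4.160, "pure_c": 3.034, "inlined": 3.068},
    "icc 10.1": {"std": 3.296, "pure_c": 2.118, "inlined": 2.244},
}


def overhead_pct(t, baseline):
    """Percentage by which t exceeds baseline."""
    return 100.0 * (t - baseline) / baseline


for compiler, t in sorted(timings.items()):
    before = overhead_pct(t["std"], t["pure_c"])
    after = overhead_pct(t["inlined"], t["pure_c"])
    print("%s: %.1f%% -> %.1f%% over pure C" % (compiler, before, after))
```

That works out to roughly 37% over pure C before and 1% after with gcc, 
and roughly 56% before and 6% after with icc.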

Standard disclaimers apply: this worked on my Linux x86 system, with the 
above compiler versions, under an almost new moon... and gcc is known for 
dramatic variations between versions.  If it works for anyone else I'd be 
pleasantly surprised.  That said, icc agrees.

Unfortunately, as you might expect, this has very modest effects on pypy-c 
(i.e. barely detectable over the noise, due to the interpreter overhead), 
so I don't know if there will be interest in modifying the C backend to 
change it.

It does improve things considerably for rpythonic math module writers, 
though.  FWIW.

<relurk>


Doug, pypy-math-sig member-in-waiting


--
Queen Mary College, University of London      "Still creating worlds..
Mathematical Sciences, Astronomy Unit          .. but now with an accent!"


