
On Thu, Aug 19, 2010 at 03:25, Hart's Antler <bhartsho@yahoo.com> wrote:
> I am starting to learn how to use the JIT, and I'm confused why my function gets slower over time, twice as slow after running for a few minutes. Using a virtualizable did speed up my code, but it still has the degrading performance problem. I have yesterday's SVN and am using 64bit with boehm. I understand boehm is slower, but overall my JIT'ed function is many times slower than un-jitted, is this expected behavior from boehm?
> code is here: http://pastebin.com/9VGJHpNa

I think this has nothing to do with Boehm.

Is it swapping? If yes, that explains the slowdown.
Is memory usage growing over time? I expect yes, and it's a misbehavior which could be explained by my analysis below.
Is it JITting code? I think no, or not to an advantage, but that's a more complicated guess.

BTW, when debugging such things, _always_ ask and answer these questions yourself.

Moreover, I'm not sure you need to use the JIT yourself.
- Your code is RPython, so you could just as well translate it without JIT annotations, and it will be compiled to C code.
- Otherwise, you could write it as an app-level function, i.e. in normal Python, and pass it to a translated PyPy-JIT interpreter. Did you try and benchmark the code?

Can I ask you why you did not write that as an app-level function, i.e. as normal Python code, to use PyPy's JIT directly, without needing detailed understanding of the JIT? It would be interesting to see a comparison (and have it on the web, after some code review).

Especially, I'm not sure that as currently written you're getting any speedup, and I seriously wonder whether the JIT could give an additional speedup over RPython here (the regexp interpreter is a completely different case, since it compiles a regexp, but why do you compile an array?).

I think just raw CPython can be 340x slower than C (I assume NumPy uses C), and since your code is RPython, there must be something basic wrong.

I think you have too many green variables in your code:

"At runtime, for a given value of the green variables, one piece of machine code will be generated. This piece of machine code can therefore assume that the value of the green variable is constant." [1]

So, every time you change the value of a green variable, the JIT will have to recompile the function. Note that actually, I think, for each new value of the variable, a given number of iterations first has to occur (1000? 10 000? I'm not sure), and only then will the JIT spend time creating a trace and compiling it. The length of the involved arrays is maybe around the threshold, maybe smaller, so you get "all pain, and no gain".
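
To make the "all pain, and no gain" argument concrete, here is a toy model in plain Python. It is only a sketch of the idea, not PyPy's real implementation: ToyJit, simulate, and the threshold of 1000 are all made up for illustration (as said above, I'm not sure of the real threshold). It keeps one counter and at most one "compiled" entry per distinct tuple of green values.

```python
# Toy model only: NOT PyPy's actual JIT. All names here are hypothetical.
HOT_THRESHOLD = 1000  # assumed value; the real threshold may differ


class ToyJit(object):
    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counters = {}      # green key -> iterations seen so far
        self.compiled = set()   # green keys that already have machine code
        self.fast_iterations = 0
        self.slow_iterations = 0

    def loop_iteration(self, green_key):
        # Compiled code exists for this exact green key: run fast.
        if green_key in self.compiled:
            self.fast_iterations += 1
            return
        # Otherwise run slowly and count towards the threshold.
        self.slow_iterations += 1
        count = self.counters.get(green_key, 0) + 1
        self.counters[green_key] = count
        if count >= self.threshold:
            self.compiled.add(green_key)


def simulate(calls, length, key_for):
    # Each call loops `length` times, like one transform over the array.
    jit = ToyJit()
    for index in range(calls):
        for _ in range(length):
            jit.loop_iteration(key_for(index, length))
    return jit


# Green key changes on every call (index is green): never gets hot.
many = simulate(50, 256, lambda index, length: (index, length))
# Only length is green: one key, shared by all calls.
one = simulate(50, 256, lambda index, length: (length,))

print(len(many.counters), len(many.compiled), many.fast_iterations)
print(len(one.counters), len(one.compiled), one.fast_iterations)
```

In this model, the run with a per-call green key accumulates one counter per call and never reaches compiled code, while the run keyed on length alone compiles once and then stays on the fast path. The growing counter table is also in line with the memory growth I would expect.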

From your code:

    complex_dft_jitdriver = JitDriver(
        greens = 'index length accum array'.split(),
        reds = 'k a b J'.split(),
        virtualizables = 'a'.split()
        #can_inline=True
    )

The only acceptable green variables there are IMHO array and length, because in the calling code the others change on each invocation, I think. I also think that only length should be green (and that could give a speedup), and that marking array as green gives negligible or no speedup.

Marking length as green allows specializing the function on the size of the array - something one would probably not do in C, but that one could do in C++. Whether it is worth it depends on the specific code & optimizations available - I think here the speedup should be small.

Best regards

[1] http://morepypy.blogspot.com/2010/06/jit-for-regular-expression-matching.htm...

--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/