Re: psyco in Python was: Minimal Python project
Somehow it seems like Michaels reply got to the mail server before my original message. Here it is again. My apologies if this is redundant: Hi Michael, I'm going to copy this to pypy-dev so that other people can see this discussion.
I just don't see how to implement psyco in Python in a production package without using the C interpreter as a bootstrap! And yes, I suspect that a C language version of psyco will be faster than psyco in Python.
Even after you've run psyco over itself?
Yes, maybe even then. C compilers should do better _on the same algorithm_ on hand-written code than can any GIT.
Weeeeellll, this is the thing that gets me jumping-up-and-down-excited about things like psyco: pysco has MORE information than the C compiler to work with: while the C compiler only knows the algorithm to run, psyco knows both the algorithm *and the data it is to run on*. Obviously, this approach only wins if what you do next resembles what you did last, but I think CPU caches have proved there is some mileage in that idea :)
I don't think CPU caches have much bearing on this discussion. Caches speed up frequently used values. The problems here are not the same. The more I study psyco, the more I am convinced that there is _no way_ that psyco will ever produce better code than a C compiler. In brief, my reasons are as follows: 1. psyco has, in fact, no better information than does a C compiler. Remember that the "values" that psyco specializes upon are in fact _interpreter_ vars, so knowing those values is equivalent to knowing the types of operands. This does indeed allow psyco to emit machine code that is better than the equivalent C interp generic code. In fact, though, most of what psyco is doing is quite routine: replacing bytecode by assembly language. This will speed up the code a _lot_, but it is doing just like a C compiler does. 2. I have looked again at the code generators of the optimizing C compiler I wrote about 10 years ago. I can find no instance where knowing the value of an operand, rather than just its type, would produce better code. Yes, my compiler did constant folding (which eliminates code rather than improves it), but I do not believe doing constant folding at runtime is going to improve matters much, and is in fact very likely to slow down the running times of typical programs, for reasons given below. 3. It's easy to get seduced by the "psyco has more knowledge" mantra. In fact, one mustn't lose track of the obstacles psyco faces in outperforming C compilers. First, and very important, everything that psyco does must be done at run time. This means that even very effective optimizations like peephole optimizations become dubious for psyco. Second, and also very important, is that fact that psyco (or rather the code that psyco produces) must determine whether the code that psyco has already produced is suitable for the task at hand. The "Structure of Psyco" paper says this: "Dispatching is the set of techniques used to store a trace of the compiled code buffers and _find if one already matches the current compiler state_" (emphasis mine) In other words, it's not enough to specialize code! Psyco must determine whether the types used to compile a function match the types presently given to psyco. This searching is done in the "compatible" routine, and no matter how it is done is _pure overhead_ compared to the code generated by a C compiler. As I see it, both these issues are "gotchas". Neither can be eliminated completely, and each introduces overhead that will make the code emitted by psyco clearly inferior to the code produced by any decent C compiler. In particular, the more psyco specializes on runtime values of operands the bigger the code bloat and the more severe the problem of finding out what already-compiled code to use. 4. I have seen _nothing_ in psyco's algorithms or the code emitted by psyco that leads me to believe that psyco is "doing enough work" to outperform a C compiler. When I was a mathematics graduate student 30 years ago one of my professors made the remark that a mathematician need to develop a feel for the level of work required to do a new proof. Deep insights are not going to come from elementary methods, and there is no use in using deep methods to prove elementary results. I believe the same kind of analysis can (and should!) be done here. Yes, psyco is clever. In some ways it is _very_ clever, especially the idea of modeling the compiler on the structure on the original C interp. Yes, it is easy to seen how psyco can improve the running speed of the present C interpreter: _any_ compiling will do so! But with respect, I dispute the assertion that knowing actual values of operands is going to produce faster code that that produce by a decent C compiler applied to an _equivalent_ algorithm done in C. Please note: it is _precisely_ this extremely strong assertion that must be true for there to be any speedup gained in translating Python's C libraries into Python! We already have C libraries that work very well. I would be _very_ leery of messing with them... 5. Another way to see that psyco isn't really doing much is the following. Basically, psyco emits calls to the same runtime helpers that the C interpreter does. Yes, psyco may improve the calling sequences to these helpers, but it does not eliminate calls to those helpers. But a C compiler will never do worse that the code in the helpers, and will often do _much_ better. I challenge this group to provide even one example of Python code, for which psyco will emit better code than that produced by a C compiler on a transliteration of that code into C. I believe no such counter-example will ever be found. I am not much interested in anecdotal reports of Python being faster than C, and I would be happy to study in depth any purported counterexample. If you believe you have a real counterexample, please first make sure that the algorithms appear identical. Second, please try to make the counterexample as small as possible. If I am wrong, the counter-example should provide a _clear reason_ why all my arguments above are wrong. And wouldn't that be great :-) I believe the arguments given should be given greater weight than the "psyco has more knowledge" mantra. You want to convince me otherwise? Show me the code, or provide a _detailed_ explanation for how my arguments do not apply. Edward P.S. I trust that this group will interpret my remarks as constructive. I believe it is important at the beginning of any project to have proper goals and expectations. Make no mistake: I believe this project has the potential to be extremely valuable, even if my arguments are correct. And I would indeed be happy to be _proved_ wrong. However, I think it a dubious policy to assume that Python can outperform C, for several reasons: 1. There is no need to burden this project with unrealistic expectations. Python doesn't have to beat C for Python to rule the world! :-) 2. I am concerned that assuming that we can do the impossible will skew our efforts. Yes, by all means, experiment using Python! Using C to do experiments would not be too swift :-) But lets not spend a lot of time worrying about bootstrapping until we are _sure_ that we won't be doing the final version in C! 3. There may be another way to speed up Python, namely replace (parts of) the byte code with compiled code. Armin mentioned in his papers that some ways of doing so might produce code bloat. But is that the end of the story? I think not. The same tricks that psyco uses in the "dispatcher" might well be applied to intermix machine code with byte code. I wouldn't want to focus _only_ on the "psyco way (tm)" in this project. Compile-time optimizations deserve at least some considerations, IMO. EKR -------------------------------------------------------------------- Edward K. Ream email: edream@tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html --------------------------------------------------------------------
Edward K. Ream wrote:
I don't think CPU caches have much bearing on this discussion. Caches speed up frequently used values. The problems here are not the same.
I think what you mean here is frequently used code/data as CPU caches are address based, not value based; locality of reference.
2. I have looked again at the code generators of the optimizing C compiler I wrote about 10 years ago. I can find no instance where knowing the value of an operand, rather than just its type, would produce better code.
Zero. Many redundant instructions can be eliminated on a wide variety of architectures due to the way the value zero is handled by the CPU; results of the previous instruction may set condition codes, instructions that test for zero, ... Knowlege of values can better construct jump tables for switches.
Knowlege of values can better construct jump tables for switches.
Sure. You could get rid of the entire table. This kind of trick is why I mentioned constant folding. The key questions concerning "runtime constant folding" are: 1. how much work does it take to discover the particular value? 2. how much work does it take to generate code to use that particular value? and most importantly, 3. how much work does it take to _reuse_ the "specified" code? Edward -------------------------------------------------------------------- Edward K. Ream email: edream@tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html --------------------------------------------------------------------
Hello Edward, All your comments about Psyco are founded, but you are focusing too much on the "back-end" part --- which I understand, given your impressive compiler technology background! On Fri, Jan 17, 2003 at 10:34:40AM -0600, Edward K. Ream wrote:
1. psyco has, in fact, no better information than does a C compiler.
People answered that Psyco can have a bit more information, because it could do more constant folding or optimize multiplications by constants. Rarely a huge win. But that's not what I had in mind when saying that "Psyco has more information". It has a much higher-level view of the program. If you have a specific algorithm which you completely coded in a couple of C functions, then after compiling it with a good C compiler the algorithm will run at a speed that cannot be beaten. But C is not well suited for larger applications consisting mainly of management stuff --- precisely what Python is much better for (but you know this well, I'm sure). These are the cases that interest me. Python gives a higher-level view of the application. Psyco can, for example, measure that some data structure (say a list) is most used in this or that way, and choose a suited implementation. For example, a list in the middle of which numerous inserts and deletes are done could be implemented as a red-black tree. Of course, in a pure C implementation of the application we can also use red-black trees, but who does? Not many C application actually use the correct implementation of their data structures :-( More importantly, which implementation is the best one must be hard-wired in advance in C and makes it difficult to switch later. This is where I expect interesting gains from a sufficiently advanced Psyco. There is already one example of this. In Python, if you build a large string by successively concatenating a lot of small strings, you get bad results. You have to rewrite your algorithm in a less straightforward style, e.g. accumulating the strings in a list and only at the end using "''.join(list)". You have the same problem in C if you repeatedly use a simple two-strings concatenation function, but the C compiler cannot do anything to help here. Psyco (upcoming version 1.0) can already help: it compiles the Python function by choosing to implement the string as a Python list of strings. The "join()" is only done when the resulting string is needed outside the function. It is a case where higher-level programming languages let compiling tools select algorithms that actually decrease the complexity --- meaning that the result can run faster than the C version by more than a constant factor. Of course, you could have made the C version wiser in this case. But again you cannot do this when your application becomes very large and where you just don't know what implementation is better without doing sophisticated profiling on hopefully representative sample data.
1. There is no need to burden this project with unrealistic expectations. Python doesn't have to beat C for Python to rule the world! :-)
Yes, exactly. I hope I have made it clear that Psyco is not the all-or-nothing way for this project to succeed. In my opinion it is essential to write the Python-in-Python interpreter with important restrictions, so that static tools can do a lot from this code. Compile-time optimizations of Python, if you like, althought I prefer to see it as translation from a well-defined Pythonic frame to any of several possible projects that could use the source (including a CPython-like interpreter, and including a Psyco-like project). The current goal should clearly be to have a Python interpreter in Python, written with restrictions that are well understood. A bientot, Armin.
participants (3)
-
Armin Rigo -
Boyd Roberts -
Edward K. Ream