Re: [Python-ideas] Optimizing builtins
On Sat, Jan 1, 2011 at 5:11 AM, Maciej Fijalkowski wrote:
On Sat, Jan 1, 2011 at 11:32 AM, Cesare Di Mauro wrote:
Yes, I know it, but the special opcode which I was talking about has a very different usage. The primary goal was to speed up for loops, generating specialized code when the proper range builtin is found at runtime, and it's convenient to have such optimized code. As you stated, the compiler doesn't know if range is a builtin until runtime (at the precise moment of the for execution), so it'll generate two different code paths. The function's bytecode will look like this:

     0 SETUP_LOOP 62
     2 JUMP_IF_BUILTIN_OR_LOAD_GLOBAL 'range', 40
       # Usual, slow, code starts here
    40 LOAD_CONSTS (4, 3, 2, 1, 0)   # Loads the tuple on the stack
    44 LOAD_FAST_MORE_TIMES x, 5     # Loads x 5 times on the stack
    46 LOAD_CONSTS (4, 3, 2, 1, 0)   # Loads the tuple on the stack
    48 STACK_ZIP 3, 5                # "zips" 3 sequences of 5 elements each on the stack
    52 STORE_SUBSCR
    54 STORE_SUBSCR
    56 STORE_SUBSCR
    58 STORE_SUBSCR
    60 STORE_SUBSCR
    62 POP_BLOCK
    64 RETURN_CONST 'None'

It's just an example; the code can be different based on the compiler optimizations and the opcodes available in the virtual machine. The most important thing is that the semantics will be preserved (I never intended to drop them! ;)
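[The dual-path idea above can be sketched in plain Python. This is only an illustration of the proposed semantics, not actual bytecode: the `fill` function below emulates the hypothetical JUMP_IF_BUILTIN_OR_LOAD_GLOBAL guard by checking at call time whether the name `range` still resolves to the builtin, then dispatching to an unrolled fast path or the generic loop.]

```python
import builtins

def fill(x):
    """Set x[i] = i for i in 0..4, choosing a path via a runtime guard."""
    # Guard: emulate the hypothetical JUMP_IF_BUILTIN_OR_LOAD_GLOBAL --
    # take the specialized path only if 'range' still means the builtin.
    rng = globals().get('range', builtins.range)
    if rng is builtins.range:
        # Fast path: loop unrolled, no range object ever created.
        x[0] = 0; x[1] = 1; x[2] = 2; x[3] = 3; x[4] = 4
    else:
        # Slow path: full semantics preserved if 'range' was rebound.
        for i in rng(5):
            x[i] = i
    return x
```

Either path produces the same result, which is the point: the specialization is an implementation detail that never changes observable behavior.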
The thing is, having a JIT, this all is completely trivial (as well as bunch of other stuff like avoiding allocating ints at all).
Right. That's a much saner solution than trying to generate bulky bytecode as Cesare proposed. The advantage of a JIT is also that it allows doing these optimizations only in those places where it matters. In general I am not much in favor of trying to optimize Python's bytecode. I prefer the bytecode to be dead simple. This probably makes it an easy target for CS majors interested in code generation, and it probably is a great exercise trying to do something like that, but let's please not confuse that with actual speed improvements to Python -- those come from careful observation (& instrumentation) of real programs, not from looking at toy bytecode samples. (Most of the bytecode improvements that actually made a difference were done in the first 5 years of Python's existence.)
Generating two different code paths has a tendency to lead to code explosion (even exponential if you're not careful enough), which has its own set of problems (including cache locality, because the executed code is no longer a small contiguous chunk of memory). What we (PyPy) do is compile only the common path (using the JIT) and have the unlikely path fall back to the interpreter. This generally solves all of the nasty problems you can possibly encounter.
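[The guard-plus-fallback scheme described above can be sketched in a few lines of Python. This is a toy model, not PyPy's actual machinery: `traced_sum` stands in for JIT-compiled code specialized for ints, and on guard failure it hands the remaining work back to the generic "interpreter" function.]

```python
def interp_sum(values):
    # Generic interpreter path: handles any addable values.
    total = 0
    for v in values:
        total = total + v
    return total

def traced_sum(values):
    # "Compiled" common path: assumes every element is an int.
    total = 0
    for i, v in enumerate(values):
        if type(v) is not int:  # guard inserted by the tracer
            # Guard failed: leave the fast path, finish in the interpreter.
            return total + interp_sum(values[i:])
        total += v
    return total
```

Only the common (all-int) path gets compiled; the rare case costs a fallback, but there is no combinatorial blow-up of generated code.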
Great observation! -- --Guido van Rossum (python.org/~guido)
On 1/1/2011 11:41 AM, Guido van Rossum wrote:
In general I am not much in favor of trying to optimize Python's bytecode. I prefer the bytecode to be dead simple.
I think people constantly underestimate the virtue of Python's and CPython's simplicity. Projects that depend on a couple of genius ubergeeks die when the ubergeeks leave. The executable-pseudocode simplicity of the language makes it a favorite for scientific programming, spilling over into financial programming. The simplicity of the code allows competent students (and non-CS-major adults) to become developers. -- Terry Jan Reedy
On Sat, Jan 1, 2011 at 12:30 PM, Terry Reedy wrote:
On 1/1/2011 11:41 AM, Guido van Rossum wrote:
In general I am not much in favor of trying to optimize Python's bytecode. I prefer the bytecode to be dead simple.
I think people constantly underestimate the virtue of Python's and CPython's simplicity. Projects that depend on a couple of genius ubergeeks die when the ubergeeks leave. The executable-pseudocode simplicity of the language makes it a favorite for scientific programming, spilling over into financial programming. The simplicity of the code allows competent students (and non-CS-major adults) to become developers.
And, of course, the (relative) simplicity of the implementation will always draw CS students looking for compiler optimization projects (just as the simplicity of the language draws CS students looking to write a complete compiler). But it's one thing to get a degree out of some clever optimization; it's another thing to actually make it stick in the context of CPython, with the concerns you mention (and others I only have in my guts :-). -- --Guido van Rossum (python.org/~guido)
On 1/1/2011 3:37 PM, Guido van Rossum wrote:
And, of course, the (relative) simplicity of the implementation will always draw CS students looking for compiler optimization projects
And, ironically, slightly reduce the simplicity that attracted them. No one thinks that their straw will break the camel's back (or cause him to drop to his knees), and they are usually right. But when the camel sags, all added straws are equally responsible.
(just as the simplicity of the language draws CS students looking to write a complete compiler). But it's one thing to get a degree out of some clever optimization; it's another thing to actually make it stick in the context of CPython, with the concerns you mention (and others I only have in my guts :-).
For one thing, you have your eye on the camel ;-). And your current job keeps you grounded in the needs of real code. (In a current python-list discussion, someone demonstrated with timeit that in late 2.x, each iteration of 'while 1: pass' takes about a microsecond less than one of 'while True: pass'. The reason for that, and for the disappearance of the difference in 3.x, is mildly interesting, but the practical import for any real code that does anything inside the loop is essentially 0.) -- Terry Jan Reedy
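[For the curious, the reason behind that timing difference can be seen with the dis module. In 2.x, True was an ordinary name looked up every iteration, while 1 was a constant the peephole optimizer could fold away; in 3.x, True is a keyword constant, so both spellings compile to the same tight loop. A quick check, assuming Python 3:]

```python
import dis

def loop_true():
    while True:
        pass

def loop_one():
    while 1:
        pass

# In Python 3, neither loop needs a per-iteration name lookup:
# the condition is a constant the compiler folds into a plain jump.
ops_true = [ins.opname for ins in dis.get_instructions(loop_true)]
ops_one = [ins.opname for ins in dis.get_instructions(loop_one)]
print('LOAD_GLOBAL' in ops_true, 'LOAD_GLOBAL' in ops_one)
```

(The functions are only disassembled, never called, so the infinite loops are harmless here.)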
2011/1/1 Guido van Rossum
Right. That's a much saner solution than trying to generate bulky bytecode as Cesare proposed. The advantage of a JIT is also that it allows doing these optimizations only in those places where it matters.
In general I am not much in favor of trying to optimize Python's bytecode. I prefer the bytecode to be dead simple.
If Python's direction is to embrace some JIT technology, I fully agree with you: it is best to keep the VM & compiler simple. Anyway, as I already said before, mine were just examples of possible things that can happen with optimizations.
This probably makes it an easy target for CS majors interested in code generation, and it probably is a great exercise trying to do something like that, but let's please not confuse that with actual speed improvements to Python -- those come from careful observation (& instrumentation) of real programs, not from looking at toy bytecode samples. (Most of the bytecode improvements that actually made a difference were done in the first 5 years of Python's existence.)
--Guido van Rossum (python.org/~guido)
But research never stops. SETUP_WITH is just a recent example. Also, sometimes completely different ideas can bring some innovation. ;) Cesare
participants (3)
- Cesare Di Mauro
- Guido van Rossum
- Terry Reedy