[Python-Dev] Who cares about the performance of these opcodes?
Phillip J. Eby
pje at telecommunity.com
Tue Mar 9 08:59:52 EST 2004
At 07:38 AM 3/9/04 -0600, Jeff Epler wrote:
>Recently it was proposed to make a new LIST_APPEND opcode, and several
>contributors pointed out that adding opcodes to Python is always a dicey
>business because it may hurt performance for obscure reasons, possibly
>related to the size of that 'switch' statement.
>
>To that end, I notice that there are several opcodes which could easily
>be converted into function calls. In my code, these are not typically
>performance-critical opcodes (with approximate ceval.c line count):
> BUILD_CLASS # 9 lines
> MAKE_FUNCTION # 20 lines
> MAKE_CLOSURE # 35 lines
>
> PRINT_EXPR # 21 lines
> PRINT_ITEM # 47 lines
> PRINT_ITEM_TO # 2 lines + fallthrough
> PRINT_NEWLINE # 12 lines
> PRINT_NEWLINE_TO # 2 lines + fallthrough
>
>Instead, each of these would be available in the code object's co_consts
>when necessary. For example, instead of
> LOAD_CONST 1 (<code object g at 0x40165ea0, file "<stdin>", line 2>)
> MAKE_FUNCTION 0
> STORE_FAST 0 (g)
>you'd have
> LOAD_CONST 1 (type 'function')
> LOAD_CONST 2 (<code object g>)
> LOAD_GLOBALS # new opcode, or call globals()
> LOAD_CONST 1 ("g")
> CALL_FUNCTION 3
>
>Performance for these specific operations will certainly benchmark worse,
>but maybe getting rid of something like 150 lines from ceval.c would
>help other things by magic. The new LOAD_GLOBALS opcode would be less
>than 10 lines.
>
>No, I don't have a patch. I assume each and every one of these opcodes
>has a staunch defender who will now come to its aid, and save me the
>trouble.

If the goal is to remove lines from the switch statement, just move the
code of lesser-used opcodes into a C function. There's no need to
eliminate the opcodes themselves.

I personally don't think it'll help much if the goal is to reduce cache
misses; after all, the code is all still there. But it should not do as
badly as the approach you're suggesting, because in your case you'll have
not only the C-level calls but also more bytecodes being interpreted.

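To make the first suggestion concrete, here is a minimal sketch of a toy
stack interpreter, not CPython's actual ceval.c. The opcode names (OP_PUSH,
OP_ADD, OP_PRINT, OP_HALT) and the helpers do_print() and run() are all
hypothetical, made up for illustration. The rarely used opcode keeps its
slot in the switch, but its body lives in a separate C function, so the
switch itself stays short and no extra bytecodes are executed.

```c
#include <stdio.h>

/* Hypothetical toy opcodes; these are NOT CPython's real opcodes. */
enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

/* Body of a lesser-used opcode, moved out of the switch.
   The opcode still exists; only its code lives elsewhere. */
static void do_print(const int *stack, int sp)
{
    printf("%d\n", stack[sp - 1]);
}

/* Run a program and return the top of the stack at OP_HALT.
   Sketch only: there is no stack bounds checking. */
static int run(const int *code)
{
    int stack[16];
    int sp = 0;

    for (;;) {
        switch (*code++) {
        case OP_PUSH:               /* common opcode: handled inline */
            stack[sp++] = *code++;
            break;
        case OP_ADD:                /* common opcode: handled inline */
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_PRINT:              /* rare opcode: one call, no inline body */
            do_print(stack, sp);
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        default:                    /* unknown opcode */
            return -1;
        }
    }
}
```

Calling run() on the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT,
OP_HALT } executes one dispatch per opcode, the same count as if OP_PRINT's
body were inline. Whether this layout actually helps with cache behaviour
is a separate question that would need measuring.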
Hm. Makes me wonder, actually, if a hand-written eval loop in assembly
code might not kick some serious butt. Or maybe a bytecode-to-assembly
translator, writing loads in-line and using registers as the stack, calling
functions where necessary. Ah, if only I were a teenager again, with
little need to sleep, and unlimited time to hack... :)