[Patches] [ python-Patches-943898 ] A simple 3-4% speed-up for PCs

SourceForge.net noreply at sourceforge.net
Wed Apr 28 17:02:17 EDT 2004


Patches item #943898, was opened at 2004-04-28 18:33
Message generated for change (Comment added) made by arigo
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=943898&group_id=5470

Category: Core (C code)
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Armin Rigo (arigo)
Assigned to: Nobody/Anonymous (nobody)
Summary: A simple 3-4% speed-up for PCs

Initial Comment:
The result of a few experiments looking at the assembler produced by gcc for eval_frame():

* on PCs, reading the arguments as an unsigned short instead of two bytes is a good win.

* oparg is more "local" with this patch: its value doesn't need to be saved across an iteration of the main loop, allowing it to live in a register only.

* added an explicit "case STOP_CODE:" so that the switch starts at 0 instead of 1 -- that's one instruction less with gcc.

* it does not seem to pay off to move the reading of the argument to the start of each case of an operation that expects one, even though that removes the unpredictable branch "if (HAS_ARG(op))".

This patch should be timed on other platforms to make sure that it doesn't slow things down.  If it does, then only the part that reads the arg as an unsigned short could be checked in -- it is compiled conditionally, only where shorts are 2 bytes in little-endian order.
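
As a rough illustration of the first point, the two ways of fetching a 2-byte argument might look like the sketch below.  The function names are made up for this example and are not taken from the patch.  The single-load version uses memcpy to stay within standard C (compilers commonly turn it into one 16-bit load); the patch itself may well read through an unsigned short pointer directly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for illustration; not taken from the patch. */

/* Portable form: build the argument from two separate byte loads,
   low byte first. */
static unsigned int read_arg_bytes(const unsigned char *p)
{
    return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

/* Returns 1 when the lowest-addressed byte of a short is its low byte. */
static int shorts_are_little_endian(void)
{
    const unsigned short probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Fast form: a single 2-byte load.  It matches read_arg_bytes() only
   where unsigned short is 2 bytes in little-endian order, hence the
   guard. */
static unsigned int read_arg_short(const unsigned char *p)
{
    unsigned short v;
    assert(sizeof v == 2 && shorts_are_little_endian());
    memcpy(&v, p, sizeof v);
    return v;
}
```

In a real build the guard would more likely be a compile-time #if than a runtime assert, so that the fast form is simply not compiled on other platforms.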

By the way, does anyone know why 'stack_pointer' isn't a 'register' local?  I bet it would make a difference on PowerPC, for example, with compilers that care about this keyword.

----------------------------------------------------------------------

>Comment By: Armin Rigo (arigo)
Date: 2004-04-28 21:02

Message:

stack_pointer isn't a register local because its address is taken in two places.  Taking its address is a really bad idea for optimization.  Instead of passing &stack_pointer, we should do:

PyObject **sp = stack_pointer;
... use &sp ...
stack_pointer = sp;

I'm pretty sure this simple change along with a 'register' declaration of stack_pointer gives a good speed-up on all architectures with plenty of registers.
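
A self-contained toy version of that copy-out/copy-back pattern, using an int stack instead of PyObject pointers.  pop_two_and_add() and run() are hypothetical stand-ins for this sketch, not functions from ceval.c:

```c
#include <assert.h>

/* Hypothetical callee: pops two values and returns their sum, moving
   the caller's stack pointer through the pointer it is handed. */
static int pop_two_and_add(int **psp)
{
    int *sp = *psp;
    int b = *--sp;
    int a = *--sp;
    *psp = sp;
    return a + b;
}

/* Toy stand-in for the interpreter loop; returns the final stack depth. */
static int run(int *stack_base)
{
    /* '&stack_pointer' would not compile here: C forbids taking the
       address of an object declared 'register'. */
    register int *stack_pointer = stack_base;
    int *sp;
    int result;

    *stack_pointer++ = 2;
    *stack_pointer++ = 3;

    sp = stack_pointer;             /* copy out */
    result = pop_two_and_add(&sp);  /* only the copy has its address taken */
    stack_pointer = sp;             /* copy back */

    *stack_pointer++ = result;
    assert(stack_pointer == stack_base + 1);
    return (int)(stack_pointer - stack_base);
}
```

Only the short-lived copy 'sp' has to live in memory around the call; stack_pointer itself never has its address taken, so the compiler is free to keep it in a register for the rest of the loop.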

For PCs I've experimented with forcing one or two locals into specific registers, using the gcc syntax asm("esi"), asm("ebx"), etc.  Forcing stack_pointer and next_instr gives another 3-4% improvement.

The next step is to see whether this can be done with #if's for common compilers besides gcc.
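
One way such an #if might look, as a sketch only.  PIN_TO_REG and sum_code() are invented for this example; the macro expands to the gcc register request on 32-bit x86 and to nothing everywhere else, so other compilers just see a plain 'register' local.  It uses the __asm__ spelling, which gcc also accepts in strict ISO modes, and gcc may treat the request as a hint rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: a gcc register request on 32-bit x86, nothing
   on any other compiler or architecture. */
#if defined(__GNUC__) && defined(__i386__)
#define PIN_TO_REG(name) __asm__(name)
#else
#define PIN_TO_REG(name)
#endif

/* Toy loop that walks a byte string the way the interpreter walks
   bytecode, with the cursor optionally pinned to a register. */
static unsigned long sum_code(const unsigned char *code, size_t n)
{
    register const unsigned char *next_instr PIN_TO_REG("esi") = code;
    const unsigned char *end = code + n;
    unsigned long total = 0;

    while (next_instr < end)
        total += *next_instr++;
    assert(next_instr == end);
    return total;
}
```

Whether pinning actually helps would still have to be timed per compiler and platform.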

----------------------------------------------------------------------
