[Python-Dev] Optimization idea

Guido van Rossum guido@CNRI.Reston.VA.US
Thu, 19 Aug 1999 10:15:28 -0400


> I just had yet another idea for optimizing Python that looks so
> plausible that I guess someone else must have looked into it already
> (and, hence, probably rejected it:-):
> 
> We add to the type structure a "type identifier" number, a small integer for 
> the common types (int=1, float=2, string=3, etc) and 0 for everything else.
> 
> When eval_code2 sees, for instance, a MULTIPLY operation it does something 
> like the following:
>    case BINARY_MULTIPLY:
> 	w = POP();
> 	v = POP();
> 	code = (BINARY_MULTIPLY << 8) |
> 		((v->ob_type->tp_typeid) << 4) |
> 		(w->ob_type->tp_typeid);
> 	x = (binopfuncs[code])(v, w);
> 	.... etc ...
> 
> The idea is that all the 256 BINARY_MULTIPLY entries would be filled with 
> PyNumber_Multiply, except for a few common cases. The int*int field could 
> point straight to int_mul(), etc.
> 
> Assuming the common cases really are more common than the uncommon ones, the 
> fact that they jump straight to the implementation function instead of 
> mucking around in PyNumber_Multiply and PyNumber_Coerce should easily offset 
> the added overhead of the shifts, ors and indexing.

You're assuming that arithmetic operations are a major time sink.  I
doubt that; much of my code contains hardly any arithmetic these days.

Of course, if you *do* have a piece of code that does a lot of basic
arithmetic, it might pay off -- but even then I would guess that the
majority of opcodes are things like list accessors and variable
references.

But we needn't speculate.  It's easy enough to measure the speedup:
you can use tp_xxx5 in the type structure and plug a typecode into it
for the int and float types.  

(Note that you would need a separate table of binopfuncs per
operator.)

--Guido van Rossum (home page: http://www.python.org/~guido/)