[Python-Dev] optimizing non-local object access
Samuele Pedroni <pedroni@inf.ethz.ch>
Thu, 9 Aug 2001 20:50:33 +0200 (MET DST)
> SP> - find a way to detect some conservative approximation of the
> SP> set of slots of a concrete class
>
> I think this can be done with better compiler support.
>
The simplest thing would be to detect every 'attr' that appears in 'self.attr'
form in the bodies of the methods of a class statement. Class construction at
run-time would take the union of this set with the information computed
for the base classes. Why at run-time? It spares the compiler the trouble of
going from poor-man's to full-fledged type inference. It would be
reasonably effective and at most would require the programmer to put a
self.myFavoriteAttr = None in __init__.
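A rough sketch of that two-step scheme, written with the present-day `ast` module rather than real compiler support. The helper names (`own_self_attrs`, `compile_time_pass`, `approximate_slots`) and the sample source are made up for illustration, and matching classes by name is a simplification:

```python
import ast

SOURCE = '''
class A:
    def __init__(self, a=2):
        self.a = a
    def double(self):
        self.a *= 2

class B:
    def __init__(self, b=3):
        self.b = b

class C(A, B):
    def __init__(self, a, b):
        A.__init__(self, a)
        B.__init__(self, b)
        self.product = None
'''


def own_self_attrs(class_node):
    """Collect every name used as self.<name> in the methods of one class body."""
    found = set()
    for item in class_node.body:
        if not isinstance(item, ast.FunctionDef) or not item.args.args:
            continue
        self_name = item.args.args[0].arg  # first parameter, conventionally 'self'
        for node in ast.walk(item):
            if (isinstance(node, ast.Attribute)
                    and isinstance(node.value, ast.Name)
                    and node.value.id == self_name):
                found.add(node.attr)
    return found


def compile_time_pass(source):
    """What the compiler could record: class name -> attrs seen in its own body."""
    tree = ast.parse(source)
    return {node.name: own_self_attrs(node)
            for node in ast.walk(tree) if isinstance(node, ast.ClassDef)}


def approximate_slots(cls, own_attrs):
    """What class construction could do at run-time: union over the base classes."""
    slots = set()
    for klass in cls.__mro__:
        slots |= own_attrs.get(klass.__name__, set())
    return slots


own = compile_time_pass(SOURCE)
namespace = {}
exec(SOURCE, namespace)
slots_of_C = approximate_slots(namespace["C"], own)
```

The result is only an approximation: attributes set through setattr() or from outside the class are missed, which is where the self.myFavoriteAttr = None hint comes in.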
> SP> - use customization (consumes memory) so you can exploit a fixed
> SP> layout for the approximated slots
> SP> - use plain dictionary lookup for the slots missed that way
>
> Right
Clearly this should go hand in hand with method dispatch.
>
> SP> Not impossible, not trivial and surely a thing that will pollute
> SP> code clarity, OTOH with this kind of optimizations in place,
> SP> even just at the interp level, effective native compilation is
> SP> not that far ...
>
> It is a lot of work for instances, so I've been considering it
> separately from the module globals issue. But I think it is
> possible. Short of native-code, I wonder if we can extend the Python
> VM with opcodes that take advantage of slots.
It is indeed a good idea, but the problem is that customising without native
compilation can have a very poor compilation-time vs. run-time speedup
ratio. OTOH we could customize only hot spots. But I think it is worth trying
if we ever get to it.
> I'm not sure how
> closely this idea is related to Psyco.
AFAIK it would be complementary. Armin Rigo pointed out that in principle
Psyco can lay out objects at will, but I remarked that its way of dealing
with code consumes a lot of memory, so it is workable only if applied to
hot spots, and then we pay for conversion when non-Psyco code calls Psyco code.
Psyco is doing some sort of customization (a generalisation of it), so a
layout for instances that can be exploited for speed under
customization is a good thing for Psyco too IMHO.
> Imagine we didn't special case ints in BINARY_ADD. If you're doing
> mostly string addition, it's just slowing you down anyway :-).
> Instead, a function that expected to see ints would check the types of
> the relevant args.
>
> def f(x):
>     return 2*x
>
> would become something like:
>
> def f(x):
>     if isinstance(x, int):
>         return 2 * x_as_int
>     else:
>         return 2 * x
>
> where the int branch could use opcodes that knew that x was an integer
> and could call its __add__ method directly.
AFAIK this is more or less what Psyco does. I have not checked the code, but my
understanding is that it goes to great lengths to do this kind of
specialization where it pays off.
Customization would mean the following:
class A:
    def __init__(self, a=2):
        self.a = a
    def double(self):
        self.a *= 2

class B:
    def __init__(self, b=3):
        self.b = b
    def inc(self):
        self.b += 1

class C(A, B):
    def __init__(self, a, b):
        A.__init__(self, a)
        B.__init__(self, b)
    def mul(self):
        return self.a * self.b
Under customization, if instances of both A and C are used and double is
called on both, there will be two versions of double to choose from: one that
expects self to be of concrete class A and one that expects it to be of
concrete class C. Both versions can be constructed knowing the exact
position of the slot a in the objects.
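A toy model of that idea, not a real implementation: the layouts, the `Instance` class, and the `call` dispatcher below are all invented for illustration. Each customized version is built once per concrete class with the slot position already resolved:

```python
# Hypothetical fixed layouts: concrete class name -> slot names in storage order.
LAYOUTS = {"A": ("a",), "B": ("b",), "C": ("a", "b")}


class Instance:
    """Toy instance: a concrete class name plus a flat list of slot values."""

    def __init__(self, concrete, *values):
        self.concrete = concrete
        self.slots = list(values)


def specialize_double(concrete):
    index = LAYOUTS[concrete].index("a")   # resolved once, at customization time

    def double(obj):
        obj.slots[index] *= 2              # no attribute-name lookup at call time
    return double


def specialize_inc(concrete):
    index = LAYOUTS[concrete].index("b")   # 0 for B, but 1 for C

    def inc(obj):
        obj.slots[index] += 1
    return inc


SPECIALIZERS = {"double": specialize_double, "inc": specialize_inc}
VERSIONS = {}   # (concrete class, method name) -> customized function


def call(obj, method):
    """Pick, or lazily build, the version customized for obj's concrete class."""
    key = (obj.concrete, method)
    if key not in VERSIONS:
        VERSIONS[key] = SPECIALIZERS[method](obj.concrete)
    return VERSIONS[key](obj)


a = Instance("A", 2)
c = Instance("C", 2, 3)
call(a, "double")
call(c, "double")
call(c, "inc")
```

After these calls VERSIONS holds separate entries for ("A", "double") and ("C", "double"), which is the memory cost mentioned earlier. The inc case shows why the version has to depend on the concrete class: b sits at a different position in B and in C.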
It clearly works best with good, pure single-dispatch OO code ;).
In any case, both a more static layout for slots and some way to add hooks
that count how many times a code object is invoked would be a good starting
point for a lot of nice experiments.
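At the Python level, such a counting hook can be approximated with a decorator. This is only a sketch: `count_invocations`, `threshold` and `on_hot` are hypothetical names, and a real hook would presumably sit in the interpreter and count per code object rather than per wrapped function:

```python
import functools


def count_invocations(threshold, on_hot):
    """Count calls to a function and report it once it reaches the threshold."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wrapper.calls += 1
            if wrapper.calls == threshold:
                on_hot(func)            # e.g. schedule func for customization
            return func(*args, **kwargs)
        wrapper.calls = 0
        return wrapper
    return decorate


hot_spots = []


@count_invocations(threshold=3, on_hot=lambda f: hot_spots.append(f.__name__))
def double(x):
    return 2 * x


results = [double(n) for n in range(5)]
```

Here on_hot fires exactly once, on the third call, so the expensive customization step would be paid only for code that is actually used a lot.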
regards, Samuele.