Re: [Python-Dev] optimizing non-local object access
SP> - find a way to detect some conservative approximation of the
SP>   set of slots of a concrete class
I think this can be done with better compiler support.
The simplest thing would be to detect every 'attr' that appears in 'self.attr' form in the bodies of the methods of a class statement. Class construction at run-time would then take the union of this set with the information computed for the base classes.

Why at run-time? It spares the compiler the trouble of going from poor-man's to full-fledged type inference. It would be reasonably effective, and at most it would require the programmer to put a self.myFavoriteAttr = None in __init__.
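A rough sketch of that split, assuming the compiler side can be imitated with the standard ast module. The helper names (self_attrs_in_class, compile_time_slots, approximate_slots) are made up for illustration, and the result is only a conservative guess: method references such as self.double would be collected too.

```python
import ast

def self_attrs_in_class(class_node):
    """Collect every name used as <first-arg>.<name> in the methods of one class body."""
    found = set()
    for item in class_node.body:
        if not isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not item.args.args:
            continue  # no positional parameter, so no 'self' to look at
        self_name = item.args.args[0].arg  # usually 'self'
        for node in ast.walk(item):
            if (isinstance(node, ast.Attribute)
                    and isinstance(node.value, ast.Name)
                    and node.value.id == self_name):
                found.add(node.attr)
    return found

def compile_time_slots(source):
    """'Compiler' step: per-class attribute sets, with no knowledge of base classes."""
    tree = ast.parse(source)
    return {node.name: self_attrs_in_class(node)
            for node in ast.walk(tree)
            if isinstance(node, ast.ClassDef)}

def approximate_slots(cls, per_class):
    """'Class construction' step: union the per-class sets along the MRO."""
    slots = set()
    for klass in cls.__mro__:
        slots |= per_class.get(klass.__name__, set())
    return slots

SOURCE = '''
class A:
    def __init__(self, a=2):
        self.a = a
    def double(self):
        self.a *= 2

class B:
    def __init__(self, b=3):
        self.b = b

class C(A, B):
    def set_c(self, c):
        self.c = c
'''

table = compile_time_slots(SOURCE)
namespace = {}
exec(SOURCE, namespace)
slots_of_c = approximate_slots(namespace["C"], table)
```

Keying the table by class name is a shortcut for the sketch; a real implementation would presumably store the set on the class object itself.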
SP> - use customization (consumes memory) so you can exploit a fixed
SP>   layout for the approximated slots
SP> - use plain dictionary lookup for the slots missed that way
Right. Clearly this should be played hand in hand with method dispatch.
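To make the two quoted points concrete, here is a minimal sketch in pure Python, assuming a per-class layout object that maps the predicted slot names to fixed positions. Layout and Instance are hypothetical names, not anything in the interpreter; a real version would live at the C level.

```python
_MISSING = object()  # marks a predicted slot that has not been assigned yet

class Layout:
    """Fixed positions for the slots predicted for one concrete class."""
    def __init__(self, names):
        self.index = {name: pos for pos, name in enumerate(sorted(names))}

class Instance:
    """Instance storage: a fixed array for predicted slots, a dict for the rest."""
    def __init__(self, layout):
        self.layout = layout
        self.fixed = [_MISSING] * len(layout.index)
        self.extra = {}  # plain dictionary lookup for slots the approximation missed

    def set(self, name, value):
        pos = self.layout.index.get(name)
        if pos is None:
            self.extra[name] = value
        else:
            self.fixed[pos] = value

    def get(self, name):
        pos = self.layout.index.get(name)
        if pos is None:
            try:
                return self.extra[name]
            except KeyError:
                raise AttributeError(name) from None
        value = self.fixed[pos]
        if value is _MISSING:
            raise AttributeError(name)
        return value

layout_c = Layout({"a", "b"})
obj = Instance(layout_c)
obj.set("a", 2)
obj.set("b", 3)
obj.set("tag", "x")        # not predicted, so it lands in the dictionary

# Customized code would have this position baked in and skip the name lookup.
A_POS = layout_c.index["a"]
fast_a = obj.fixed[A_POS]
```

The generic get/set still pay for a name lookup; the gain only shows up in code customized for one layout, which can index obj.fixed directly.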
SP> Not impossible, not trivial and surely a thing that will pollute
SP> code clarity, OTOH with this kind of optimizations in place,
SP> even just at the interp level, effective native compilation is
SP> not that far ...
It is a lot of work for instances, so I've been considering it separately from the module globals issue. But I think it is possible. Short of native-code, I wonder if we can extend the Python VM with opcodes that take advantage of slots.
It is indeed a good idea, but the problem is that customizing without native compilation can have a very poor ratio of compilation time to run-time speedup. OTOH we could customize only hot spots. But I think it is worth trying if we ever get to it.
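One way to picture "customize only hot spots" is a call counter that swaps in a specialized version once a threshold is crossed. This is only a sketch at the Python level: customize_when_hot and specialize_for_ints are invented names, and the "specialized" function is a stand-in for whatever customized code would really be generated.

```python
import functools

def customize_when_hot(threshold, specialize):
    """Run the generic function until it has been called more than `threshold` times."""
    def decorate(func):
        state = {"calls": 0, "impl": func, "customized": False}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not state["customized"]:
                state["calls"] += 1
                if state["calls"] > threshold:
                    # Pay the customization cost once, only for code that proved hot.
                    state["impl"] = specialize(func)
                    state["customized"] = True
            return state["impl"](*args, **kwargs)

        wrapper.state = state  # exposed so the counters can be inspected
        return wrapper
    return decorate

def specialize_for_ints(func):
    """Stand-in for a real specializer; returns a hand-written fast path."""
    def fast(x):
        return x + x
    return fast

@customize_when_hot(threshold=3, specialize=specialize_for_ints)
def double(x):
    return 2 * x

results = [double(n) for n in range(5)]
```

The first three calls run the generic body; the fourth triggers the swap, after which counting stops.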
I'm not sure how closely this idea is related to Psyco.
AFAIK it would be complementary. Armin Rigo pointed out that in principle Psyco can lay out objects at will, but I remarked that its way of dealing with code consumes a lot of memory, so it is workable only if applied to hot spots, and then we pay for conversion when non-Psyco code calls Psyco code. Psyco is doing some sort of customization (a generalisation of it); a layout for instances that can be exploited for speed under customization is a good thing for Psyco too, IMHO.
Imagine we didn't special case ints in BINARY_ADD. If you're doing mostly string addition, it's just slowing you down anyway :-). Instead, a function that expected to see ints would check the types of the relevant args.
def f(x): return 2*x
would become something like:
    def f(x):
        if isinstance(x, int):
            return 2 * x_as_int
        else:
            return 2 * x
where the int branch could use opcodes that knew that x was an integer and could call its __add__ method directly.

AFAIK this is more or less what Psyco does. I have not checked the code, but my understanding is that it goes a long way to do this kind of thing, specifically where it pays off.
Customization would mean the following:

    class A:
        def __init__(self, a=2):
            self.a = a

        def double(self):
            self.a *= 2

    class B:
        def __init__(self, b=3):
            self.b = b

        def inc(self):
            self.b += 1

    class C(A, B):
        def __init__(self, a, b):
            A.__init__(self, a)
            B.__init__(self, b)

        def mul(self):
            return self.a * self.b

Under customization, if instances of both A and C are used and double is called on both, there will be two versions of double to choose from: one that expects self to be of concrete class A and one that expects it to be of concrete class C. Both versions can be constructed knowing the exact place of the slot a in the objects. It clearly works best with good, pure single-dispatch OO code ;).

In any case, both a more static layout for slots and some way to add hooks that count how many times a code object is invoked would be a good starting point for a lot of nice experiments.

regards, Samuele.
Samuele Pedroni