[Jeremy Hylton]
Thanks for the good questions and suggestions. Too bad you can't come to dev day. I'll try to post slides before or after the talk -- and update the PEP.
Here are some more wild ideas, probably more thought-provoking than useful, but this is really an area where only the profiler knows the truth <wink>.
SP> And Python with modules, data-objects, class/instances, types
SP> etc is quite a zoo :(.
And, again, this is a problem. The same sorts of techniques apply to all namespaces. It would be good to try to make the approach general, but some namespaces are more dynamic than others. Python's classes, lack of declarations, and separate compilation of modules mean that class/instance namespaces are hard to do right. We need to defer a lot of final decisions to runtime and keep an extra dictionary around just in case.
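A minimal Python-level sketch of that "guessed layout plus extra dictionary" idea, purely for illustration (the class GuessedLayoutInstance and its layout are hypothetical, not how CPython actually stores instances): attributes guessed ahead of time get fixed array slots, anything else falls back to an ordinary dict.

    _UNSET = object()   # marker for "slot allocated but never assigned"

    class GuessedLayoutInstance:
        def __init__(self, guessed_attrs):
            # Fast-path storage: a fixed table for the guessed attributes.
            object.__setattr__(self, "_slot_index",
                               {name: i for i, name in enumerate(guessed_attrs)})
            object.__setattr__(self, "_slots", [_UNSET] * len(guessed_attrs))
            # Slow-path storage: the extra dictionary kept around just in case.
            object.__setattr__(self, "_overflow", {})

        def __setattr__(self, name, value):
            i = self._slot_index.get(name)
            if i is not None:
                self._slots[i] = value          # guessed attribute: array store
            else:
                self._overflow[name] = value    # unguessed attribute: dict store

        def __getattr__(self, name):
            i = self._slot_index.get(name)
            if i is not None and self._slots[i] is not _UNSET:
                return self._slots[i]
            try:
                return self._overflow[name]
            except KeyError:
                raise AttributeError(name) from None

    p = GuessedLayoutInstance(["x", "y"])
    p.x = 1            # lands in the slot array
    p.colour = "red"   # lands in the fallback dict
    print(p.x, p.colour)   # -> 1 red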
* instance namespaces

As I said, but what will eventually happen with class/type unification plays a role.

1. __slots__ are obviously a good thing here :) (see the small example below)

2. old-style instances, and in general instances with a dict: one can try to guess the slots of a class by looking for the "self.attr" pattern at compile time, in a more or less clever way. The set of compile-time guessed attrs would be passed to MAKE_CLASS, which would construct the runtime guess as the union of the super-classes' guesses and the compile-time guess for the class. This information can be used to lay out a dlict. (A sketch of such a compile-time scan follows below.)

* zoo problem

[yes, as I said, this whole inline-cache thing is supposed to trade memory for speed. And the fact that Python's internal objects are so inhomogeneous/polymorphic <wink> does not help to keep that amount small; having only new-style classes, for example, would help.]

Ideally one can assign to each bytecode in a code object whose behavior depends/dispatches on the concrete object "type" a "cache line" (or many; the polymorphic inline caches in modern Smalltalk implementations do that, in the context of the JIT). (As long as the GIL is there, we do not need per-thread versions of the caches.)

The first entries in the "cache line" could contain the PyObject type and then a function pointer, so we would have common logic like:

    if PyObjectType(obj) == cache_line.type:
        cache_line.onType()
    else:
        ...

Then the per-type code could use the rest of the space in the cache line polymorphically, to hold type-specific cached "dispatch" info: e.g. the index of a dict entry for the load_attr/set_attr logic on an instance ... (a toy model follows below).

Abstractly, one can think of the cache line for a bytecode as a streamlined version, in terms of values and/or code pointers, of the path taken the last time that bytecode executed, plus the values needed to check whether the very same path still makes sense.

1. in practice these ideas can perform very poorly;
2. this tries to address things/internals as they are;
3. yup, anything on the object-layout/behavior side that simplifies this picture is probably a step in the right direction.

regards, Samuele.
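Regarding item 1 above, a tiny example of what __slots__ buys: instances get a fixed attribute layout instead of a per-instance dict.

    class WithDict:
        def __init__(self, x, y):
            self.x, self.y = x, y

    class WithSlots:
        __slots__ = ("x", "y")          # fixed attribute layout, no __dict__
        def __init__(self, x, y):
            self.x, self.y = x, y

    print(hasattr(WithDict(1, 2), "__dict__"))    # True  -- dict per instance
    print(hasattr(WithSlots(1, 2), "__dict__"))   # False -- array-like layout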
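Regarding item 2, a rough Python-level approximation of the "guess slots from self.attr at compile time" idea. The real thing would happen inside the compiler; here the ast module stands in for it, and guess_slots plus the union-with-bases step (standing in for the hypothetical MAKE_CLASS) are illustrative names only.

    import ast
    import inspect
    import textwrap

    def guess_slots(cls):
        """Collect names assigned via 'self.attr = ...' anywhere in cls's source."""
        tree = ast.parse(textwrap.dedent(inspect.getsource(cls)))
        guessed = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Assign):
                for target in node.targets:
                    if (isinstance(target, ast.Attribute)
                            and isinstance(target.value, ast.Name)
                            and target.value.id == "self"):
                        guessed.add(target.attr)
        # "Runtime" step: union of the superclasses' guesses and this class's
        # own compile-time guess (what the MAKE_CLASS-style step would compute).
        for base in cls.__bases__:
            guessed |= getattr(base, "_guessed_slots", frozenset())
        cls._guessed_slots = frozenset(guessed)
        return cls

    @guess_slots
    class Point:
        def __init__(self, x, y):
            self.x = x
            self.y = y

    print(sorted(Point._guessed_slots))   # ['x', 'y']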
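And a toy model of the per-bytecode "cache line" for an attribute load: one cached type plus one cached fetch path per call site. This is only a Python-level illustration (CacheLine and cached_load_attr are invented names); the real cache would sit next to the bytecode in the eval loop, in C, and would cache something like a dict-entry index rather than a closure.

    class CacheLine:
        __slots__ = ("tp", "fetch")     # cached type + cached "code pointer"
        def __init__(self):
            self.tp = None
            self.fetch = None

    def cached_load_attr(obj, name, cache):
        if type(obj) is cache.tp:
            return cache.fetch(obj)            # hit: replay last time's path
        value = getattr(obj, name)             # miss: full generic lookup
        if name in getattr(obj, "__dict__", {}):
            # Attribute lives in the instance dict; remember how to grab it.
            cache.fetch = lambda o: o.__dict__[name]
        else:
            # Found on the type (method, property, ...); fall back to getattr.
            cache.fetch = lambda o: getattr(o, name)
        cache.tp = type(obj)
        return value

    class Point:
        def __init__(self, x, y):
            self.x, self.y = x, y

    cache = CacheLine()
    p, q = Point(1, 2), Point(3, 4)
    print(cached_load_attr(p, "x", cache))   # miss: fills the cache line -> 1
    print(cached_load_attr(q, "x", cache))   # hit: same type, fast path  -> 3

A real implementation would also have to invalidate the cache when the instance dict or the class changes behind its back; the toy above simply trusts the type check, which is part of why caveat 1 above warns that these schemes can perform poorly in practice.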