Silly function call lookup stuff?

Tom Anderson twic at urchin.earth.li
Wed Sep 28 18:21:55 CEST 2005


On Tue, 27 Sep 2005, Dan Sommers wrote:

> On Wed, 28 Sep 2005 00:38:23 +0200,
> Lucas Lemmens <llemmens at gmx.net> wrote:
>
>> On Tue, 27 Sep 2005 13:56:53 -0700, Michael Spencer wrote:
>
>>> Lucas Lemmens wrote:
>
>>>> Why isn't the result of the first function-lookup cached so that 
>>>> following function calls don't need to do the function-lookup at all?
>>>
>>> I guess because the function name may be re-bound between loop 
>>> iterations.  Are there good applications of this?  I don't know.
>>
>> Yuk I'd hate that. I think it would be extremely rare.
>
> With duck typing, I think it would be fairly common:
>
>    def process_list_of_things_to_process( list_of_things_to_process ):
>        for x in list_of_things_to_process:
>            x.process( )

That's not what's being talked about here. In this code, x.process would 
be a different method each time through, and wouldn't be cached (this 
isn't C++, you know!).

We're talking about this case:

class Foo:
 	def process(self):
 		return "foo's version of process"

def bar(foo):
 	foo.process = lambda: "bar's version of process"

x = Foo()
print x.process()
bar(x)
print x.process()

Naive local method cacheing would get this wrong. Worldly-wise method 
cacheing that would get this right would be a nightmare to implement.

A better bet might be to speed up method lookup. I should say that i know 
bugger all about how python's method lookup is implemented now, but i 
believe that it's basically a dict lookup in a per-object feature 
dictionary (ie self.__dict__). It might be possible to instead use a sort 
of vtable, with a translucent layer of indirection wrapped round it to 
allow for python's dynamism.

Okay, thinking out loud follows. The following is pseudocode showing how 
the interpreter is implemented; any resemblance to an actual programming 
language, living or dead, is purely coincidental.

The current implementation looks something like this:

def classmembers(cl):
 	<returns an iterator over the members of a class>

def new(cl):
 	"Make a new instance of a class cl. An object is a pair (cl, 
members), where cl is its class and members is a dict of its members."
 	members = {}
 	for name, member in classmembers(cl):
 		members[name] = member
 	obj = (cl, members)
 	return obj

def lookup(obj, name):
 	members = obj[1]
 	member = members[name]
 	return member

def bind(obj, name, member):
 	members = obj[1]
 	members[name] = member

Since the members dict is mutable, there's nothing that can be cached 
here. This is what i'd suggest ...

type mtuple:
 	<a mutable tuple - fixed length, but mutable; basically a C array>

def new(cl):
 	index = {}
 	members = [cl, index]
 	for name, member in classmembers(cl):
 		index[name] = len(members)
 		members.append(member)
 	obj = (cl, index, mtuple(members))
 	return obj

# the index dict is *never* modified by any code anywhere

def lookup(obj, name):
 	index = obj[1]
 	offset = index[name]
 	value = obj[offset]
 	return value

def bind(obj, name, value):
 	index = obj[1]
 	offset = index[name]
 	obj[offset] = value

So far, this wouldn't be any faster; in fact, it would be slightly slower, 
due to the extra layer of indirection.

However, now we can expose a little of the lookup mechanism to the 
execution machinery:

def offset(obj, name):
 	index = obj[1]
 	offset = index[name]
 	return offset

def load(obj, offset):
 	value = obj[offset]
 	return value

And refactor:

def lookup(obj, name):
 	offset = offset(obj, name)
 	value = load(obj, offset)
 	return value

We now have something cachable. Given code like:

while (foo()):
 	x.bar()

The compiler can produce code like:

_bar = offset(x, "bar")
while (foo()):
 	load(x, _bar)()

It has to do the lookup in the dict once, and after that, just has to do a 
simple load. The crucial thing is, if the method gets rebound, it'll be 
rebound into exactly the same slot, and everything keeps working fine.

Note that the code above doesn't handle binding of members to names that 
weren't defined in the class; it thus doesn't support dynamic extension of 
an object's interface, or, er, member variables. However, these are fairly 
simple to add, via an extra members dict (which i create lazily):

def new(cl):
 	index = {}
 	extras = None
 	members = [cl, index, extras]
 	for name, member in classmembers(cl):
 		index[name] = len(members)
 		members.append(member)
 	obj = (cl, index, mtuple(members))
 	return obj

def lookup(obj, name):
 	index = obj[1]
 	try:
 		offset = index[name]
 		value = obj[offset]
 	except KeyError:
 		extras = obj[2]
 		if (extras == None): raise KeyError, name
 		value = extras[name]
 	return value

def bind(obj, name, value):
 	index = obj[1]
 	try:
 		offset = index[name]
 		obj[offset] = value
 	except KeyError:
 		extras = obj[2]
 		if (extras == None):
 			extras = {}
 			obj[2] = extras
 		extras[name] = value

It also doesn't call __init__ on new objects, or do metaclasses properly, 
or support __slots__, or any of that jazz, but that would all be fairly 
easy to add. __slots__ would be a big performance win here.

It would be really nice if the object could somehow be put together after 
__init__ has run; that way, the member variables set there could be 
included in the members which are stored in the mtuple and indexed, which 
is a big performance and compactness win over having them as extras. This 
is probably possible, using objects which can invisibly change their 
implementation, but i'm not going to think about it now!

Also, as a memory optimisation, you'd want to arrange for all instances of 
a class to share the same index dictionary instance. That would be pretty 
trivial. You could also split the members which are likely to be constant 
across all instances - the class reference, index, and the methods, but 
not the extras - into a separate table, and share that too; you would have 
to handle rebinding of members on individual objects with a copy-on-write 
policy applied to this table. This is fairly straightforward to do, but 
i'm not going to do it here; the result would be something very close to 
C++-style vtables, but supporting python's object model.

Did that make any sense?

Speaking of __slots__, can you use slots for methods? Does it speed up 
lookup at all?

tom

-- 
Children are born true scientists. -- R. Buckminster Fuller



More information about the Python-list mailing list