On Aug 23, 2006, at 2:22 PM, K.S.Sreeram wrote:
Hi all,
I noticed in Python/ceval.c that LOAD_GLOBAL uses a dictionary lookup, and was wondering if that can be optimized to a simple array lookup.
If I'm right, there are 3 kinds of name lookups: locals, outer scopes (closures), and globals (not counting attribute lookup). Locals are identified either by the presence of assignments or by their presence in the arg list. So all name lookups can be classified into the 3 types at compile/load time.
Since we know, at load time, which names are global, can't we simply build a global name table and replace LOAD_GLOBALs with a lookup at the corresponding index into the global name table?
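As a rough illustration of that three-way classification, the compiled code object already records which names fall into which bucket. The sketch below is only an example: `scale`, `offset`, `outer` and `inner` are hypothetical names, and note that `co_names` can also hold attribute names, which this example deliberately avoids.

```python
# Hypothetical names, chosen only to show one name of each kind.
scale = 10  # module level, so a global from inner()'s point of view


def outer():
    offset = 2  # local to outer(), seen by inner() as a closure variable

    def inner(x):
        total = x * scale + offset  # x and total are locals of inner()
        return total

    return inner


inner = outer()
code = inner.__code__

local_names = code.co_varnames    # arguments plus names assigned in the body
closure_names = code.co_freevars  # names taken from enclosing function scopes
global_names = code.co_names      # names looked up as globals or builtins

print("locals:  ", local_names)
print("closures:", closure_names)
print("globals: ", global_names)
```

The classification is fixed when the function body is compiled; what is not fixed at that point is whether the global name is bound to anything.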
At the time the function's body gets compiled, the global (or builtin) it's trying to access might or might not be there -- as long as it gets added afterwards, before the function's body gets _executed_, no problem (in today's scheme). It's not obvious to me how you could compile a "corresponding index" into the LOAD_GLOBAL opcode, since that index is in general unknown at compile time.
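A minimal sketch of that ordering issue, using an explicit dict as a stand-in for the module's namespace. The names `area` and `PI` are made up for the example, and the value 3 is used only to keep the arithmetic exact.

```python
# Source for a function that refers to a global which does not exist yet.
source = """
def area(r):
    return PI * r * r
"""

namespace = {}  # plays the role of the module's dict
exec(compile(source, "<sketch>", "exec"), namespace)
area = namespace["area"]  # the body is compiled; PI is still unbound

try:
    area(2)
    failed_before_binding = False
except NameError:
    # The lookup happens at execution time, so it fails only now.
    failed_before_binding = True

namespace["PI"] = 3  # the global is added after compilation
result = area(2)     # the same compiled body now finds PI

print(failed_before_binding, result)
```

Any scheme that replaces the name with a fixed index would have to cope with the name having no slot at all when the body is compiled.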
The module's dict object will need to be special so that whenever a name gets 'set', the global name table gets updated.
It seems that you'd need to chase down and modify all of the LOAD_GLOBAL opcodes too, at every such modification (the concept of modifying builtins becomes extremely scary...).

Considering the amortized speed of a dict lookup for an interned string (hash value cached and thus immediately available, equality comparison with another interned string a single machine-level operation), it's not clear to me that the huge complexity (and potential performance impact) of all this could ever possibly be justified.

A change in Python semantics allowing some level of "nailing down" of builtins (and perhaps globals too) *COULD* easily yield large performance benefits, but that's a subject for the Python-3000 mailing list; as long as builtins and globals stay as fluid as today, I'm skeptical about the optimization opportunities here.

Alex
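To illustrate the fluidity of globals and builtins discussed above, here is a small sketch. The function `count` is hypothetical, and the rebinding goes through `count.__globals__` and the `builtins` module only to make it explicit which namespace is being changed. The same compiled function resolves `len` differently as the namespaces change underneath it.

```python
import builtins


def count(seq):
    return len(seq)  # compiled as a global/builtin lookup of the name "len"


module_globals = count.__globals__  # the dict that count() uses for globals

before = count([1, 2, 3])  # resolved through builtins

# Shadow the builtin with a module-level global of the same name.
module_globals["len"] = lambda seq: -1
shadowed = count([1, 2, 3])

# Remove the shadow: the very same lookup falls back to builtins again.
del module_globals["len"]
restored = count([1, 2, 3])

# Rebinding the builtin itself is also allowed today.
original_len = builtins.len
builtins.len = lambda seq: 0
try:
    patched = count([1, 2, 3])
finally:
    builtins.len = original_len  # always put the real len back

print(before, shadowed, restored, patched)
```

A precomputed index for `len` would have to stay correct across each of these rebindings, which is the bookkeeping cost the reply is pointing at.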