Can LOAD_GLOBAL be optimized to a simple array lookup?
Hi all, I noticed in Python/ceval.c that LOAD_GLOBAL uses a dictionary lookup, and was wondering if that can be optimized to a simple array lookup. If i'm right there are 3 kinds of name lookups: locals, outer scopes(closures), and globals. (not counting attribute lookup). Locals are identified by, either the presence of assignments, or their presence in the arg list. So all name lookups can be classified into the 3 types at compile/load time. Since we know, at load time, which names are global.. Can't we simply build a global name table and replace LOAD_GLOBALs with a lookup at the corresponding index into the global name table? The module's dict object will need to be special so that whenever a name gets 'set', the global name table should get updated. Is this optimization possible or have i missed something? cheers [sreeram;]
On 8/23/06, K.S.Sreeram
Hi all,
I noticed in Python/ceval.c that LOAD_GLOBAL uses a dictionary lookup, and was wondering if that can be optimized to a simple array lookup.
No, not as the language stands now. If i'm right there are 3 kinds of name lookups: locals, outer
scopes(closures), and globals. (not counting attribute lookup). Locals are identified by, either the presence of assignments, or their presence in the arg list. So all name lookups can be classified into the 3 types at compile/load time.
Since we know, at load time, which names are global.. Can't we simply build a global name table and replace LOAD_GLOBALs with a lookup at the corresponding index into the global name table?
But we don't know statically what the globals will be. You can import a module and put something in its global namespace externally. That is done after load time or compile time. -Brett
2006/8/24, Brett Cannon
On 8/23/06, K.S.Sreeram
wrote: Hi all,
I noticed in Python/ceval.c that LOAD_GLOBAL uses a dictionary lookup, and was wondering if that can be optimized to a simple array lookup.
No, not as the language stands now.
If i'm right there are 3 kinds of name lookups: locals, outer scopes(closures), and globals. (not counting attribute lookup). Locals are identified by, either the presence of assignments, or their presence in the arg list. So all name lookups can be classified into the 3 types at compile/load time.
Since we know, at load time, which names are global.. Can't we simply build a global name table and replace LOAD_GLOBALs with a lookup at the corresponding index into the global name table?
But we don't know statically what the globals will be. You can import a module and put something in its global namespace externally. That is done after load time or compile time.
I think that it can be implemented for the language as it stands now. I don't know whether it will be a good thing or not. In principle, you can add a feature to dict implementation that will allow it to notify when the value of a specific key changes. If you have that, you can change LOAD_GLOBAL implementation to: 1. look for the global. 2. ask for notification when the global dict changes in a way that will change the meaning of the global. 3. change the LOAD_GLOBAL opcode to something like LOAD_CONST, and set the notification from the dict to update the LOAD_CONST opcode to the new object. In that way, LOAD_GLOBAL will cause a dict lookup only once. Changing the value of globals will require more work, though. Again, I'm not saying that it's desired, just that it's possible. Have a good day, Noam
On Aug 23, 2006, at 2:22 PM, K.S.Sreeram wrote:
Hi all,
I noticed in Python/ceval.c that LOAD_GLOBAL uses a dictionary lookup, and was wondering if that can be optimized to a simple array lookup.
If i'm right there are 3 kinds of name lookups: locals, outer scopes(closures), and globals. (not counting attribute lookup). Locals are identified by, either the presence of assignments, or their presence in the arg list. So all name lookups can be classified into the 3 types at compile/load time.
Since we know, at load time, which names are global.. Can't we simply build a global name table and replace LOAD_GLOBALs with a lookup at the corresponding index into the global name table?
At the time the function's body gets compiled, the global (or builtin) it's trying to access might or might not be there -- as long as it gets added afterwards, before the function's body gets _executed_, no problem (in today's scheme). It's not obvious to me how you could compile a ``corresponding index'' into the LOAD_GLOBAL opcode, since that index is in general unknown at compile time.
The module's dict object will need to be special so that whenever a name gets 'set', the global name table should get updated.
It seems that you'd need to chase down and modify all of the LOAD_GLOBAL opcodes too, at every such modification. (the concept of modifying builtins becomes extremely scary...). Considering the amortized speed of a dict lookup for an interned string (hash value cached and thus immediately available, equality comparison with other interned string a single machine-level operation), it's not clear to me that the huge complexity (and potential performance impact) of all this could ever possibly be justified. A change in Python semantics allowing some level of "nailing down" of builtins (and perhaps globals too) *COULD* easily yield large performance benefits, but that's a subject for the Python-3000 mailing list; as long as builtins and globals stay as fluid as today, I'm skeptical on the optimization opportunities here. Alex
Alex Martelli wrote:
On Aug 23, 2006, at 2:22 PM, K.S.Sreeram wrote:
Hi all,
I noticed in Python/ceval.c that LOAD_GLOBAL uses a dictionary lookup, and was wondering if that can be optimized to a simple array lookup.
If i'm right there are 3 kinds of name lookups: locals, outer scopes(closures), and globals. (not counting attribute lookup). Locals are identified by, either the presence of assignments, or their presence in the arg list. So all name lookups can be classified into the 3 types at compile/load time.
Since we know, at load time, which names are global.. Can't we simply build a global name table and replace LOAD_GLOBALs with a lookup at the corresponding index into the global name table?
At the time the function's body gets compiled, the global (or builtin) it's trying to access might or might not be there -- as long as it gets added afterwards, before the function's body gets _executed_, no problem (in today's scheme). It's not obvious to me how you could compile a ``corresponding index'' into the LOAD_GLOBAL opcode, since that index is in general unknown at compile time.
How about this approach: (disclaimer: this is just a rough sketch!) At compile time: --------------- - When compiling the module, a list of global names is built. This list is populated in the order in which the names appear in the module. References to the global will be replaced by references to the index in this list. - Instead of generating LOAD_GLOBAL and STORE_GLOBAL opcodes, we generate LOAD_GLOBAL_BY_INDEX and STORE_GLOBAL_BY_INDEX, which are new opcodes. The indexes used with the opcode will be the index of the name from the list. - This list of names should be stored in the pyc file so that the ordering is remembered. At load time: ------------- - At module load time, we build an array of PyObject*, which is used to store the values for the globals. The values will be NULL, until the actual binding is created. - For each name in the global_name_list in the pyc file, we pass the pointer to the corresponding entry in the global table, to the dict. PyDict_SetLocationForName( module_dict, name, PyObject **ptr_to_entry_table ) This is done so that 'get' and 'set' of these names in the dict will update/fetch from the global table directly. This approach should result in faster access to globals defined within the module. It will still work if we try to shadow any builtins. As far as i see, this doesn't seem to change any existing semantics either. This does not result in faster access to existing builtins. But a similar scheme can be worked out for that too.
The module's dict object will need to be special so that whenever a name gets 'set', the global name table should get updated.
It seems that you'd need to chase down and modify all of the LOAD_GLOBAL opcodes too, at every such modification. (the concept of modifying builtins becomes extremely scary...). Considering the amortized speed of a dict lookup for an interned string (hash value cached and thus immediately available, equality comparison with other interned string a single machine-level operation), it's not clear to me that the huge complexity (and potential performance impact) of all this could ever possibly be justified.
A change in Python semantics allowing some level of "nailing down" of builtins (and perhaps globals too) *COULD* easily yield large performance benefits, but that's a subject for the Python-3000 mailing list; as long as builtins and globals stay as fluid as today, I'm skeptical on the optimization opportunities here.
[sreeram;]
Zitat von "K.S.Sreeram"
How about this approach: (disclaimer: this is just a rough sketch!)
This is actually the problem. There are many fine details which can affect performance and correctness. So about there are only two sensible ideas to treat such ideas: either implement them, or ignore them. If you think your approach could work, please try to implement it. It's not that anybody is objecting the goal; there is just debate about the feasibility of the implementation. So if you can come up with an implementation, we are in a much better position to praise or criticise the approach: it then becomes possible to study whether the implementation is really compatible with the current Python, and whether it really does improve performance. Regards, Martin
Note that there are already three PEPs related to speeding dict-based namespace access; start with: http://www.python.org/dev/peps/pep-0280/ which references the other two. The "normal path" through dict lookups got faster since there was a rash of those, to the extent that more complication elsewhere got much less attractive. It's possible that dict lookups got slower again since /then/, though.
Tim Peters wrote:
Note that there are already three PEPs related to speeding dict-based namespace access; start with:
http://www.python.org/dev/peps/pep-0280/
which references the other two.
The "normal path" through dict lookups got faster since there was a rash of those, to the extent that more complication elsewhere got much less attractive. It's possible that dict lookups got slower again since /then/, though.
Thanks! Looks like i'm only proposing what is already there in pep-267 by Jeremy Hylton. http://www.python.org/dev/peps/pep-0267/ cheers [sreeram;]
participants (6)
-
Alex Martelli
-
Brett Cannon
-
K.S.Sreeram
-
martin@v.loewis.de
-
Noam Raphael
-
Tim Peters