On Thu, 2006-03-09 at 12:00 +0000, Paul Moore wrote:
On 3/9/06, Nick Coghlan email@example.com wrote:
Steven Elliott wrote:
I'm interested in how builtins could be more efficient. I've read over some of the PEPs having to do with making global variables more efficient (search for "global"): http://www.python.org/doc/essays/pepparade.html But I think the problem can be simplified by focusing strictly on builtins.
Unfortunately, builtins can currently be shadowed in the module global namespace from outside the module (via constructs like "import mod; mod.str = my_str"). Unless/until that becomes illegal, focusing solely on builtins doesn't help - the difficulties lie in optimising builtin access while preserving the existing name shadowing semantics.
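The shadowing described above can be reproduced in a few lines. This is a minimal sketch: the in-memory module `mod`, its function `describe`, and the replacement `my_str` are hypothetical names chosen for illustration and stand in for a real imported module.

```python
import types

# Hypothetical stand-in for an imported module named "mod".
mod = types.ModuleType("mod")
exec("def describe(x):\n    return str(x)\n", mod.__dict__)


def my_str(x):
    # Hypothetical replacement for the builtin str.
    return "shadowed:" + repr(x)


before = mod.describe(42)    # "str" resolves to the real builtin

mod.str = my_str             # shadow the builtin from outside the module
after = mod.describe(42)     # the same function now finds mod's global "str"

del mod.str                  # remove the shadow again
restored = mod.describe(42)  # lookup falls through to the builtin once more

print(before, after, restored)
```

The code inside `mod` never changes, yet the meaning of `str` within it does. Any optimisation of builtin access has to preserve exactly this behaviour.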
Is there any practical way of detecting and flagging constructs like the above (remotely shadowing a builtin in another module)? I can't see a way of doing it (but I know very little about this area...).
It may be possible to flag it, or it may be possible to make it work.
In my post I mentioned one special case that needs to be addressed (assigning to __builtins__). What Nick mentioned in his post ("import mod; mod.str = my_str") is another special case that needs to be addressed. If we can assume that all pyc files are compiled with the same set of default builtins (which should be assured by the version in the pyc file) then there are two ways that things like "mod.str = my_str" could be handled.
I believe that currently "mod.str = my_str" alters the module's global hash table (f->f_globals in the code). One way of handling it is to alter STORE_ATTR (the opcode for assigning to mod.str) to always check whether the key being assigned is one of the default builtins. If it is, then the module's indexed array of builtins is assigned to instead.
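The check described above can be modelled in pure Python. This is only a toy sketch, not how CPython works: `SketchModule`, `DEFAULT_BUILTINS`, `BUILTIN_INDEX`, and `load_builtin` are hypothetical names, and the ordering of the builtins array (sorted by name) is an assumption standing in for whatever fixed ordering the compiler would use.

```python
import builtins

# Assumption: every module agrees on one fixed ordering of the default builtins.
DEFAULT_BUILTINS = sorted(dir(builtins))
BUILTIN_INDEX = {name: i for i, name in enumerate(DEFAULT_BUILTINS)}


class SketchModule:
    """Toy module object with a per-module indexed array of builtins."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the internal storage.
        object.__setattr__(self, "_globals", {})
        object.__setattr__(
            self,
            "_builtin_slots",
            [getattr(builtins, name) for name in DEFAULT_BUILTINS],
        )

    def __setattr__(self, name, value):
        # Models the modified STORE_ATTR: check for a default builtin first.
        index = BUILTIN_INDEX.get(name)
        if index is not None:
            self._builtin_slots[index] = value   # assign into the indexed array
        else:
            self._globals[name] = value          # ordinary global, hash table

    def load_builtin(self, index):
        # Models an indexed builtin load: no hash lookup needed.
        return self._builtin_slots[index]


def my_str(x):
    return "shadowed:" + repr(x)


mod = SketchModule()
other = SketchModule()
mod.str = my_str      # lands in mod's builtins array, not in its globals
mod.answer = 42       # not a builtin name, so it goes to the hash table

print(mod.load_builtin(BUILTIN_INDEX["str"])(1))    # uses my_str
print(other.load_builtin(BUILTIN_INDEX["str"])(1))  # still the real str
```

Because each module carries its own array, shadowing `str` in one module leaves every other module's view of the builtin untouched, which matches the existing semantics.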
Alternatively if we also wanted to optimize "mod.str = my_str" then there could be a new opcode like STORE_ATTR that would take an index into the array of builtins instead of an index into the names.
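For comparison, the existing opcode can be inspected with the standard `dis` module. The sketch below only compiles the statement (it never runs it, so `mod` and `my_str` need not exist) and shows that STORE_ATTR identifies the attribute by name through the code object's `co_names` tuple. The proposed variant would carry an index into the builtins array instead.

```python
import dis

# Compile only; the statement is never executed.
code = compile("mod.str = my_str", "<sketch>", "exec")

# Find the instruction that performs the attribute assignment.
store = next(
    ins for ins in dis.get_instructions(code) if ins.opname == "STORE_ATTR"
)

attr_name = store.argval   # the attribute name the operand resolves to
names = code.co_names      # the names table the operand refers into

print(store.opname, attr_name)
print(names)
```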
PEP 280, which Nick mentioned, talks about "cells", a hybrid data structure that can do both hash table lookups and lookups by index efficiently. That's great, but I'm curious whether additional gains can be made by focusing just on builtins.