At 03:02 PM 4/21/04 -0400, Jewett, Jim J wrote:
I had been assuming that class (and instance) attribute resolution would be subject to the same speedup.
Nope, sorry, they're entirely unrelated lookups.
If this is really only about globals and builtins, then you can just initialize each module's dictionary with a copy of builtins. (Or cache them in the module __dict__ on the first lookup, since you know where it would have gone.)
Interesting thought. The same process that currently loads the __builtins__ member could instead update the namespace directly.
There's only one problem with this idea, and it's a big one: 'import *' would now include all the builtins, causing one module's builtins (or changes thereto) to propagate to other modules. Yuck.
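To make the 'import *' problem concrete, here is a minimal sketch in present-day Python. The module name `polluted_demo` and its `helper` function are hypothetical, invented for the illustration; the module is built by hand with `types.ModuleType` rather than by any real change to the import machinery. It suggests that once the builtins are copied into a module's namespace, a star-import drags them along unless `__all__` is defined.

```python
import builtins
import sys
import types

# Hypothetical module whose namespace has had the public builtins copied in.
polluted = types.ModuleType("polluted_demo")
polluted.helper = lambda: "the module's own function"
polluted.__dict__.update(
    {name: value for name, value in vars(builtins).items()
     if not name.startswith("_")}
)
sys.modules["polluted_demo"] = polluted

# Without __all__, 'import *' copies every public name, builtins included.
importer_ns = {}
exec("from polluted_demo import *", importer_ns)
leaked_len = "len" in importer_ns

# With __all__, only the listed names are exported.
polluted.__all__ = ["helper"]
careful_ns = {}
exec("from polluted_demo import *", careful_ns)
contained_len = "len" in careful_ns

# Clean up the hypothetical module again.
del sys.modules["polluted_demo"]
```

Here `leaked_len` should come out true and `contained_len` false, which is why an '__all__' definition becomes mandatory under this scheme.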
It's a pity, because I think this would otherwise be an immediately usable speedup. Indeed, if today you were to use something like:
globals().update( dict([(k,v) for k,v in __builtins__.__dict__.items() if not k.startswith('_')]) )
at the top of a module, you'd get a lot of the speedup benefit. But you'd really better have an '__all__' definition, then.
This still won't catch updates to builtins, but it will eliminate the failed lookup and the second dictionary lookup.
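For readers trying this on a current interpreter, the following is a rough equivalent of the one-liner above, not the original code. It imports the `builtins` module explicitly because in CPython the `__builtins__` global may be either a module or a plain dict depending on context. The helper name `copy_builtins_into` and the name `demo_added_later` are made up for the example. It also illustrates the caveat about updates: a name added to builtins after the copy is not seen.

```python
import builtins


def copy_builtins_into(namespace):
    """Copy every public builtin into `namespace` (a snapshot, not a live view)."""
    namespace.update(
        {name: value for name, value in vars(builtins).items()
         if not name.startswith("_")}
    )
    return namespace


# Stand-in for a module's globals(); __all__ keeps 'import *' from leaking the copies.
module_ns = {"__all__": []}
copy_builtins_into(module_ns)

# The copy makes 'len' a direct hit in the module namespace ...
direct_hit = module_ns["len"] is len

# ... but a builtin added later (hypothetical name) is missing from the snapshot.
builtins.demo_added_later = object()
try:
    snapshot_is_stale = "demo_added_later" not in module_ns
finally:
    del builtins.demo_added_later
```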
Actually, I think the language could easily live with the concept that a module's builtins are determined at module import time, given that I'm pushing for them to be determined at *compile* time.
Instead, names that are determined to be builtin are not allowed to be bound via __setattr__, and are never looked up in the globals dictionary.
Some of the bugs that got the global tracking backed out involved changing __builtins__. If you only add to it, then I suppose the current method (which allows shadowing) is a reasonable fallback. It doesn't work so well if you want to remove names from builtin.
The idea is simply to declare that any name used in a module that's known to be a builtin may be optimized to the meaning of that builtin. This isn't just for CPython's benefit: Pyrex, for example, would greatly benefit from knowing whether it's safe to consider e.g. 'len()' a builtin.
In effect, '__builtins__' should be considered an implementation detail, not part of the language, and screwing with it is off limits. In practice, CPython 2.x versions will need to provide backward compatibility because there is code out there that "adds new builtins".
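As a rough model of what is at stake, the sketch below imitates the two-step lookup with plain dictionaries. The function `load_global` and the names `module_a`, `module_b` and `debug` are illustrative assumptions, not interpreter internals. It shows why code that "adds new builtins" works today (the fallback consults the builtin namespace at run time) and why shadowing in a module's globals wins, which are exactly the behaviours a compile-time decision would freeze.

```python
def load_global(name, module_globals, builtin_namespace):
    """Model of a global lookup: module globals first, then builtins."""
    if name in module_globals:
        return module_globals[name]
    if name in builtin_namespace:
        return builtin_namespace[name]
    raise NameError(f"name {name!r} is not defined")


# Two stand-in module namespaces sharing one stand-in builtin namespace.
builtin_namespace = {"len": len}
module_a = {}
module_b = {}

# Code that "adds new builtins": the new name becomes visible from every module.
builtin_namespace["debug"] = print
debug_seen_from_b = load_global("debug", module_b, builtin_namespace) is print

# Shadowing: a module-level 'len' hides the builtin for that module only.
module_a["len"] = lambda obj: 0
shadowed_result = load_global("len", module_a, builtin_namespace)([1, 2])
builtin_result = load_global("len", module_b, builtin_namespace)([1, 2])
```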
Simplicity. Functions today do only three kinds of lookups: LOAD_CELL(?), LOAD_FAST and LOAD_GLOBAL. LOAD_CELL is an indirect load from a known, specific nested scope. LOAD_FAST loads from an array offset into the current frame object. LOAD_GLOBAL checks globals and then builtins.
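The three kinds of lookup can be observed with the standard `dis` module. This is only a sketch: in CPython the nested-scope load is actually spelled LOAD_DEREF, and exact opcode names vary between versions (newer releases have several LOAD_FAST variants), so the code only collects whatever LOAD_* names appear. The functions `outer`, `inner` and `load_opnames` are made up for the example.

```python
import dis


def outer():
    cell = 1  # captured by inner(), so it lives in a cell

    def inner(local):
        # local -> fast local, cell -> nested scope, len -> global/builtin
        return len([local, cell])

    return inner


def load_opnames(func):
    """Return the distinct LOAD_* opcode names used by `func`."""
    return sorted(
        {ins.opname for ins in dis.get_instructions(func)
         if ins.opname.startswith("LOAD_")}
    )


inner_ops = load_opnames(outer())
```

Printing `inner_ops`, or calling `dis.dis(outer())`, shows which opcode each name in `inner` compiles to on the interpreter at hand.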
It could be converted to LOAD_CELL (or perhaps even LOAD_FAST) if the compiler were allowed to assume no changes in shadowing. (Including an assumption that the same dictionaries will continue to represent the globals and builtin namespaces for this code object.)
Not without also changing the dictionary type as per PEP 267 or PEP 280. LOAD_CELL and LOAD_FAST don't use dictionaries today.