At 11:56 AM 4/22/04 -0400, Jeremy Hylton wrote:
>On Wed, 2004-04-21 at 10:50, Phillip J. Eby wrote:
>> I could be wrong, but it seems to me that globals shouldn't be nearly
>> as bad for performance as builtins.  A global only does one dict
>> lookup, while builtins do two.  Also, builtins can potentially be
>> optimized away altogether (e.g. 'while True:') or converted to fast
>> LOAD_CONST, or perhaps even a new CALL_BUILTIN opcode, assuming that
>> adding the opcode doesn't blow the cacheability of the eval loop.
>
>The coarse measurements I made a couple of years ago suggest that
>LOAD_GLOBAL is still substantially slower than LOAD_FAST.  Less than
>100 cycles for LOAD_FAST and about 400 cycles for LOAD_GLOBAL.
>
>http://zope.org/Members/jeremy/CurrentAndFutureProjects/PerformanceMeasureme...
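(For anyone following along: the compiler emits the same LOAD_GLOBAL opcode whether the name ultimately resolves in the module's globals or in builtins, which is where the "one or two dictionary lookups" comes from. A quick sketch with the dis module shows this; exact instruction sets vary between CPython versions:)

```python
import dis

def f(seq):
    # 'len' is neither a local nor assigned in this module's globals,
    # so at runtime LOAD_GLOBAL misses the module dict and falls
    # through to the builtins dict -- two lookups.
    return len(seq)

# Collect the opcode names actually emitted for f.
ops = [ins.opname for ins in dis.get_instructions(f)]
print(ops)
```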
I notice the page says 400 cycles "on average" for LOAD_GLOBAL doing "one or two dictionary lookups", so I'm curious how many of those were for builtins, which in the current scheme always take two lookups. If it was half globals and half builtins, and the dictionary lookups account for half the time, then having opcodes that know whether to look in globals or in builtins would drop the time to 266 cycles, which isn't spectacular but is still good at only about 3.5 times the bytecode fetch overhead. If builtins are used more frequently than globals, the picture improves still further.

Still, it's very interesting to see that loading a global takes almost as much time as calling a function! That's pretty surprising to me. I guess that's why doing e.g. '_len=len' for code with a tight loop makes such a big difference to performance. I tend to do that with attribute lookups before a tight loop, e.g. 'bar = foo.bar', but I didn't realize that global and builtin lookups were almost as slow.
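(To make the '_len=len' trick concrete, here's a minimal sketch; the function names and test data are mine, not from the thread. Binding the builtin as a default-argument value makes it a local inside the function, so the loop body uses LOAD_FAST instead of LOAD_GLOBAL:)

```python
def count_nonempty(items):
    n = 0
    for item in items:
        if len(item) > 0:   # 'len' re-looked-up as a builtin each iteration
            n += 1
    return n

def count_nonempty_local(items, _len=len):
    # '_len' is bound once at function-definition time; inside the
    # loop it's a plain local, fetched with the cheaper LOAD_FAST.
    n = 0
    for item in items:
        if _len(item) > 0:
            n += 1
    return n

data = [""] * 500 + ["x"] * 500
assert count_nonempty(data) == count_nonempty_local(data) == 500
```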