[Python-Dev] Python Specializing Compiler

Armin Rigo arigo@ulb.ac.be
Mon, 25 Jun 2001 15:45:20 +0200


At 14:59 22.06.2001 +0200, Samuele Pedroni wrote:
>*: some possible useful hooks would be:
>- minimal profiling support in order to specialize only things called often
>- feedback for dynamic changing of methods, class hierarchy, ... if we want
>to optimize method lookup (which would make sense)
>- a mixed fixed slots/dict layout for instances.

There is one point that you didn't mention, which I believe is important: 
how to handle global/builtin variables. First, a few words about the 
current Python semantics.

* I am sorry if what follows has already been discussed; I am raising the 
question again because it might be important for Psyco. If you feel this 
would be better as a PEP, please just tell me so. *

Complete lexical scoping was recently added, implemented with "free" and 
"cell" variables. These are only used for functions defined inside of other 
functions; top-level functions use the opcode LOAD_GLOBAL for all non-local 
variables. LOAD_GLOBAL performs one or two dictionary look-ups (two if the 
variable is a built-in). For simple built-ins like "len" this might be 
expensive (has anyone measured such costs?).
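
To make the look-up cost concrete, here is a minimal sketch of the two-step 
search described above. It is written for present-day Python 3 and only 
emulates the behaviour in pure Python; the helper name "load_global" and the 
look-up counter are assumptions made for illustration, not interpreter 
internals.

```python
import builtins


def load_global(name, module_globals, builtin_namespace=vars(builtins)):
    """Emulate the LOAD_GLOBAL search; return (value, number_of_lookups)."""
    # First look-up: the module's own global dictionary.
    try:
        return module_globals[name], 1
    except KeyError:
        pass
    # Second look-up: the built-in namespace, reached only on a miss.
    try:
        return builtin_namespace[name], 2
    except KeyError:
        raise NameError(f"name {name!r} is not defined") from None


module_globals = {"answer": 42}
value, lookups = load_global("answer", module_globals)
builtin_value, builtin_lookups = load_global("len", module_globals)
```

A module-level name is found after one look-up, while a built-in such as 
"len" always pays for the failed look-up in the module dictionary first.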

I suggest generalizing the compile-time lexical scoping rules. Let's 
compile all functions' non-local variables (top-level and others) as "free" 
variables. This means the corresponding module's global variables must be 
"cell" variables. This is just what we would get if the module's code was 
one big function enclosing the definition of all the other functions. Next, 
the variables not defined in the module (the built-ins) are "free" 
variables of the module, and the built-in module provides "cell" variables 
for them. Remember that "free" and "cell" variables are linked together 
when the function (or module in this case) is defined (for functions, when 
"def" is executed; for modules, it would be at load-time).
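
For readers unfamiliar with the mechanism, the following sketch shows how 
"cell" and "free" variables are already linked for nested functions. It 
assumes present-day Python 3 (the "nonlocal" statement and the 
"__closure__" attribute did not exist in this form when lexical scoping 
was first added), and the function names are purely illustrative.

```python
def make_counter():
    count = 0  # "cell" variable of make_counter

    def increment():
        nonlocal count  # "free" variable of increment
        count += 1
        return count

    # The link between the free variable and the cell is made here,
    # when "def increment" has been executed.
    return increment


increment = make_counter()
cellvars = make_counter.__code__.co_cellvars
freevars = increment.__code__.co_freevars
cell = increment.__closure__[0]  # the shared cell object
first = increment()
cell_value = cell.cell_contents
```

Reading or writing "count" in "increment" goes straight through the cell 
object, with no dictionary involved. The proposal would give module globals 
and built-ins the same treatment.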

Benefit: not a single dictionary look-up any more; uniformity of treatment.

Potential code break: global variables shadowing built-ins would behave 
like local variables shadowing globals, i.e. the mere presence of a global 
"xyz=..." would forever hide the "xyz" built-in from the module, even 
before the assignment or after a "del xyz". (cf. UnboundLocalError.)
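
The existing local/global behaviour that this would mirror can be shown 
with a short sketch (function names are illustrative only). Because "xyz" 
is assigned somewhere in each function, the compiler treats it as local 
throughout, so the global "xyz" is hidden both before the assignment and 
after the "del".

```python
xyz = "global value"


def read_before_assignment():
    try:
        value = xyz  # local "xyz" is not bound yet
    except UnboundLocalError:
        value = "unbound"
    xyz = "local value"  # this assignment makes "xyz" local everywhere
    return value


def read_after_del():
    xyz = "local value"
    del xyz
    try:
        return xyz  # the global "xyz" is still hidden
    except UnboundLocalError:
        return "unbound"


before_result = read_before_assignment()
after_result = read_after_del()
```

Under the proposal, a module-level "xyz" would hide the built-in "xyz" in 
the same way.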

To think about: what the "global" keyword would mean in this context.

Implementation problems: if we want to keep the module's dictionary of 
global variables (and we certainly do) it would require changes to the 
dictionary implementation (or the creation of a different kind of 
dictionary). One solution is to automatically dereference cell objects and 
raise exceptions upon reading empty cells. Another solution is to turn 
dictionaries into collections of objects that all behave like cell objects 
(so that if "d" is any dictionary, something like "d.ref(key)" would let us 
get a cell object which could be read or written later to actually get or 
set the value associated with "key", and "d[key]" would mean 
"d.ref(key).cell_ref"). Well, these are just proposals; they might not be 
good solutions.
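
A rough pure-Python sketch of the second idea follows. Only "ref" and 
"cell_ref" come from the text above; the class names "Cell", "CellDict" and 
"EmptyCellError" are hypothetical, and a real version would presumably live 
inside the C dictionary implementation rather than wrap it.

```python
class EmptyCellError(LookupError):
    """Raised when an empty cell is read (hypothetical name)."""


_EMPTY = object()  # sentinel marking a cell that holds no value


class Cell:
    def __init__(self):
        self._value = _EMPTY

    @property
    def cell_ref(self):
        if self._value is _EMPTY:
            raise EmptyCellError("cell is empty")
        return self._value

    @cell_ref.setter
    def cell_ref(self, value):
        self._value = value

    @cell_ref.deleter
    def cell_ref(self):
        self._value = _EMPTY


class CellDict:
    """Mapping whose values live in cells that outlive assignments."""

    def __init__(self):
        self._cells = {}

    def ref(self, key):
        # Create an empty cell on demand so code can bind to it early.
        return self._cells.setdefault(key, Cell())

    def __getitem__(self, key):
        try:
            return self.ref(key).cell_ref
        except EmptyCellError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        self.ref(key).cell_ref = value

    def __delitem__(self, key):
        self[key]  # raises KeyError if there is nothing to delete
        del self.ref(key).cell_ref  # empty the cell but keep it linked


d = CellDict()
length_cell = d.ref("length")  # obtained before any assignment
d["length"] = 3
seen_first = length_cell.cell_ref
d["length"] = 5
seen_second = length_cell.cell_ref
```

Code that holds "length_cell" sees every later assignment without another 
look-up by key. Note that in this sketch a failed read leaves an empty cell 
behind, which may or may not be desirable.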

Why it is related to Psyco: the current treatment of globals/builtins makes 
it hard for Psyco to statically tell what function we are calling when it 
sees e.g. "len(a)" in the code. We would need some help from the 
interpreter: at the very least, hooks called when the module's globals() 
dictionary changes. The above proposal might provide a more uniform solution.

Thanks for your attention.