A modest proposal
Stuart D. Gathman
stuart at bmsi.com
Fri Nov 16 01:29:24 CET 2001
I have been putting my thoughts toward the elusive goal of making python
as fast as some LISP dialects.
As a result, I have a proposal that could drastically speed up the common
cases of accessing global variables, while slightly slowing down less
common accesses, with some rare pathologies that will eat time and memory.
(Such cases would, of course, become even more rare should my proposal
ever get implemented :-) Please comment on why this is a terrible/great
idea, has already been thought of, etc. etc.
The basic idea is to split namespace dictionaries into two parts, a simple
C array of PyObject pointers, and a SymbolTable which maps names to
indexes into the C array. A SymbolTable can grow, but never shrink.
Deleting a name from the namespace is represented by storing a 0 into the
C PyObject array. The C array is automatically enlarged as required when
adding names to the dictionary (call it a SymbolDictionary).
In the general case, access is slightly slower due to the extra index into
the C array, and memory could be wasted due to deleted names, which still
occupy a (0 value) slot in the array and SymbolTable. However, all python
semantics are preserved.
What does this gain us?
For starters, when compiling a module, references to module globals can be
compiled to a prebuilt SymbolTable in the bytecode. Outside of functions,
references to module globals can use an indexed bytecode similar to local
variables. References to globals in imported modules cannot safely be
converted to indexed lookups because the module global might be replaced
with some other module or object.
Of course, this is not a big win unless functions and methods can use this
optimization. The problem is, functions can be called with varying
namespaces (because they are assigned to multiple modules, for instance).
I propose that functions contain a dictionary or cache of namespaces used
to call that function. The first time a function is called for a
namespace not in the cache, its bytecodes are optimized for that namespace
by converting global references to indexes in that namespaces C array. The
optimized code is used and saved.
For classes, the SymbolTable for class instances can be stored with the
class, and only the C array kept with class instances. It might be
helpful to provide an overflow dictionary for class instances to handle
the case where instances are used as a container for values with arbitrary
identifiers - otherwise the instance C arrays would have lots of 0 entries
with that kind of usage.
In normal python code, a method is only part of one class, and once it is
optimized for that classes namespace, it can access instance variables
almost as quickly as local vars. If it is assigned to multiple classes,
the caching mechanism takes over.
Single inheritance can be optimized by making derived class SymbolTables
"extend" the first base class - i.e. keep the same indexes - and make the
namespace caching smart enough to recognize this case and not reoptimize
base class methods for the derived class.
Of course, I can envision programming styles involving using functions
with many different namespaces - all needing their own copy of optimized
code. This can be mitigated by falling back to the unoptimized code when
there are two many namespaces for a given function.
If SymbolTables support mapping indexes to names, as well as names to
indexes, then a function optimized for one namespace can be easily
optimized for another, but looking up the name for each indexed global in
the old namespace and looking up the index in the new namespace. (But
name errors need to be delayed somehow.)
I am late and need to go, but I just realized now to avoid the function
caching and still reap the benefits of this scheme! So until next
Stuart D. Gathman <stuart at bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - Mozart background
song for the Microsoft "Where do you want to go from here?" commercial.
More information about the Python-list