Towards faster Python implementations - theory

John Nagle nagle at animats.com
Wed May 9 13:02:15 EDT 2007


Paul Boddie wrote:
> On 9 May, 08:09, "Hendrik van Rooyen" <m... at microcorp.co.za> wrote:
> 
>>I am relatively new on this turf, and from what I have seen so far, it
>>would not bother me at all to tie a name's type to its first use, so that
>>the name can only be bound to objects of the same type as the type
>>of the object that it was originally bound to.
> 
> 
> But it's interesting to consider the kinds of names you could restrict
> in this manner and what the effects would be. In Python, the only kind
> of name that can be considered difficult to arbitrarily modify "at a
> distance" - in other words, from outside the same scope - are locals,
> and even then there are things like closures and perverse
> implementation-dependent stack hacks which can expose local namespaces
> to modification, although any reasonable "conservative Python"
> implementation would disallow the latter.

     Modifying "at a distance" is exactly what I'm getting at.  That's the
killer from an optimizing compiler standpoint.  The compiler, or a
maintenance programmer, looks at a block of code, and there doesn't seem
to be anything unusual going on.  But if some other section of code
does a "setattr" to mess with that first block, something unusual
can be happening after all.  This is tough on both optimizing
compilers and maintenance programmers.
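
     A minimal sketch of that kind of action at a distance, assuming
ordinary CPython semantics.  The names "Account", "fee", "charge" and
"patch_from_far_away" are hypothetical, made up for illustration:

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def fee(self):
        return 1

    def charge(self):
        # Read locally, this looks like a fixed call to Account.fee,
        # which a compiler would like to bind early or inline.
        self.balance -= self.fee()
        return self.balance


def patch_from_far_away():
    # Unrelated code rebinds the method on the class at run time.
    setattr(Account, "fee", lambda self: 100)


acct = Account(1000)
before = acct.charge()   # 1000 - 1 = 999
patch_from_far_away()
after = acct.charge()    # 999 - 100 = 899
```

Nothing in the body of "charge" changed between the two calls, which is
why neither a compiler nor a reader can trust a purely local view of it.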

     Python has that capability mostly because it's free in an
"everything is a dictionary" implementation.  ("When all you have
is a hash, everything looks like a dictionary".)  But that limits
implementation performance.  Most of the time, nobody is using
"setattr" to mess with the internals of a function, class, or
module from far, far away.  But the cost for that flexibility is
being paid, unnecessarily.
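
     As a rough illustration of the dictionary-backed storage, here is a
sketch assuming CPython's usual behavior.  "Point" and the module name
"demo" are hypothetical names used only for this example:

```python
import types


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# vars(p) is the instance's own attribute dictionary, not a copy.
attrs = vars(p)
attrs["z"] = 3          # has the same effect as p.z = 3

# Module namespaces are dictionaries as well.
mod = types.ModuleType("demo")
vars(mod)["answer"] = 42
```

After this runs, "p.z" is 3 and "mod.answer" is 42.  Every attribute
access goes through a lookup of this kind, whether or not anyone ever
adds a name from outside.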

     I'm suggesting that the potential for "action at a distance" somehow
has to be made more visible.

     One option might be a class "simpleobject", from which other classes
can inherit.  ("object" would become a subclass of "simpleobject".)
"simpleobject" classes would have the following restrictions:

	- New fields and functions cannot be introduced from outside
	the class.  Every field and function name must explicitly appear
	at least once in the class definition.  Subclassing is still
	allowed.
	- Unless the class itself uses "getattr" or "setattr" on itself,
	no external code can do so.  This lets the compiler eliminate the
	object's dictionary unless the class itself needs it.

This lets the compiler see all the field names and assign them fixed slots
in a fixed-size object representation.  Basically, this means simple objects
have a C/C++-like internal representation, with the performance that comes
with that representation.
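
Python's existing "__slots__" declaration gives a partial feel for this.
It is not the "simpleobject" proposal: it only fixes the set of instance
fields, external code can still assign to a declared field, and the class
itself can still be patched with "setattr".  A minimal sketch, with "Vec"
as a hypothetical example class:

```python
class Vec:
    # Every instance field is named up front in the class definition.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


v = Vec(1.0, 2.0)

# Instances of a fully slotted class carry no per-instance dictionary.
has_dict = hasattr(v, "__dict__")

# A field that was never declared cannot be introduced from outside.
try:
    v.z = 3.0
    rejected = False
except AttributeError:
    rejected = True
```

Here "has_dict" ends up False and "rejected" ends up True, which is the
first restriction in miniature; the second one has no direct equivalent
in current Python.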

With this, plus the "Shed Skin" restrictions, plus the array features of
"numarray", it should be possible to get computationally intensive code
written in Python up to C/C++ levels of performance.  Yet all the dynamic
machinery of Python remains available if needed.

All that's necessary is not to surprise the compiler.

					John Nagle


