Python optimization (was Python's "only one way to do it" philosophy isn't good?)

Michele Simionato michele.simionato at gmail.com
Tue Jun 12 00:21:49 EDT 2007


On Jun 10, 6:43 pm, John Nagle <n... at animats.com> wrote:
> Josiah Carlson wrote:
> > Steven D'Aprano wrote:
>
> >> On Sat, 09 Jun 2007 22:52:32 +0000, Josiah Carlson wrote:
>
> >>> the only thing that optimization currently does in Python at present
> >>> is to discard docstrings
>
> >> Python, or at least CPython, does more optimizations than that. Aside
> >> from run-time optimizations like interned strings etc., there are a
> >> small number of compiler-time optimizations done.
>
> >> Running Python with the -O (optimize) flag tells Python to ignore
> >> assert statements. Using -OO additionally removes docstrings.
> ...
>
> > I would guess it is because some other data types may have side-effects.
> >  On the other hand, a peephole optimizer could be written to trim out
> > unnecessary LOAD_CONST/POP_TOP pairs.
>
> >> Some dead code is also optimized away:
>
> > Obviously dead code removal happens regardless of optimization level in
> > current Pythons.
>
> >> Lastly, in recent versions (starting with 2.5 I believe) Python
> >> includes a peephole optimizer that implements simple constant folding:
>
> > Constant folding happens regardless of optimization level in current
> > Pythons.
> > So really, assert and docstring removals.  Eh.
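For anyone who wants to check the quoted claims about -O, -OO and constant folding, here is a minimal sketch. It assumes a modern CPython 3, where the built-in compile() accepts an optimize argument that mirrors the command-line flags (0 for none, 1 for -O, 2 for -OO); the interpreters discussed in this thread predate that argument. The names SOURCE, load_f and assert_fires are made up for the illustration.

```python
# Sketch for modern CPython 3: compile(..., optimize=N) mirrors -O / -OO.
SOURCE = '''
def f(x):
    "Docstring for f."
    assert x > 0, "x must be positive"
    return x
'''


def load_f(optimize):
    """Compile SOURCE at the given optimization level and return its f."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["f"]


def assert_fires(func):
    """Return True if calling func(-1) trips the assert statement."""
    try:
        func(-1)
    except AssertionError:
        return True
    return False


plain, opt1, opt2 = load_f(0), load_f(1), load_f(2)

assert_kept_at_0 = assert_fires(plain)  # level 0 keeps the assert
assert_kept_at_1 = assert_fires(opt1)   # level 1 (-O) drops the assert
doc_at_1 = opt1.__doc__                 # level 1 still keeps the docstring
doc_at_2 = opt2.__doc__                 # level 2 (-OO) strips the docstring

# Constant folding is requested here with optimize=0, i.e. without -O.
# If folding happens, the product shows up as a single constant.
folded = compile("seconds = 24 * 60 * 60", "<demo>", "exec", optimize=0)
folding_without_flag = 86400 in folded.co_consts
```

Inspecting the same code objects with dis.dis() shows the remaining bytecode, which is a convenient way to see what the compiler did and did not remove.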
>
>     It's hard to optimize Python code well without global analysis.
> The problem is that you have to make sure that a long list of "weird
> things", like modifying code or variables via getattr/setattr, aren't
> happening before doing significant optimizations.  Without that,
> you're doomed to a slow implementation like CPython.
>
>     ShedSkin, which imposes some restrictions, is on the right track here.
> The __slots__ feature is useful but doesn't go far enough.
>
>     I'd suggest defining "simpleobject" as the base class, instead of "object",
> which would become a derived class of "simpleobject".   Objects descended
> directly from "simpleobject" would have the following restrictions:
>
>         - "getattr" and "setattr" are not available (as with __slots__)
>         - All class member variables must be initialized in __init__, or
>           in functions called by __init__.  The effect is like __slots__,
>           but you don't have to explicitly write declarations.
>         - Class members are implicitly typed with the type of the first
>           thing assigned to them.  This is the ShedSkin rule.  It might
>           be useful to allow assignments like
>
>                 self.str = None(string)
>
>           to indicate that a slot holds strings, but currently has the null
>           string.
>         - Function members cannot be modified after declaration.  Subclassing
>           is fine, but replacing a function member via assignment is not.
>           This allows inlining of function calls to small functions, which
>           is a big win.
>         - Private function members (self._foo and self.__foo) really are
>           private and are not callable outside the class definition.
>
> You get the idea.  This basically means that "simpleobject" objects have
> roughly the same restrictions as C++ objects, for which heavy compile time
> optimization is possible.  Most Python classes already qualify for
> "simpleobject".  And this approach doesn't require un-Pythonic stuff like
> declarations or extra "decorators".
>
> With this, the heavy optimizations are possible.  Strength reduction.  Hoisting
> common subexpressions out of loops.  Hoisting reference count updates out of
> loops.  Keeping frequently used variables in registers.  And elimination of
> many unnecessary dictionary lookups.
>
> Python could get much, much faster.  Right now CPython is said to be 60X slower
> than C.  It should be possible to get at least an order of magnitude over
> CPython.
>
>
                                        John Nagle
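
To make the quoted proposal concrete, here is a small sketch of the dynamic behaviour it wants to rule out, and of how far __slots__ gets you today. The classes Point and SlottedPoint are invented for the illustration and are not part of any proposal or library. Note that __slots__ only rejects attribute names that were not declared; setattr on a declared slot still works, and methods can still be replaced on the class.

```python
class Point:
    """Ordinary class: attributes and methods can change at run time."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)


class SlottedPoint:
    """Same data, but only the declared attribute names are allowed."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(3, -4)
before = p.norm1()

# Replacing a method after the class is defined is legal, so a compiler
# cannot safely inline p.norm1() without proving this never happens.
Point.norm1 = lambda self: 0
after = p.norm1()

# New attributes can also appear at any time on an ordinary instance.
setattr(p, "z", 10)

q = SlottedPoint(1, 2)
setattr(q, "x", 5)  # declared slot: still assignable via setattr
try:
    q.z = 10        # undeclared name: rejected by __slots__
    slots_blocked_new_attr = False
except AttributeError:
    slots_blocked_new_attr = True
```

The method replacement in the middle is exactly the case that the "function members cannot be modified after declaration" rule would forbid.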

This is already done in RPython:

http://codespeak.net/pypy/dist/pypy/doc/coding-guide.html#restricted-python

I was at the PyCon Italia conference the other day, and one of the
PyPy people claimed that RPython is up to 300X faster than Python.

    Michele Simionato



