[summerofcode] High-level python optimizer.
Mark Dufour
mark.dufour at gmail.com
Fri Jun 10 11:44:58 CEST 2005
Hi Vova,
It's very hard to prove anything about a program without a precise
type analysis. As Armin mentioned, implementing this alone, for
Python, represents quite an effort. But it is the most important thing
for optimizing pure Python: you can't specify types in Python (without
messing up your source code, or using Pyrex), but you can optimize
other things by hand. In my case, it's even more important, because
conversion to C++ essentially gives you most of the optimizations you
mention for free. A really high-level optimization that remains in
this case is for example trying to allocate things on the stack, or
even statically preallocate them (because you do this manually in
C++). Other examples of high-level optimizations are, in my opinion,
eliminating bounds checks, optimizing memory layout in many ways, and
removing 'intermediate lists' as functional-language compilers do.
But, I don't think a global approach can work well for even moderately
dynamic programs, as it's very hard to determine precisely what is
going on. (Consider just the use of reflection: it's formally
undecidable which attribute names are involved, even without
user input.) A global approach, then, can in my opinion best consider a subset of
Python, and translate code into a language such as C++ (as it operates
at the same abstraction level as Python, and you get many
optimizations for free.) I really like the elegance of such a
solution: you can reuse lots of work done in for example GCC, and you
get the ultimate in speed for relatively 'static' programs - which
many people, or at least myself, mostly write. I love the syntax of
Python, but I don't really use dynamic typing most of the time
(it's just convenient to be able to leave out type
declarations :D)
On the other hand, a non-global approach I think cannot do many
high-level optimizations, as it doesn't have enough information or
(when done during run-time) does not have the time to calculate the
right information. A non-global, run-time approach does seem like the
best long-term solution. With Moore's law and all, it can optimize
most code to an acceptable level, and still be able to work with the
full Python semantics.
So, I am left wondering a bit about the approach you would like to
take. If you take the global route, then why not convert to C++? But
that would be much the same thing as Starkiller and Shedskin (my
compiler) aim to do. If you take the non-global route, then real
high-level optimizations seem out of the question, or at least very
complicated? Maybe it would be best to look into the things Armin Rigo
and the others are doing with PyPy, perhaps find something you can
contribute there?
Finally, I have no idea whether the source code for Starkiller is
available, or if so, if it's readable. The website seems gone, and I
don't know what happened. Well, I know my source code is available,
but not very readable -^ (just mail me, and I'll send you this 6000-line
can of worms) - I'll probably still be modifying it heavily for
the foreseeable future..
Mark.
On 6/9/05, Vova <physics at list.ru> wrote:
> On Wednesday 08 June 2005 23:59, Brett C. wrote:
> > Vova wrote:
> > > Hello !
> > > This is my project proposal: High-level python optimizer.
> > >
> > > 1. Pre-history.
> > > Python optimization is a difficult task because of Python's dynamic
> > > nature. Applying many of the optimization techniques used in other
> > > (statically typed) languages to Python is almost impossible. There was
> > > an attempt to create a Python optimizer that works on the byte-code
> > > (http://www.foretec.com/python/workshops/1998-11/proceedings/papers/montanaro/montanaro.html),
> > > but it speeds up execution only by a few percent. A different approach
> > > is taken in the psyco project. It implements JIT compilation using
> > > run-time specialization. The results are very impressive. But psyco
> > > does only simple local run-time optimizations; it can't look at the
> > > whole program.
> >
> > The other issue with Psyco is it only works with x86. Don't forget Python
> > does run on other platforms so getting optimizations for other
> > architectures is a very nice thing.
> >
> > > 2. Project proposal.
> > > My proposal is to create a high-level optimizer that will work on the
> > > source tree of the whole program. It will transform the source tree
> > > into a data-flow graph that represents the flow of data from program
> > > input (in a general sense) to output. The goal is the elimination of
> > > all unnecessary dynamism, i.e. making assumptions about the actual
> > > types (and maybe values) of variables and their lifetimes. Of course
> > > this can't be done for every variable in the program, but it (usually)
> > > can for most of them (just try to open any Python program and traverse
> > > it from the very beginning, visiting every called function). Once we
> > > have this information we can apply a lot of traditional, well-known
> > > optimization techniques as well as Python-specific ones. Then we can
> > > compile and run the optimized tree, or transform it back into a Python
> > > program. We can still use psyco for JIT, and maybe we can even provide
> > > psyco with the gathered information about variable types.
> >
> > How are you planning to actually implement performance enhancements? It is
> > one thing to get complete control flow information in a CFG and do the data
> > analysis, but Python's bytecode is in no way going to help you in terms of
> > performance improvement from this information with its lack of
> > type-specific bytecodes short of generating built-in types and LIST_APPEND.
> >
> > Are you going to add new bytecodes?
> No.
>
> My project will use type information to do high-level code optimization (see
> below). Type inference is only one part of the analysis. The aim of the
> analysis is to gain the following information (when possible):
> 1. Variable types.
> 2. Variable values.
> 3. Function side-effects.
>
> Item 3 depends on 1. Example:
> def sum(a,b): return a+b
> sum(a,b) has side-effects iff a.__add__ has side-effects.
> Another example:
> a.some_method(5)
> here we must guess the type of a, then check the attribute lookup
> (a.__getattribute__) for side-effects, and then check a.some_method(5)
> itself for side-effects.
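The purity rule above can be sketched as a tiny per-type table (the names `PURE_ADD_TYPES` and `add_is_pure` are hypothetical, for illustration only):

```python
# For built-in immutable types, __add__ is known to be side-effect free,
# so 'a + b' is pure whenever a's inferred type is in this table.
PURE_ADD_TYPES = {int, float, str, tuple}

def add_is_pure(inferred_type):
    """True if a + b cannot have side effects for this inferred type of a."""
    return inferred_type in PURE_ADD_TYPES

class Noisy:
    def __add__(self, other):     # a user-defined __add__ may do anything
        print("side effect!")
        return other

assert add_is_pure(int)
assert not add_is_pure(Noisy)
```

A real analysis would of course derive this table from type inference rather than hard-code it.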
>
> Let's look at how we can use this information to optimize a program:
>
> 1. Constant folding and math simplification can be done for built-in types.
> If some variables have definite values we can use them directly.
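As an illustration, constant folding over built-in types can be sketched with today's `ast` module (which did not exist in this form in 2005); this is a minimal bottom-up folder, not the proposed optimizer itself:

```python
import ast
import operator

# Map AST operator nodes to the concrete operations they denote.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

class ConstantFolder(ast.NodeTransformer):
    """Fold binary operations whose operands are literal constants."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # fold children first (bottom-up)
        op = OPS.get(type(node.op))
        if (op and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            try:
                value = op(node.left.value, node.right.value)
            except Exception:             # e.g. 1 / 0: leave it to run time
                return node
            return ast.copy_location(ast.Constant(value), node)
        return node

tree = ConstantFolder().visit(ast.parse("x = 2 * 3 + 1"))
result = ast.unparse(ast.fix_missing_locations(tree))
print(result)  # -> x = 7
```

The same bottom-up pattern extends naturally to unary operations and to variables whose values the analysis has proven constant.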
>
> 2. Function inlining (which can be significant in Python) can be done if we
> know which function is being called. For method inlining we must know the
> type of the object and check the attribute lookup (__getattribute__) for
> side-effects.
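A hand-worked illustration of the inlining in point 2 (hypothetical functions; the temporary avoids re-evaluating the argument expression):

```python
def sq(x):
    return x * x

def before(a):
    return sq(a + 1)       # one call frame per invocation

def after(a):              # inlined by hand: the call overhead is gone
    _t = a + 1             # temporary so 'a + 1' is evaluated only once
    return _t * _t

assert before(3) == after(3) == 16
```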
>
> 3. Partial evaluation can be done if some variables have known types and/or
> values. This can be done for a user function if we have proven that it has
> no side-effects (for example replacing calls to factorial(10) or sin(10)
> with their results at compile-time), and for Python-specific operations, for
> example replacing "%some_pattern" % (some_list) with more efficient code.
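For instance, partially evaluating a pure call such as factorial(10) amounts to precomputing it; a sketch (`FACT_10` is an illustrative name for the value the optimizer would substitute):

```python
import math

def before():
    return math.factorial(10) + 1   # pure call, evaluated on every run

FACT_10 = 3628800                   # precomputed at "compile time"
def after():
    return FACT_10 + 1

assert before() == after() == 3628801
```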
>
> 4. Dead code elimination can be done as part of partial evaluation. Unused
> variable elimination (i.e. elimination of function calls if their result is
> unused) can also be done using information about side-effects. If a variable
> is used only in one branch of execution, its assignment can be moved into
> that branch.
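A minimal before/after sketch of the unused-variable elimination in point 4 (hypothetical functions; the call is removable only because it is proven pure):

```python
def pure(x):
    return x * 2           # no side effects

def before(x):
    unused = pure(x)       # result never used and call is pure -> removable
    return x + 1

def after(x):
    return x + 1

assert before(5) == after(5) == 6
```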
>
> 5. Common subexpression elimination (as well as moving common code outside
> loops) can be done if the subexpression has no side-effects. This is even
> more relevant in Python than in statically typed languages because code like
> the.some.method(x)
> the.some.method(y)
> or
> for x in some_big_list: the.some.method(x)
> is very common. The attribute lookup the.some.method (two __getattribute__
> calls) is the common code.
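The hoisting in point 5 can be illustrated by binding the attribute chain once outside the loop (the classes here are hypothetical stand-ins for `the.some.method`):

```python
class Inner:
    def method(self, x):
        return x * 2

class Outer:
    def __init__(self):
        self.some = Inner()

the = Outer()
xs = list(range(5))

# before: 'the.some.method' is resolved (two attribute lookups) every pass
r1 = [the.some.method(x) for x in xs]

# after: the side-effect-free lookup is hoisted out of the loop
m = the.some.method
r2 = [m(x) for x in xs]

assert r1 == r2 == [0, 2, 4, 6, 8]
```

This is exactly the rewrite a compiler could apply automatically once it has proven the lookup has no side effects.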
>
> 6. There are also some Python-specific optimizations, for example replacing
> range with xrange when possible, replacing the code
> (a,b)=(b,a)
> with the code
> t=a; a=b; b=t
> and so on.
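The tuple-swap rewrite in point 6 preserves behaviour, as a quick check shows (note that modern CPython optimizes the tuple form itself, so the payoff described here is specific to 2005-era interpreters):

```python
def swap_tuple(a, b):
    (a, b) = (b, a)        # builds and unpacks a tuple (original form)
    return a, b

def swap_temp(a, b):
    t = a; a = b; b = t    # rewritten form: no tuple object needed
    return a, b

assert swap_tuple(1, 2) == swap_temp(1, 2) == (2, 1)
```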
>
> All of this is high-level optimization; it should be used on top of the
> low-level optimizations. Of course all of this can be done by hand if the
> programmer is clever enough, but sometimes the price is decreased code
> readability, broken encapsulation, and so on, and the result is increased
> maintenance cost.
>
> The result of the optimizer could be either optimized Python source or a
> compiled program. In the former case it can then be compiled by Jython,
> IronPython or PyPy.
>
> As I said before, these optimizations will give the best results if applied
> to the whole program with all of its modules. The problem of compile-time to
> run-time integrity can be partly solved by checking the used modules just
> before running the program and recompiling it if needed. Maybe we can save
> the assumptions made about a module's functions and check whether those
> assumptions have changed, so a full recompilation will be needed only
> rarely. In real-world applications only a small number of modules actually
> change often. The results of optimizing stand-alone modules can be improved
> by adding type annotations for the module's functions.
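The recompile-only-when-changed idea can be sketched with a source fingerprint cache (the names `fingerprint` and `needs_recompile` are hypothetical):

```python
import hashlib

def fingerprint(source: bytes) -> str:
    """Hash of a module's source; recompile only when it changes."""
    return hashlib.sha256(source).hexdigest()

cache = {}

def needs_recompile(name, source):
    fp = fingerprint(source)
    if cache.get(name) == fp:
        return False           # assumptions recorded earlier still hold
    cache[name] = fp
    return True

assert needs_recompile("m", b"def f(): return 1")      # first sight
assert not needs_recompile("m", b"def f(): return 1")  # unchanged
assert needs_recompile("m", b"def f(): return 2")      # edited -> recompile
```

A fuller version would store the recorded assumptions alongside the hash, so a changed module only forces recompilation of the code that depended on it.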
>
> This project is different from Python-to-C++ compilation (though of course
> there is some overlap). C++ can't represent a lot of Python features, but
> the optimized Python program preserves all of them. The optimizer touches
> only the code that can be improved.
>
> Of course the full implementation of all the mentioned optimizations (and
> maybe even more) will take more than two months. But two months is
> sufficient to implement the general framework and some of the most
> promising optimizations and test them. Of course the work done by others
> in the field of type inference algorithms can be very helpful if it is
> open.
>
>
> --
> Best regards,
> Vladimir
>
> > Are you going to do post-compile
> > transformations like Raymond Hettinger's decorator to make built-ins act as
> > constants?
> >
> > And I hope you do realize it sounds like you are breaking semantics of the
> > language by assuming the code you analyze is the code that gets executed.
> > You can read my thesis (http://www.drifty.org/thesis.pdf ; chapter 3) to
> > get the details. It's fine if you are, but you should make sure you make
> > that clear in your proposal.
> >
> > To answer Mark Dufour's response here as well, Starkiller is not under
> > active development anymore by Mike (at least to my knowledge). And
> > Starkiller in no way optimized Python but just performed a type inference
> > for an entire Python application.
> >
> > -Brett
> _______________________________________________
> summerofcode mailing list
> summerofcode at python.org
> http://mail.python.org/mailman/listinfo/summerofcode
>
--
if vars: self.output('; '.join([self.nodetype(var)+' '+name for
(name,var) in vars.items()])+';')