[summerofcode] High-level python optimizer.

Vova physics at list.ru
Fri Jun 10 14:51:53 CEST 2005


Hello !
Again, just to clarify. My program will do global analysis of the source 
building data-flow graph. This graph makes possible to predict types and 
values of some variables and check functions for side-effects (depending on 
argument types). Using this information optimizer can make some good 
optimizations. Being well implemented it gives the following guarantees:
1. This will work for ANY python program.
2. Optimized program will always work the same as the original program.
Optimizer will touch only parts that it can optimize. By supplying optimizer 
with some user-created information results can be improved, but this 
information is not required.

The aims of this optimizer is to:
1. Allow programmers not to sacrifice readability and architecture of the code 
in favor of performance.
2. Allow programmers to use some nice language features (functional 
programming, for example) not thinking that this features are inefficient.
3. Allow improving old programs using new language features (for example using 
generators instead of lists).
4. Discover more about program-transformation algorithms.

Compiling code to C++ does not give the same guarantees. It will work only for 
static programs, only for subset of python. How many real-world programs are 
in this subset ? But of course seep-up of such compilation will be greater. I 
think the aim of such compiler is to allow using static subset of python 
syntax to write fast programs (am I wrong?).

Our projects is just different ones, however some correlations of course 
present. May be we can even help each other in overlapping areas.

-- 
          Best regards,
            Vladimir

On Friday 10 June 2005 13:44, Mark Dufour wrote:
> Hi Vova,
>
> It's very hard to prove anything about a program without a precise
> type analysis. As Armin mentioned, implementing this alone, for
> Python, represents quite an effort. But it is the most important thing
> for optimizing pure Python: you can't specify types in Python (without
> messing up your source code, or using Pyrex), but you can optimize
> other things by hand. In my case, it's even more important, because
> conversion to C++ essentially gives you most of the optimizations you
> mention for free.  A really high-level optimization that remains in
> this case is for example trying to allocate things on the stack, or
> even statically preallocate them (because you do this manually in
> C++). Other examples of high-level optimizations imo are for example
> optimization of bounds checking, optimizing memory layout in many
> ways, and removing 'intermediate lists' as in functional language
> compilers.
>
> But, I don't think a global approach can work well for even moderately
> dynamic programs, as it's very hard to determine precisely what is
> going on. (Consider just the use of reflection - it's formally
> undecidable what kind of attribute names are involved, even without
> user input.) A global approach then, can imo best consider a subset of
> Python, and translate code into a language such as C++ (as it operates
> at the same abstraction level as Python, and you get many
> optimizations for free.) I really like the elegance of such a
> solution: you can reuse lots of work done in for example GCC, and you
> get the ultimate in speed for relatively 'static' programs - which
> many people, or at least myself, mostly write. I love the syntax of
> Python, but I don't really use for example dynamic typing for most of
> the time (it's just convenient to be able to leave out type
> declarations :D)
>
> On the other hand, a non-global approach I think cannot do many
> high-level optimizations, as it doesn't have enough information or
> (when done during run-time) does not have the time to calculate the
> right information. A non-global, run-time approach does seem like the
> best long-term solution. With Moore's law and all, it can optimize
> most code to an acceptable level, and still be able to work with the
> full Python semantics.
>
> So, I am left wondering a bit about the approach you would like to
> take. If you take the global route, then why not convert to C++. But
> that would be sort of the same thing as Starkiller and Shedskin (my
> compiler) aim to do. If you take the non-global route, then real
> high-level optimizations seem out of the question, or at least very
> complicated? Maybe it would be best to look into the things Armin Rigo
> and the others are doing with PyPy, perhaps find something you can
> contribute there?
>
> Finally, I have no idea whether the source code for Starkiller is
> available, or if so, if it's readable. The website seems gone, and I
> don't know what happened. Well, I know my source code is available,
> but not very readable -^ (just mail me, and I'll send you this 6000
> line can of worms) - I'll be probably still modifying it heavily for
> the foreseeable future..
>
>
> Mark.
>


More information about the summerofcode mailing list