[pypy-dev] Question on the future of RPython

William Leslie william.leslie.ttg at gmail.com
Tue Sep 28 03:55:06 CEST 2010

On 28 September 2010 10:43, Terrence Cole
<list-sink at trainedmonkeystudios.org> wrote:
> On Tue, 2010-09-28 at 01:57 +0200, Jacob Hallén wrote:
>> Monday 27 September 2010 you wrote:
>> > On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
>> > > Well, I am happy to see that my interest in a general purpose RPython
>> > > is not as isolated as I was led to believe :-))
>> > > Thx,
>> >
>> > What I wrote has apparently been widely misunderstood, so let me explain
>> > what I mean in more detail.  What I want is _not_ RPython and it is
>> > _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
>> > visual tool, for example, a plugin to an IDE.  This tool would perform
>> > static analysis on a piece of python code.  Instead of generating code
>> > with this information, it would mark up the python code in the text
>> > display with colors, weights, etc in order to show properties from the
>> > static analysis.  This would be something like semantic highlighting, as
>> > opposed to syntax highlighting.
>> >
>> > I think it possible that this information would, if created and
>> > presented in the correct way, represent the sort of optimizations that
>> > pypy-c-jit -- a full python implementation, not a language subset --
>> > would likely perform on the code if run.  Given this sort of feedback,
>> > it would be much easier for a python coder to write code that works well
>> > with the jit: for example, moving a declaration inside a loop to avoid
>> > boxing, based on the information presented.
>> >
>> > Ideally, such a tool would perform instantaneous syntax highlighting
>> > while editing and do full parsing and analysis in the background to
>> > update the semantic highlighting as frequently as possible.  Obviously,
>> > detailed static analysis will provide far more information than it would
>> > be possible to display on the code at once, so I see this gui as having
>> > several modes -- like predator vision -- that show different information
>> > from the analysis.  Naturally, what those modes are will depend strongly
>> > on the details of how pypy-c-jit works internally, what sort of
>> > information can be sanely collected through static analysis, and,
>> > naturally, user testing.
>> >
>> > I was somewhat baffled at first as to how what I wrote before was
>> > interpreted as interest in a static python.  I think the disconnect here
>> > is the assumption on many people's part that a static language will
>> > always be faster than a dynamic one.  Given the existing tools that
>> > provide basically no feedback from the compiler / interpreter / jitter,
>> > this is inevitably true at the moment.  I foresee a future, however,
>> > where better tools let us use the full power of a dynamic python AND let
>> > us tighten up our code for speed to get the full advantages of jit
>> > compilation as well.  I believe that in the end, this combination will
>> > prove superior to any fully static compiler.
>> The JIT works because it has more information at runtime than what is
>> available at compile time. If the information was available at compile time we
>> could do the optimizations then and not have to invoke the extra complexity
>> required by the JIT. Examples of the extra information include things like
>> knowing that introspection will not be used in the current evaluation of a
>> loop, specific argument types will be used in calls and that some arguments
>> will be known to be constant over part of the program execution. Knowing
>> these bits allows you to optimize away large chunks of the code that
>> otherwise would have been executed.
>> Static analysis assumes that none of the above mentioned possibilities can
>> actually take place. It is impossible to make such assumptions at compile time
>> in a dynamic language. Therefore PyPy is a bad match for people wanting to
>> statically compile subsets of Python. Applying the JIT to RPython code
> Yes, that idea is just dumb.  It's also not what I suggested at all.  I
> can see now that what I said would be easy to misinterpret, but on
> re-reading it, it clearly doesn't say what you think it does.

It does make /some/ sense, I think. From the perspective of the JIT,
operating at interp-level, the app-level python program *is the
biggest part of* the "stuff you don't know about until runtime". That
is, you don't know the program source at translation time, and most of
the information the JIT is supposed to find concerns app-level
constructs (e.g. app-level loops).

Of course any such analysis will fall flat in certain cases, like
eval(raw_input(...)). But you should still be able to gather enough
information for most fairly hygienic code.
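
As a rough illustration of where such an analysis has to give up, here is
a minimal sketch that flags the calls whose effect cannot be known before
runtime. It uses the standard `ast` module on Python 3 source (so `input`
stands in for `raw_input`). The function name `find_dynamic_calls` and the
`DYNAMIC_CALLS` set are assumptions made for this example, not anything
from PyPy.

```python
import ast

# Hypothetical list of builtins that make a program hard to analyse
# statically; a real tool would need a much more careful definition.
DYNAMIC_CALLS = {"eval", "exec", "compile", "__import__", "getattr", "setattr"}


def find_dynamic_calls(source):
    """Return sorted (line, name) pairs for direct calls to DYNAMIC_CALLS."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only direct calls by bare name are detected, e.g. eval(...).
        # Aliased or attribute-based calls are deliberately ignored here.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DYNAMIC_CALLS:
                hits.append((node.lineno, node.func.id))
    return sorted(hits)


hygienic = "total = 0\nfor i in range(10):\n    total += i\n"
unhygienic = "x = eval(input('expr: '))\nprint(x)\n"

print(find_dynamic_calls(hygienic))
print(find_dynamic_calls(unhygienic))
```

The first snippet produces no hits, so a tool could keep analysing it; the
second is reported at its `eval` call, which is the point where any static
conclusion stops being trustworthy.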

What sort of analyses did you have in mind?

>>  is not
>> workable, because the JIT is optimized to remove bits of generated assembler
>> code that never shows up in the compilation of RPython code.
>> These are very basic first principle concepts, and it is a mystery to me why
>> people can't work them out for themselves.
> You are quite right that static analysis will be able to do little to
> help an optimal jit.  However, I doubt that in the near term pypy's jit
> will cover all the dark corners of python equally well -- C has been
> around for 38 years and it's still got room for optimization.

There are some undesirable things about static analysis, but it can
sure be useful from optimisation, security and reliability
perspectives. There's also code browsing; IDEs require a
different (fuzzier) parser, but the question of 'what types does this
object probably have' makes more sense with a little dependent region
analysis. Optimising when you can be fairly confident of the types
involved could be useful. That doesn't really sound like pypy at that
point, though.
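
To make the 'what types does this object probably have' question
concrete, here is a deliberately naive sketch. It is flow-insensitive and
only looks at literal assignments, so it is far weaker than the dependent
region analysis mentioned above. The name `probable_types` is an
assumption for this example, and the code needs Python 3.8 or later for
`ast.Constant`.

```python
import ast


def probable_types(source):
    """Map each plain variable name to the set of literal type names assigned to it."""
    types = {}
    for node in ast.walk(ast.parse(source)):
        # Only handle `name = <literal>`; calls, attributes and
        # tuple unpacking are skipped rather than guessed at.
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant):
            type_name = type(node.value.value).__name__
            for target in node.targets:
                if isinstance(target, ast.Name):
                    types.setdefault(target.id, set()).add(type_name)
    return types


sample = "n = 0\nname = 'pypy'\nn = 1.5\n"
print(probable_types(sample))
```

A name that ends up with a single type is a candidate for confident
optimisation or highlighting; a name like `n` above, seen as both `int`
and `float`, is one an editor might mark as less certain.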

William Leslie
