[pypy-dev] Question on the future of RPython

Tue Sep 28 22:33:14 CEST 2010

On Tue, 2010-09-28 at 15:20 +0200, Maciej Fijalkowski wrote:
> On Tue, Sep 28, 2010 at 2:43 AM, Terrence Cole
> <list-sink at trainedmonkeystudios.org> wrote:
> > On Tue, 2010-09-28 at 01:57 +0200, Jacob Hallén wrote:
> >> Monday 27 September 2010 you wrote:
> >> > On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
> >> > > Well, I am happy to see that my interest in a general purpose RPython
> >> > > is not as isolated as I was led to believe :-))
> >> > > Thx,
> >> >
> >> > What I wrote has apparently been widely misunderstood, so let me explain
> >> > what I mean in more detail.  What I want is _not_ RPython and it is
> >> > _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
> >> > visual tool, for example, a plugin to an IDE.  This tool would perform
> >> > static analysis on a piece of python code.  Instead of generating code
> >> > with this information, it would mark up the python code in the text
> >> > display with colors, weights, etc in order to show properties from the
> >> > static analysis.  This would be something like semantic highlighting, as
> >> > opposed to syntax highlighting.
> >> >
> >> > I think it possible that this information would, if created and
> >> > presented in the correct way, represent the sort of optimizations that
> >> > pypy-c-jit -- a full python implementation, not a language subset --
> >> > would likely perform on the code if run.  Given this sort of feedback,
> >> > it would be much easier for a python coder to write code that works well
> >> > with the jit: for example, moving a declaration inside a loop to avoid
> >> > boxing, based on the information presented.
> >> >
> >> > Ideally, such a tool would perform instantaneous syntax highlighting
> >> > while editing and do full parsing and analysis in the background to
> >> > update the semantic highlighting as frequently as possible.  Obviously,
> >> > detailed static analysis will provide far more information than it would
> >> > be possible to display on the code at once, so I see this gui as having
> >> > several modes -- like predator vision -- that show different information
> >> > from the analysis.  Naturally, what those modes are will depend strongly
> >> > on the details of how pypy-c-jit works internally, what sort of
> >> > information can be sanely collected through static analysis, and,
> >> > naturally, user testing.
> >> >
> >> > I was somewhat baffled at first as to how what I wrote before was
> >> > interpreted as interest in a static python.  I think the disconnect here
> >> > is the assumption on many people's part that a static language will
> >> > always be faster than a dynamic one.  Given the existing tools that
> >> > provide basically no feedback from the compiler / interpreter / jitter,
> >> > this is inevitably true at the moment.  I foresee a future, however,
> >> > where better tools let us use the full power of a dynamic python AND let
> >> > us tighten up our code for speed to get the full advantages of jit
> >> > compilation as well.  I believe that in the end, this combination will
> >> > prove superior to any fully static compiler.
> >> >
> >> > -Terrence
> >> >
> >> > > Sarvi
> >> > >
> >> > >
> >> > > ----- Original Message ----
> >> > >
> >> > > > From: Terrence Cole <list-sink at trainedmonkeystudios.org>
> >> > > > To: pypy-dev at codespeak.net
> >> > > > Sent: Sun, September 26, 2010 2:28:12 PM
> >> > > > Subject: Re: [pypy-dev] Question on the future of RPython
> >> > > >
> >> > > > On Sat, 2010-09-25 at 17:47 +0200, horace grant wrote:
> >> > > > > i just had a (probably) silly idea. :)
> >> > > > >
> >> > > > > if some people like rpython so much, how about writing a rpython
> >> > > > > interpreter in rpython? wouldn't it be much easier for the jit to
> >> > > > > optimize rpython code? couldn't jitted rpython code theoretically be
> >> > > > > as fast as a program that got compiled to c from rpython?
> >> > > > >
> >> > > > > hm... but i wonder if this would make sense at all. maybe if you ran
> >> > > > > rpython code with pypy-c-jit, it already could be jitted as well as
> >> > > > > with a special rpython interpreter? ...if there were a special
> >> > > > > rpython interpreter, would the current jit generator have to be
> >> > > > > changed to take advantage of the more simple language?
> >> > > >
> >> > > > An excellent question at least.
> >> > > >
> >> > > > A better idea, I think, would be to ask what subset of full-python
> >> > > > will jit well.  What I'd really like to see is a static analyzer that
> >> > > > can display (e.g. by coloring names or lines) how "jit friendly" a
> >> > > > piece of python code is.  This would allow a programmer to get an
> >> > > > idea of what help the jit is going to be when running their code and,
> >> > > > hopefully, help people avoid tragic performance results.  Naturally,
> >> > > > for performance intensive code, you would still need to profile, but
> >> > > > for a lot of uses, simply not having catastrophically bad performance
> >> > > > is more than enough for a good user experience.
> >> > > >
> >> > > > With such a tool, it wouldn't really matter if the answer to "what is
> >> > > > faster" is RPython -- it would be whatever python language subset
> >> > > > happens to work well in a particular case.  I've started working on
> >> > > > something like this [1], but given that I'm doing a startup, I don't
> >> > > > have nearly the time I would need to make this useful in the
> >> > > > near-term.
> >>
> >> The JIT works because it has more information at runtime than what is
> >> available at compile time. If the information was available at compile time we
> >> could do the optimizations then and not have to invoke the extra complexity
> >> required by the JIT. Examples of  the extra information include things like
> >> knowing that introspection will not be used in the current evaluation of a
> >> loop, specific argument types will be used in calls and that some arguments
> >> will be known to be constant over part of the program execution. Knowing
> >> these bits allows you to optimize away large chunks of the code that
> >> otherwise would have been executed.
> >>
> >> Static analysis assumes that none of the above mentioned possibilities can
> >> actually take place. It is impossible to make such assumptions at compile time
> >> in a dynamic language. Therefore PyPy is a bad match for people wanting to
> >> statically compile subsets of Python. Applying the JIT to RPython code
> >
> > Yes, that idea is just dumb.  It's also not what I suggested at all.  I
> > can see now that what I said would be easy to misinterpret, but on
> > re-reading it, it clearly doesn't say what you think it does.
> >
> >>  is not
> >> workable, because the JIT is optimized to remove bits of generated assembler
> >> code that never shows up in the compilation of RPython code.
> >>
> >> These are very basic first principle concepts, and it is a mystery to me why
> >> people can't work them out for themselves.
> >
> > You are quite right that static analysis will be able to do little to
> > help an optimal jit.  However, I doubt that in the near term pypy's jit
> > will cover all the dark corners of python equally well -- C has been
> > around for 38 years and it's still got room for optimization.
> >
> > -Terrence
> >
> >> Jacob Hallén
> >
> 
> Hey.
> 
> I'm really interested in having jit feedback displayed as text info
> (say for profiling purposes). Do you have any particular ideas in mind
> or just a general one?

Lots.  They're almost all probably wrong though, so be warned :-).  I'm
also not entirely clear on what you mean, so let me tell you what I have
in mind and you can tell me if I'm way off base.

I assume workflow would go like this:  1) run pypy on a bunch of code in
profiling mode, 2) pypy spits out lots of data about what happened in
the jit when the program exits, 3) start up external analysis program
pointing it at this data, 4) browse the python source with the data from
the jit overlaid as color, formatting, etc. on top of the source.
Potentially there would be several separate modes for viewing different
aspects of the jit info.  This could also include the ability to select
different program elements (loops, variables, functions, etc) and get
detailed information about their runtime usage in a side-pane.  Ideally,
this workflow would be taken care of automatically by pushing the run
button in your IDE.
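
To make step 4 a little more concrete, here is a minimal sketch of the
overlay step in plain text rather than color.  Everything in it is an
assumption for illustration: the annotate() helper is hypothetical, and
the per-line labels are made up, since I don't know what format pypy
would actually emit its jit data in.

```python
def annotate(source, line_info):
    """Prefix each source line with a gutter label.

    line_info is a hypothetical mapping of 1-based line number to a
    short label (made-up stand-in for whatever the jit would report).
    """
    out = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        label = line_info.get(lineno, "")
        # Left-align the label in an 8-character gutter, then the code.
        out.append(f"{label:<8}|{text}")
    return "\n".join(out)


if __name__ == "__main__":
    code = "a = 1\nb = 2"
    print(annotate(code, {2: "boxed"}))
```

A real tool would swap the text gutter for colors or font weights, but
the data flow (source plus a per-line mapping) would be about the same.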

As a more specific example of what the gui would do in, for instance,
escape analysis mode:  display local variables that do not escape any
loops in green, others in red.  Hovering over a red variable would show
information about how, why, and where it escapes the loop in a tooltip
or bubble.  Selecting a red variable would show the same info in a pane and
would draw arrows on the source showing where it escapes from a
loop/function etc.
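
As a very rough sketch of the green/red split, the following uses the
standard ast module.  Note that this is not real escape analysis and
not what pypy's jit does: it is a crude syntactic stand-in that calls a
name "red" if it is assigned inside a loop and also mentioned outside
that loop in the same function, and "green" otherwise.  The function
name classify_loop_locals() is hypothetical.

```python
import ast


def classify_loop_locals(source):
    """Return {name: "green" | "red"} for names assigned inside loops.

    Crude approximation only: "red" means the name also appears
    somewhere outside the loop that assigns it.
    """
    tree = ast.parse(source)
    result = {}
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        all_names = [n for n in ast.walk(func) if isinstance(n, ast.Name)]
        for loop in ast.walk(func):
            if not isinstance(loop, (ast.For, ast.While)):
                continue
            loop_names = [n for n in ast.walk(loop) if isinstance(n, ast.Name)]
            inside = {id(n) for n in loop_names}
            assigned = {n.id for n in loop_names if isinstance(n.ctx, ast.Store)}
            for name in assigned:
                escapes = any(
                    n.id == name and id(n) not in inside for n in all_names
                )
                # "red" is sticky: once a name escapes one loop, keep it red.
                if result.get(name) != "red":
                    result[name] = "red" if escapes else "green"
    return result


EXAMPLE = """
def f(items):
    total = 0
    for x in items:
        tmp = x * 2
        total = total + tmp
    return total
"""

if __name__ == "__main__":
    print(classify_loop_locals(EXAMPLE))
```

Here x and tmp stay inside the loop and come out green, while total is
used before and after the loop and comes out red.  The "how, why, and
where" for the tooltip would need real data from the jit, which this
sketch does not attempt.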

In my ideal world, this profiling data analysis would sit side-by-side
with various display modes that show useful static analysis feedback,
all inside a full-fledged python IDE.

This is all, of course, a long way off still.  What I'm working on right
now is basic linting for python3 so that I can add a lint step to our
hudson server and start to get some graphs up.

What I _really_ would like to work on, if I had the time, is making pypy
support Python3 so that I could use it at work.  However, I think I'd
mostly just get in the way if I tried that, given my other time
commitments.

I hope there was something helpful in that brain-dump, but I suspect I
may be way off target at this point.

-Terrence