[Python-ideas] A General Outline for Just-in-Time Acceleration of Python

Joseph Martinot-Lagarde joseph.martinot-lagarde at m4x.org
Sat Jun 14 08:54:00 CEST 2014


Le 13/06/2014 08:21, David Mertz a écrit :
> There are many people with a better knowledge of PyPy than me; I've only
> looked at it and read some general white papers from the team.  But I do
> know with certainty that PyPy executes ALL Python code, not a restricted
> subset[*].  You might be thinking of Cython in terms of an "annotated,
> restricted, version of Python."

Cython compiles all python, it is not restricted.

On the other hand, pypy is not compatible with the C API of CPython, 
thus can not run compiled modules as is.

>
> However, PyPy itself is *written* in a restricted subset called RPython,
> which also might be what you are thinking of.  Well, much of PyPy is
> written that way, I'm pretty sure some of it is regular unrestricted
> Python too.  Other than the fact that PyPy isn't named, e.g. "PyRPy"
> this is just an implementation detail though--Jython is written in Java,
> Iron Python is written in C#, CPython is written in C, and PyPy is
> written in (R)Python.  They all run user programs the same though[**]
>
> [*] Yes, possibly there is some weird buggy corner case where it does
> the wrong thing, but if so, file a ticket to get it fixed.
>
> [**] For a suitably fuzzy meaning of "the same"--obviously performance
> characteristics are going to differ, as are things like access to
> external library calls, etc.  It's the same in terms of taking the
> same identical source files to run, in any case.
>
>
> On Thu, Jun 12, 2014 at 10:53 PM, Neil Girdhar
> <mistersheik at gmail.com
> <mailto:mistersheik at gmail.com>> wrote:
>
>     Well that's great to hear :)   I thought pypy only worked on a
>     restricted set of Python.  Does pypy save the optimization
>     statistics between runs?
>
>
>     On Fri, Jun 13, 2014 at 1:52 AM, David Mertz
>     <mertz at gnosis.cx
>     <mailto:mertz at gnosis.cx>> wrote:
>
>         Other a sprinkling of the word "stochastic" around this post
>         (why that word, not the more obvious "random"?), this basically
>         exactly describes what PyPy does.
>
>
>         On Thu, Jun 12, 2014 at 9:07 PM, Neil Girdhar
>         <mistersheik at gmail.com
>         <mailto:mistersheik at gmail.com>> wrote:
>
>             I was wondering what work is being done on Python to make it
>             faster.  I understand that cpython is incrementally
>             improved.  I'm not sure, but I think that pypy acceleration
>             works by compiling a restricted set of Python.  And I think
>             I heard something about Guido working on a different model
>             for accelerating Python.  I apologize in advance that I
>             didn't look into these projects in a lot of detail.  My
>             number one dream about computer languages is for me to be
>             able to write in a language as easy as Python and have it
>             run as quickly as if it were written.  I do believe that
>             this is possible (since in theory someone could look at my
>             Python code and port it to C++).
>
>             Unfortunately, I don't have time to work on this goal, but I
>             still wanted to get feedback about some ideas I have about
>             reaching this goal.
>
>             First, I don't think it's important for a "code block" (say,
>             a small section of code with less coupling to statements
>             outside the block than to within the block)   to run quickly
>             on its first iteration.
>
>             What I'm suggesting instead is for every iteration of a
>             "code block", the runtime stochastically decides whether to
>             collect statistics about that iteration.  Those statistics
>             include the the time running the block, the time perform
>             attribute accesses including type method lookups and so on.
>               Basically, the runtime is trying to guess the potential
>             savings of optimizing this block.
>
>             If the block is run many times and the potential savings are
>             large, then stochastically again, the block is promoted to a
>             second-level statistics collection.   This level collects
>             statistics about all of the external couplings of the block,
>             like the types and values of the passed-in and returned values.
>
>             Using the second-level statistics, the runtime can now guess
>             whether the block should be promoted to a third level
>             whereby any consistencies are exploited.  For example, if
>             the passed-in parameter types and return value type of the
>             "min" function are (int, int, int) for 40% of the statistics
>             and (float, float, float) for 59%, and other types for the
>             remaining 1%, then two precompiled versions of min are
>             generated: one for int and one for float.
>
>             These precompiled code blocks have different costs than
>             regular Python blocks.  They need to pay the following costs:
>             * a check for the required invariants (parameter types
>             above, but it could be parameter values, or other invariants)
>             * they need to install hooks on objects that must remain
>             invariant during the execution of the block; if the
>             invariants are ever violated during the execution of the
>             block, then all of the computations done during this
>             execution of the block must be discarded
>             * therefore a third cost is the probability of discarded the
>             computation times the average cost of the doing the wasted
>             computation.
>
>             The saving is that the code block
>             * can be transformed into a faster bytecode, which includes
>             straight assembly instructions in some sections since types
>             or values can now be assumed,
>             * can use data structures that make type or access
>             assumptions (for example a list that always contains ints
>             can use a flattened representation; a large set that is
>             repeatedly having membership checked with many negative
>             results might benefit from an auxiliary bloom filter, etc.)
>
>             In summary the runtime performs stochastic, incremental
>             promotion of code blocks from first-level, to second-level,
>             to multiple precompiled versions.  It can also demote a code
>             block. The difference between the costs of the different
>             levels is statistically estimated.
>
>             Examples of optimizations that can be commonly accomplished
>             using such a system are:
>             * global variables are folded into code as constants.  (Even
>             if they change rarely, you pay the discarding penalty
>             described above plus the recompilation cost; the benefit of
>             inline use of the constant (and any constant folding) might
>             outweigh these costs.)
>             * lookup of member functions, which almost never change
>             * flattening of homogeneously-typed lists
>
>             Best,
>
>             Neil
>
>             _______________________________________________
>             Python-ideas mailing list
>             Python-ideas at python.org
>             <mailto:Python-ideas at python.org>
>             https://mail.python.org/mailman/listinfo/python-ideas
>             Code of Conduct: http://python.org/psf/codeofconduct/
>
>
>
>
>         --
>         Keeping medicines from the bloodstreams of the sick; food
>         from the bellies of the hungry; books from the hands of the
>         uneducated; technology from the underdeveloped; and putting
>         advocates of freedom in prisons.  Intellectual property is
>         to the 21st century what the slave trade was to the 16th.
>
>
>
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>


---
Ce courrier électronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active.
http://www.avast.com




More information about the Python-ideas mailing list