[Python-ideas] A General Outline for Just-in-Time Acceleration of Python
Joseph Martinot-Lagarde
joseph.martinot-lagarde at m4x.org
Sat Jun 14 08:54:00 CEST 2014
On 13/06/2014 08:21, David Mertz wrote:
> There are many people with a better knowledge of PyPy than me; I've only
> looked at it and read some general white papers from the team. But I do
> know with certainty that PyPy executes ALL Python code, not a restricted
> subset[*]. You might be thinking of Cython in terms of an "annotated,
> restricted, version of Python."
Cython compiles all Python; it is not restricted.
On the other hand, PyPy is not compatible with the C API of CPython and
therefore cannot run compiled extension modules as is.
>
> However, PyPy itself is *written* in a restricted subset called RPython,
> which also might be what you are thinking of. Well, much of PyPy is
> written that way; I'm pretty sure some of it is regular unrestricted
> Python too. Other than the fact that PyPy isn't named, e.g., "PyRPy",
> this is just an implementation detail though--Jython is written in Java,
> IronPython is written in C#, CPython is written in C, and PyPy is
> written in (R)Python. They all run user programs the same though[**].
>
> [*] Yes, possibly there is some weird buggy corner case where it does
> the wrong thing, but if so, file a ticket to get it fixed.
>
> [**] For a suitably fuzzy meaning of "the same"--obviously performance
> characteristics are going to differ, as are things like access to
> external library calls, etc. It's the same in terms of taking the
> same identical source files to run, in any case.
>
>
> On Thu, Jun 12, 2014 at 10:53 PM, Neil Girdhar
> <mistersheik at gmail.com> wrote:
>
> Well that's great to hear :) I thought PyPy only worked on a
> restricted set of Python. Does PyPy save the optimization
> statistics between runs?
>
>
> On Fri, Jun 13, 2014 at 1:52 AM, David Mertz
> <mertz at gnosis.cx> wrote:
>
> Other than a sprinkling of the word "stochastic" around this post
> (why that word, not the more obvious "random"?), this basically
> describes exactly what PyPy does.
>
>
> On Thu, Jun 12, 2014 at 9:07 PM, Neil Girdhar
> <mistersheik at gmail.com> wrote:
>
> I was wondering what work is being done on Python to make it
> faster. I understand that CPython is incrementally
> improved. I'm not sure, but I think that PyPy acceleration
> works by compiling a restricted set of Python. And I think
> I heard something about Guido working on a different model
> for accelerating Python. I apologize in advance that I
> didn't look into these projects in a lot of detail. My
> number one dream about computer languages is for me to be
> able to write in a language as easy as Python and have it
> run as quickly as if it were written in C++. I do believe that
> this is possible (since in theory someone could look at my
> Python code and port it to C++).
>
> Unfortunately, I don't have time to work on this goal, but I
> still wanted to get feedback about some ideas I have about
> reaching this goal.
>
> First, I don't think it's important for a "code block" (say,
> a small section of code with less coupling to statements
> outside the block than to within the block) to run quickly
> on its first iteration.
>
> What I'm suggesting instead is that for every iteration of a
> "code block", the runtime stochastically decides whether to
> collect statistics about that iteration. Those statistics
> include the time spent running the block, the time spent performing
> attribute accesses including type method lookups, and so on.
> Basically, the runtime is trying to guess the potential
> savings of optimizing this block.
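
A minimal sketch of what that first-level sampling could look like, written
as an ordinary decorator around a function standing in for a "code block".
The names, the per-block statistics table, and the 1% sampling rate are
illustrative assumptions, not an existing CPython or PyPy mechanism:

    import random
    import time
    from collections import defaultdict

    # Hypothetical first-level store: block id -> [calls, samples, sampled time]
    _first_level = defaultdict(lambda: [0, 0, 0.0])
    SAMPLE_PROBABILITY = 0.01  # assumed: measure roughly 1% of iterations

    def sample_block(block_id):
        """Stochastically time a 'code block' (modelled here as a function)."""
        def decorator(func):
            def wrapper(*args, **kwargs):
                stats = _first_level[block_id]
                stats[0] += 1
                if random.random() >= SAMPLE_PROBABILITY:
                    return func(*args, **kwargs)   # usual case: no measurement
                start = time.perf_counter()
                result = func(*args, **kwargs)
                stats[1] += 1
                stats[2] += time.perf_counter() - start  # time spent in the block
                return result
            return wrapper
        return decorator
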
>
> If the block is run many times and the potential savings are
> large, then, again stochastically, the block is promoted to a
> second-level statistics collection. This level collects
> statistics about all of the external couplings of the block,
> like the types and values of the passed-in and returned values.
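
Second-level collection could, for example, count the tuple of argument
types together with the return type on each sampled execution. A sketch
under the same illustrative assumptions as above (nothing here is an
existing API):

    from collections import Counter, defaultdict

    # Hypothetical second-level store: block id -> Counter of
    # ((argument types), return type) signatures
    _second_level = defaultdict(Counter)

    def record_coupling(block_id, args, result):
        """Record the external type signature of one sampled execution."""
        signature = (tuple(type(a) for a in args), type(result))
        _second_level[block_id][signature] += 1
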
>
> Using the second-level statistics, the runtime can now guess
> whether the block should be promoted to a third level
> whereby any consistencies are exploited. For example, if
> the passed-in parameter types and return value type of the
> "min" function are (int, int, int) for 40% of the statistics
> and (float, float, float) for 59%, and other types for the
> remaining 1%, then two precompiled versions of min are
> generated: one for int and one for float.
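
Read as code, that example amounts to a type-dispatching wrapper in front
of two specialized bodies. In the sketch below the "precompiled" versions
are plain Python functions standing in for generated code, so it only
illustrates the dispatch, not the speed-up itself:

    def _min3_int(a, b, c):
        # Specialized for (int, int, int); generated code could use
        # unboxed machine integers here.
        return a if a <= b and a <= c else (b if b <= c else c)

    def _min3_float(a, b, c):
        # Specialized for (float, float, float).
        return a if a <= b and a <= c else (b if b <= c else c)

    def min3(a, b, c):
        types = (type(a), type(b), type(c))
        if types == (int, int, int):         # ~40% of observed calls
            return _min3_int(a, b, c)
        if types == (float, float, float):   # ~59% of observed calls
            return _min3_float(a, b, c)
        return min(a, b, c)                  # remaining ~1%: generic fallback
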
>
> These precompiled code blocks have different costs than
> regular Python blocks. They need to pay the following costs:
> * a check for the required invariants (parameter types
> above, but it could be parameter values, or other invariants)
> * they need to install hooks on objects that must remain
> invariant during the execution of the block; if the
> invariants are ever violated during the execution of the
> block, then all of the computations done during this
> execution of the block must be discarded
> * therefore a third cost is the probability of discarding the
> computation times the average cost of doing the wasted
> computation.
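
A toy illustration of the first and third costs: a guard check on entry,
and a generic re-run if an invariant hook fires mid-execution. GuardFailed
stands in for whatever signal the hooks would really use, and a real
implementation would also have to roll back the discarded work, which this
sketch glosses over:

    class GuardFailed(Exception):
        """Raised by a hook when an assumed invariant stops holding."""

    def run_specialized(fast_block, generic_block, args, guards_hold):
        """Run the specialized block under its guards; fall back if they fail."""
        if not guards_hold(args):        # cost 1: the up-front invariant check
            return generic_block(*args)
        try:
            return fast_block(*args)     # hooks inside may raise GuardFailed
        except GuardFailed:
            # cost 3: the specialized work is discarded and redone generically
            return generic_block(*args)
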
>
> The savings are that the code block
> * can be transformed into a faster bytecode, which includes
> straight assembly instructions in some sections since types
> or values can now be assumed,
> * can use data structures that make type or access
> assumptions (for example, a list that always contains ints
> can use a flattened representation; a large set whose
> membership is repeatedly checked with many negative
> results might benefit from an auxiliary Bloom filter, etc.)
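
For the data-structure point, the standard library already shows what a
flattened all-int representation buys: array.array stores machine integers
contiguously instead of pointers to boxed objects. The runtime does not
make this switch automatically; the sketch only illustrates the
representation difference a JIT could exploit:

    from array import array

    boxed = list(range(1000000))        # list of pointers to boxed int objects
    flat = array('q', range(1000000))   # contiguous signed 64-bit integers

    # Same contents, but the flat form is far more compact and cache-friendly.
    assert boxed[123456] == flat[123456]
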
>
> In summary, the runtime performs stochastic, incremental
> promotion of code blocks from the first level, to the second level,
> to multiple precompiled versions. It can also demote a code
> block. The difference between the costs of the different
> levels is statistically estimated.
>
> Examples of optimizations that can be commonly accomplished
> using such a system are:
> * global variables are folded into code as constants. (Even
> if they change rarely, you pay the discarding penalty
> described above plus the recompilation cost; the benefit of
> inline use of the constant (and any constant folding) might
> outweigh these costs.)
> * lookup of member functions, which almost never change
> * flattening of homogeneously-typed lists
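
The member-function case is essentially an inline cache keyed on the
receiver's type. A minimal sketch of that idea in plain Python; it
deliberately ignores classes being mutated at runtime, which is exactly
the invariant a real implementation would have to guard:

    _method_cache = {}  # (type, method name) -> function found on the class

    def cached_call(obj, name, *args):
        """Call obj.name(*args), doing the class lookup once per (type, name)."""
        key = (type(obj), name)
        func = _method_cache.get(key)
        if func is None:
            func = getattr(type(obj), name)  # the lookup we want to avoid repeating
            _method_cache[key] = func
        return func(obj, *args)
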
>
> Best,
>
> Neil
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons. Intellectual property is
> to the 21st century what the slave trade was to the 16th.
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>