Why is it impossible to create a compiler than can compile Python to machinecode like C?
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Mon Mar 4 20:35:34 EST 2013
On Mon, 04 Mar 2013 16:36:36 +0000, Grant Edwards wrote:
> On 2013-02-28, kramer65 <kramerh at gmail.com> wrote:
>
>> I'm using Python for a while now and I love it. There is just one thing
>> I cannot understand. There are compilers for languages like C and C++.
>> why is it impossible to create a compiler that can compile Python code
>> to machinecode?
>
> The main issue is that python has dynamic typing. The type of object
> that is referenced by a particular name can vary, and there's no way (in
> general) to know at compile time what the type of object "foo" is.
>
> That makes generating object code to manipulate "foo" very difficult.
That's only a limitation with a static compiler that tries to generate
fast machine code at compile time. A JIT compiler can generate fast
machine code at runtime. The idea is that the time you save by running
more optimized code is greater than the time it costs to generate that
code each time you run. If not, then you just run the normal byte-code
you would have run, and you've lost very little.
This has been a very successful approach. Psyco worked very well, and
it's successor PyPy seems to work even better. PyPy, on average, runs
about 5-6 times faster than CPython, and for some tasks about 50 times
faster. That makes it broadly speaking as fast as Java and approaching C,
and even in some circumstances even beat static C compilers. (Admittedly
only under very restrictive circumstances.)
The downside is that JIT compilers need a lot of memory. To oversimplify,
you might take source code like this:
x = a + b
A static compiler knows what types a and b are, and can turn it into a
single fairly compact piece of machine code. Using my own crappy invented
machine code notation:
ADD a, b
STORE x
A JIT compiler has to generate a runtime check and one or more fast
branches, plus a fallback:
CASE (a and b are both ints):
ADD a, b
STORE x
CASE (a and b are both strings):
COPY a, x
CONCATENATE b, x
OTHERWISE:
execute byte code to add two arbitrary objects
The above is probably nothing like the way PyPy actually works. Anyone
interested in how PyPy really works should spend some time reading their
website and blog.
http://pypy.org/
I can especially recommend this to give you a flavour for just how
complicated this sort of thing can become:
http://morepypy.blogspot.com.au/2011/01/loop-invariant-code-motion.html
--
Steven
More information about the Python-list
mailing list