python simply not scaleable enough for google?

mcherm mcherm at gmail.com
Thu Nov 12 16:07:23 CET 2009


On Nov 11, 7:38 pm, Vincent Manis <vma... at telus.net> wrote:
> 1. The statement `Python is slow' doesn't make any sense to me.
> Python is a programming language; it is implementations that have
> speed or lack thereof.
   [...]
> 2. A skilled programmer could build an implementation that compiled
> Python code into Common Lisp or Scheme code, and then used a
> high-performance Common Lisp compiler...

I think you have a fundamental misunderstanding of the reasons why Python is
slow. Most of the slowness does NOT come from poor implementations: the
CPython implementation is extremely well-optimized; the Jython and IronPython
implementations use best-in-the-world JIT runtimes. Most of the speed issues
come from fundamental features of the LANGUAGE itself, mostly ways in which
it is highly dynamic.

In Python, a piece of code like this:
    len(x)
needs to watch out for the following:
    * Perhaps x is a list OR
      * Perhaps x is a dict OR
      * Perhaps x is a user-defined type that declares a __len__ method OR
      * Perhaps a superclass of x declares __len__ OR
    * Perhaps we are running the built-in len() function OR
      * Perhaps there is a global variable 'len' which shadows the built-in OR
      * Perhaps there is a local variable 'len' which shadows the built-in OR
      * Perhaps someone has modified __builtins__

In Python it is possible for other code, outside your module, to go in and
modify or replace some methods from your module (a feature called
"monkey-patching", which is SOMETIMES useful for certain kinds of testing).
There are just so many things that can be dynamic (even if 99% of the time
they are NOT dynamic) that there is very little that the compiler can assume.
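
A minimal sketch of monkey-patching, assuming a made-up Inventory class
standing in for "your module": code that never touches the class definition
replaces one of its methods at runtime, and an unchanged caller sees the new
behavior immediately.

```python
class Inventory:
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


def report(inv):
    # This function is never edited, yet its result changes below.
    return f"{len(inv)} items"


inv = Inventory(["apple", "pear"])
before = report(inv)

# "Other code" (a test, say) swaps the method out on the class.
original_method = Inventory.__len__
Inventory.__len__ = lambda self: 0
patched = report(inv)

# Put the original back, as a well-behaved test would.
Inventory.__len__ = original_method
after = report(inv)
```

Because the patch can happen at any moment, a compiler looking at report()
cannot safely bake in which __len__ will run.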

So whether you implement it in C, compile to CLR bytecode, or translate into
Lisp, the computer is still going to have to do a whole bunch of lookups to
make certain that there isn't some monkey business going on, rather than
simply reading a single memory location that contains the length of the list.
Brett Cannon's thesis is an example: he attempted desperate measures to
perform some inferences that would allow performing these optimizations
safely and, although a few of them could work in special cases, most of the
hoped-for improvements were impossible because of the dynamic nature of the
language.

I have seen a number of attempts to address this, either by placing some
restrictions on the dynamic nature of the code (but that would change the
nature of the Python language) or by having some sort of a JIT optimize the
common path where we don't monkey around. Unladen Swallow and PyPy are two
such efforts that I find particularly promising.
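
To illustrate the "optimize the common path" idea in the loosest possible
way, here is a toy sketch in plain Python. It is NOT how Unladen Swallow or
PyPy actually work; make_guarded_len and its module_globals argument are
hypothetical names invented for this example. The idea is only that cheap
guards check "nobody has monkeyed around", the fast path runs when they
hold, and everything else falls back to the fully dynamic lookup.

```python
import builtins

_ORIGINAL_LEN = builtins.len


def make_guarded_len(module_globals):
    """Return (guarded_len, stats) for a namespace dict standing in for a module."""
    stats = {"fast": 0, "slow": 0}

    def guarded_len(x):
        # Guards: no global shadow, built-in untouched, x is exactly a list.
        if ("len" not in module_globals
                and builtins.len is _ORIGINAL_LEN
                and type(x) is list):
            stats["fast"] += 1
            # Stand-in for "read the one memory location holding the length".
            return list.__len__(x)
        # A guard failed: do the general lookup (globals, then builtins).
        stats["slow"] += 1
        return module_globals.get("len", builtins.len)(x)

    return guarded_len, stats


fake_globals = {}
glen, stats = make_guarded_len(fake_globals)

a = glen([1, 2, 3])                     # guards hold: fast path
b = glen({"k": 1})                      # not a list: slow path
fake_globals["len"] = lambda obj: 99    # someone shadows 'len'
c = glen([1, 2, 3])                     # guard fails: slow path, sees the shadow
```

Even in this toy, the guards themselves cost something on every call, and a
real implementation has many more things to guard against than three.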

But it isn't NEARLY as simple as you make it out to be.

-- Michael Chermside


