python simply not scaleable enough for google?

J Kenneth King james at agentultra.com
Thu Nov 12 12:33:36 EST 2009


mcherm <mcherm at gmail.com> writes:

> On Nov 11, 7:38 pm, Vincent Manis <vma... at telus.net> wrote:
>> 1. The statement `Python is slow' doesn't make any sense to me.
>> Python is a programming language; it is implementations that have
>> speed or lack thereof.
>    [...]
>> 2. A skilled programmer could build an implementation that compiled
>> Python code into Common Lisp or Scheme code, and then used a
>> high-performance Common Lisp compiler...
>
> I think you have a fundamental misunderstanding of the reasons why
> Python is slow. Most of the slowness does NOT come from poor
> implementations: the CPython implementation is extremely
> well-optimized; the Jython and IronPython implementations use
> best-in-the-world JIT runtimes. Most of the speed issues come from
> fundamental features of the LANGUAGE itself, mostly ways in which it
> is highly dynamic.
>
> In Python, a piece of code like this:
>     len(x)
> needs to watch out for the following:
>     * Perhaps x is a list OR
>     * Perhaps x is a dict OR
>     * Perhaps x is a user-defined type that declares a __len__
>       method OR
>     * Perhaps a superclass of x declares __len__ OR
>     * Perhaps we are running the built-in len() function OR
>     * Perhaps there is a global variable 'len' which shadows the
>       built-in OR
>     * Perhaps there is a local variable 'len' which shadows the
>       built-in OR
>     * Perhaps someone has modified __builtins__
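
To make that concrete, here is a minimal, unscientific sketch of a few
of those cases; the class name is made up, and it assumes Python 3 for
the builtins module:

    class Tree:
        def __len__(self):            # a user-defined type declaring __len__
            return 42

    print(len(Tree()))                # 42: len() dispatches to __len__

    len = lambda obj: 0               # a global 'len' shadows the built-in
    print(len(Tree()))                # 0
    del len                           # drop the shadow; the built-in is back

    import builtins
    builtins.len = lambda obj: -1     # even the built-ins can be modified
    print(len(Tree()))                # -1

The compiler cannot rule any of these out at the call site.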
>
> In Python it is possible for other code, outside your module, to go
> in and modify or replace some methods from your module (a feature
> called "monkey-patching" which is SOMETIMES useful for certain kinds
> of testing).  There are just so many things that can be dynamic (even
> if 99% of the time they are NOT dynamic) that there is very little
> that the compiler can assume.
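
For instance, nothing stops one module from swapping out a function in
another module that everybody imports.  A tiny sketch, using the real
time module purely for illustration:

    import time

    real_time = time.time
    time.time = lambda: 0.0     # every caller of time.time now sees the fake
    print(time.time())          # 0.0
    time.time = real_time       # put the original back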
>
> So whether you implement it in C, compile to CLR bytecode, or
> translate into Lisp, the computer is still going to have to do a
> whole bunch of lookups to make certain that there isn't some monkey
> business going on, rather than simply reading a single memory
> location that contains the length of the list.  Brett Cannon's thesis
> is an example: he attempted desperate measures to perform some
> inferences that would allow performing these optimizations safely
> and, although a few of them could work in special cases, most of the
> hoped-for improvements were impossible because of the dynamic nature
> of the language.
>
> I have seen a number of attempts to address this, either by placing
> some restrictions on the dynamic nature of the code (but that would
> change the nature of the Python language) or by having some sort of a
> JIT optimize the common path where we don't monkey around.  Unladen
> Swallow and PyPy are two such efforts that I find particularly
> promising.
>
> But it isn't NEARLY as simple as you make it out to be.
>
> -- Michael Chermside

In a way, you might be right, but for the wrong reasons.

Python isn't slow because it's a dynamic language.  All the lookups
you're citing are highly optimized hash lookups.  They execute really
fast.
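
For instance, on CPython you can watch those lookups in the bytecode
(the exact opcodes vary by version; the comments show roughly what 2.x
emits):

    import dis

    def f(x):
        return len(x)

    dis.dis(f)
    # LOAD_GLOBAL   len    <- hash lookups: f's globals, then builtins
    # LOAD_FAST     x
    # CALL_FUNCTION 1      <- may still dispatch to x's __len__
    # RETURN_VALUE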

The OP is talking about scale.  Some people say Python is slow at a
certain scale; I say that's about true for any language.  Large
amounts of IO are a tough problem.

Where Python might get hit *as a language* is that the Python
programmer has to drop into C to implement optimized data structures
for dealing with the kind of IO that would slow down the Python
interpreter.  That's why we have numpy, scipy, etc.  The special cases
it takes to solve these problems with custom types weren't special
enough to alter the language itself.  Scale is a special case, believe
it or not.
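
To see why, compare a pure-Python reduction with the equivalent numpy
call (assuming numpy is installed); the loop pays interpreter overhead
on every element, while numpy makes one trip into optimized C:

    import numpy as np

    xs = range(1000000)
    total = 0
    for v in xs:                 # one interpreted iteration per element
        total += v

    arr = np.arange(1000000)
    total = arr.sum()            # a single call into a C loop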

As an implementation, though, the sky really is the limit and Python
is only getting started.  Give it another 40 years and it'll probably
realize that it's just another Lisp. ;)


