python simply not scaleable enough for google?
steve at REMOVE-THIS-cybersource.com.au
Fri Nov 13 03:50:33 CET 2009
On Thu, 12 Nov 2009 21:02:11 +0100, Alf P. Steinbach wrote:
> Specifically, I reacted to the statement that <<it is sheer nonsense to
> talk about "the" speed of an implementation>>, made in response to
> someone upthread, in the context of Google finding CPython overall too
> slow.
>
> It is quite slow. ;-)
Quite slow to do what? Quite slow compared to what?
I think you'll find using CPython to sort a list of ten million integers
will be quite a bit faster than using bubblesort written in C, no matter
how efficient the C compiler.
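A rough back-of-envelope sketch of why the algorithm dominates here. It only counts comparisons and ignores constant factors, so treat the ratio as an order-of-magnitude estimate, not a benchmark. The variable names are illustrative; the O(N log N) figure is the usual bound for a comparison sort such as the one behind CPython's list.sort().

```python
import math

n = 10_000_000  # ten million integers, as in the example above

# A plain bubblesort compares every pair in the worst case: N*(N-1)/2.
bubble_comparisons = n * (n - 1) // 2

# An O(N log N) comparison sort needs on the order of N*log2(N) comparisons.
nlogn_comparisons = n * math.log2(n)

# How many times more comparisons the quadratic algorithm performs.
ratio = bubble_comparisons / nlogn_comparisons

print(f"bubblesort : {bubble_comparisons:.3e} comparisons")
print(f"N log N    : {nlogn_comparisons:.3e} comparisons")
print(f"ratio      : {ratio:.0f}x")
```

A per-comparison speed advantage for compiled C would have to exceed that ratio before the bubblesort could come out ahead.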
And why are we limiting ourselves to integers representable by the native
C int? What if the items in the list were of the order of 2**100000? Or
if they were mixed integers, fractions, fixed-point decimals, and
floating-point binaries? How fast is your C code going to be now? That's
going to depend on the C library you use, isn't it? In other words, it is
an *implementation* issue, not a *language* issue.
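A minimal sketch of the same point in Python 3, where sorting very large or mixed numeric values needs no extra library beyond the standard one. The specific values are made up for illustration, and the mixed list uses 2**100 rather than 2**100000 purely to keep the example small.

```python
from decimal import Decimal
from fractions import Fraction

# Integers far beyond the range of a native C int sort like any others.
huge = sorted([2**100000 + 1, 2**100000])

# Illustrative mixed values: a big int, a fraction, a fixed-point decimal,
# a binary float, and a small int.
items = [2**100, Fraction(1, 3), Decimal("0.5"), 0.25, 7]

# Python 3 compares these numeric types with each other by value.
ordered = sorted(items)
```

How fast this runs is a property of the implementation's number types and sort routine, which is the distinction being drawn above.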
Okay, let's keep it simple. Stick to numbers representable by native C
ints. Around this point, people start complaining that it's not fair, I'm
not comparing apples with apples. Why am I comparing a highly-optimized,
incredibly fast sort method in CPython with a lousy O(N**2) algorithm in
C? To make meaningful comparisons, you have to make sure the algorithms
are the same, so the two language implementations do the same amount of
work. (Funnily enough, it's "unfair" to play to Python's strengths, and
"fair" to play to C's strengths.)
Then people invariably try to compare (say) something in C involving
low-level bit-twiddling or pointer arithmetic with something in CPython
involving high-level object-oriented programming. Of course CPython is
"slow" if you use it to do hundreds of times more work in every operation
-- that's comparing apples with oranges again, but somehow people think
that's okay when your intention is to prove "Python is slow".
An apples-to-apples comparison would be to use a framework in C which
offered the same features as Python: readable syntax ("executable
pseudo-code"), memory management, garbage disposal, high-level objects,
message passing, exception handling, dynamic strong typing, and no core
dumps.
If you did that, you'd get something that runs much closer to the speed
of CPython, because that's exactly what CPython is: a framework written
in C that provides all those extra features.
(That's not to say that Python-like high-level languages can't, in
theory, be significantly faster than CPython, or that they can't have JIT
compilers that emit highly efficient -- in space or time -- machine code.
That's what Psyco does, now, and that's the aim of PyPy.)
However, there is one sense in which Python *the language* is slower than
(say) C the language. Python requires that an implementation treat the
built-in function (say) int as an object subject to modification by the
caller, while C requires that it is a reserved word. So when a C compiler
sees "int", it can optimize the call to a known low-level routine, while
a Python compiler can't make this optimization. It *must* search the
entire scope looking for the first object called 'int' it finds, then
search the object's scope for a method called '__call__', then execute
that. Those are the rules for Python, and an implementation that does
something else isn't Python. Even though the searching is highly
optimized, if you call int() one million times, any Python implementation
*must* perform that search one million times, which adds up. For N calls,
merely identifying what function to call is O(N) work at runtime for
Python and O(1) work at compile time for C.
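A small sketch of the runtime lookup described above. The helper name call_int_via_namespace and the function parse are hypothetical, invented for this illustration. A private namespace stands in for a module's globals, so the example does not disturb the real built-in.

```python
def call_int_via_namespace():
    """Call int("42") before, during and after the name is shadowed."""
    # 'namespace' plays the role of a module's global scope.
    namespace = {}
    exec("def parse(text):\n    return int(text)\n", namespace)
    parse = namespace["parse"]

    # No global called 'int' yet, so the lookup falls through to builtins.
    before = parse("42")

    # Rebind the name in the enclosing scope; parse() itself is unchanged.
    namespace["int"] = lambda text: "shadowed"
    during = parse("42")

    # Remove the shadow and the built-in is found again.
    del namespace["int"]
    after = parse("42")

    return before, during, after


print(call_int_via_namespace())
```

The body of parse() never changes, yet its result does, which is why the name cannot simply be resolved once at compile time.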
Note though that JIT compilers like Psyco can often take shortcuts and
speed up code by a factor of 2, or up to 100 in the best cases, which
brings the combination of CPython + Psyco within shouting distance of the
speed of the machine code generated by good optimizing C compilers. Or
you can pass the work onto an optimized library or function call that
avoids the extra work. Like I said, there is no reason for Python
*applications* to be slow.
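A hedged sketch of two common ways to avoid repeating that work in plain Python 3. The function names and data are made up for illustration. No timings are claimed here, since the size of any gain depends on the interpreter version and the workload; the standard timeit module can measure it.

```python
def total_with_loop(strings):
    total = 0
    for s in strings:
        total += int(s)      # the name 'int' is resolved on every iteration
    return total


def total_with_local(strings):
    to_int = int             # resolve the name once, keep it in a local
    total = 0
    for s in strings:
        total += to_int(s)
    return total


def total_with_builtins(strings):
    # 'int' is resolved once; the loop itself runs inside sum() and map().
    return sum(map(int, strings))


data = [str(i) for i in range(1000)]
results = (total_with_loop(data), total_with_local(data),
           total_with_builtins(data))
```

All three compute the same total, so the choice between them is purely about how much lookup and loop overhead is left in Python-level code.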