python simply not scaleable enough for google?

Fri Nov 13 01:20:11 EST 2009

When I was approximately 5, everybody knew that higher-level languages were too slow for high-speed numeric computation (I actually didn't know that then, I was too busy watching Bill and Ben the Flowerpot Men), and therefore assembly languages were mandatory. Then IBM developed Fortran, and higher-level languages were no longer too slow for numeric computation. 

When I was in university, IBM released a perfectly horrible implementation of PL/I, which dynamically allocated and freed stack frames for each procedure entry and exit (`Do Not Use Procedures: They Are Inefficient': section heading from the IBM PL/I (G) Programmer's Guide, circa 1968). Everyone knew PL/I was an abomination of a language, which could not be implemented efficiently. Then MIT/Bell Labs/GE/Honeywell wrote Multics in a PL/I subset, and (eventually) it ran quite efficiently. 

When Bell Labs pulled out of the Multics effort, some of their researchers wrote the first version of Unix in assembly language, but a few years later rewrote the kernel in C. Their paper reporting this included a sentence that said in effect, `yes, the C version is bigger and slower than the assembler version, but it has more functionality, so C isn't so bad'. Everybody knew that high-level languages were too inefficient to write an operating system in (in spite of the fact that Los Alamos had already written an OS in a Fortran dialect). Nobody knew that at about that time, IBM had started writing new OS modules in a company-confidential PL/I subset. 

When I was in grad school, everybody knew that an absolute defence to a student project running slowly was `I wrote it in Lisp'; we only had a Lisp interpreter running on our system. We didn't have MacLisp, which had been demonstrated to compile carefully-written numerical programs into code that ran more efficiently than comparable programs compiled by DEC's PDP-10 Fortran compiler in optimizing mode. 

In an earlier post, I mentioned SBCL and Chez Scheme, highly optimizing compiler-based implementations of Common Lisp and Scheme, respectively. I don't have numbers for SBCL, but I know that (again with carefully-written Scheme code) Chez Scheme can produce code that runs within the same order of magnitude as optimized C code. These are both very old systems that, at least in the case of Chez Scheme, use techniques that have been reported in the academic literature. My point in the earlier post about translating Python into Common Lisp or Scheme was essentially `look, there's more than 30 years of experience building high-performance implementations of Lisp languages, and Python isn't really that different from Lisp, so we ought to be able to do it too'. 

All of which leads me to summarize the current state of things. 

1. Current Python implementations may or may not be performance-scalable in the ways we need. 

2. Reorganized interpreters may give us a substantial improvement in performance. More significant improvements would require a JIT compiler, and there are good projects such as Unladen Swallow that may well deliver a substantial improvement. 

3. We might also get improvements from good use of Python 3 annotations, or other pragma-style constructs that might be added to the language after the moratorium, which would give a compiler additional information about the programmer's intent. (For example, Scheme has a set of functions that essentially allow a programmer to say, `I am doing integer arithmetic with values that are limited in range to what can be stored in a machine word'.) These annotations wouldn't destroy the dynamic nature of Python, because they are purely optional. This type of language feature would allow a programmer to exploit the high-performance compilation technologies that are common in the Lisp world. 
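
To make point (3) a little more concrete, here is a minimal sketch of what an annotated function looks like in Python 3. The function name `dot` and the choice of `int` as the hint are purely illustrative assumptions. CPython itself stores the annotations and otherwise ignores them; the idea that a compiler could read them as optimization hints is the speculation in point (3), not something any current implementation is claimed to do.

```python
# Hypothetical example: annotations as optional hints.
# CPython records them but does not enforce or exploit them.
def dot(xs: list, ys: list) -> int:
    """Integer dot product of two equal-length sequences."""
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


if __name__ == "__main__":
    # The hints are ordinary data that a tool or compiler could inspect.
    print(dot.__annotations__)
    # The function stays fully dynamic: floats work too, despite the hint.
    print(dot([1.5, 2.0], [2.0, 4.0]))
```

Because nothing checks the hints, removing them leaves the program's behaviour unchanged, which is what `purely optional' means here.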

Even though points (2) and (3) between them offer a great deal of hope for future Python implementations, there is much that can be done with our current implementations. Just ask the programmer who writes a loop that laboriously does what could be done much more quickly with a list comprehension or with map. 
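
As a rough illustration of that last remark, the sketch below builds the same list three ways and times each with the standard `timeit` module. The function names and the problem size are my own assumptions, and the relative timings depend on the interpreter and version, so treat it as something to measure on your own system rather than as a benchmark result.

```python
import timeit


def squares_loop(n):
    """Build the list with an explicit loop and repeated append calls."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]


def squares_map(n):
    """Build the same list with map (wrapped in list() for Python 3)."""
    return list(map(lambda i: i * i, range(n)))


if __name__ == "__main__":
    n = 100_000  # arbitrary size chosen for illustration
    for fn in (squares_loop, squares_comprehension, squares_map):
        seconds = timeit.timeit(lambda: fn(n), number=20)
        print(f"{fn.__name__}: {seconds:.3f} s for 20 runs")
```

All three functions return identical lists; only the way the loop is expressed differs.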

-- v