
I would like to direct your attention to a presentation that was given by Mr. Herb Sutter on the subject of concurrency support in programming languages. It's a long video, but I would urge you to watch at least the first five minutes of it:

http://video.google.com/videoplay?docid=7625918717318948700&q=herb+sutter

Mr. Sutter is the secretary of the ISO/C++ standards committee. I mention this only so that you'll understand he's not just some random yo-yo with an opinion. (There are a number of other papers by Mr. Sutter on the same topic which you can google for; however, none of them are as compelling or comprehensive as the video presentation, IMHO.)

The gist of the introduction is this (my words, not his): starting from now, and for the rest of your career as a programmer, you had better start programming for massively concurrent systems or you are doomed to irrelevance.

As he points out in the talk, the good news is that Moore's law is going to continue for several decades to come. The bad news is that most people don't understand Moore's law or what they get from it. Moore's law is really about the number of transistors on a chip; it says nothing about processor speed or performance. He calls it "the end of the free lunch", meaning that in the past you could write a program, wait a little while, and it would run faster - because the hardware got faster while you were waiting. This will no longer be true for single-threaded programs. Scalar processors, with all of their pipelining, prefetching and other tricks, have gotten about as fast as they are ever going to get; we've reached the point of diminishing returns. So the only thing left to do is throw more and more cores on the chip. There's no physical reason why the number of cores won't continue to double every 18 months; the only reason why manufacturers might choose not to make 128-core CPUs by the year 2012 is that there won't be enough software to take advantage of that degree of parallelism. (Even with today's transistor count, you could put 100 Pentium-class processors on a single die.)

Moreover, programming languages which have built-in support for scaling to many processors are going to win in the long run, and programming languages that don't have strong, built-in concurrency support will gradually fade away. The reason for this is simple: people don't like using applications which don't get faster when they buy a better computer.

Now, on the server side, all of this is a solved problem. We have a very robust model for handling hundreds of requests in parallel, and a very strong model for dealing with shared state. This model is called "transactions", or more formally, ACID.

On the client side, however, things are much worse. As he puts it, locks are the best we have, and locks suck. Locks are not "logically composable", in the sense that you can't simply take two pieces of software that were written independently and glue them together unless they share a common locking architecture. This ruins a large part of the power of modern software, which is the ability to compose software out of smaller, independently written components. On the client side, you have heterogeneous components accessing shared state at high bandwidth, with myriad pointer-based references to that shared state. Add concurrency, and what you end up with is a system in which it is impossible to make logical inferences about the behavior of the system. (And before you ask - yes, functional languages - and their drawbacks - are addressed in the talk. The same is true for 'lock-free programming' and other techniques du jour.)
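To make the composability point concrete, here is a small sketch of my own (not something from the talk) in present-day Python. The Account class is invented purely for illustration; each instance is internally consistent, guarding its own state with its own lock, and yet composing two of them can deadlock:

    import threading
    import time

    class Account:
        """Each account guards its own state with its own internal lock."""
        def __init__(self, balance):
            self._lock = threading.Lock()
            self.balance = balance

        def transfer_to(self, other, amount):
            # Locally this looks reasonable: take my lock, then the other
            # account's lock. But two concurrent transfers in opposite
            # directions acquire the same pair of locks in opposite orders,
            # which is a classic deadlock.
            with self._lock:
                time.sleep(0.1)      # widen the window so the hazard shows up
                with other._lock:
                    self.balance -= amount
                    other.balance += amount

    a, b = Account(100), Account(100)
    t1 = threading.Thread(target=a.transfer_to, args=(b, 10), daemon=True)
    t2 = threading.Thread(target=b.transfer_to, args=(a, 10), daemon=True)
    t1.start(); t2.start()
    t1.join(timeout=2); t2.join(timeout=2)
    print("deadlocked" if t1.is_alive() or t2.is_alive() else "completed")

The usual fix is a lock-ordering convention shared by every component in the program, which is exactly the "common locking architecture" that independently written components don't have.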
Now, in the talk he proposes some strategies for building concurrency into the language so that client-side programming can be done in a way that isn't rocket science. We won't be able to get rid of locks, he claims, but with the right language support at least we'll be able to reason about them, and maybe push them off to the side so that they mostly stay in their corner.

In any case, what I would like to open up is a discussion of how these ideas might influence the design of the Python language.

One thing that is important to understand is that I'm not talking about "automatic parallelization", where the compiler automatically figures out which parts can be parallelized. That would require so much change to the Python language as to make it no longer Python. Nor am I talking about manually creating "thread" objects and having to carefully place protections around any shared state. That kind of parallelism is too low-level, too coarse-grained, and too hard to do, even for people who think they know how to write concurrent code.

I am not even necessarily talking about changing the Python language (although certainly the internals of the interpreter will need to change). The same language can be used to describe the same kinds of problems and operations, but the implications of those language elements will change. This is analogous to the fact that the massively multicore CPUs of 10 years from now will most likely be using the same instruction sets as today - but that does not mean that programming as a discipline will be anything like what it is now.

As an example of what I mean, suppose the Python 'with' statement could be used to indicate an atomic transaction: any operations that were done within that block would either be committed or rolled back. (A toy sketch of what I mean appears near the end of this post.) This is not a new idea - here's an interesting paper on it:

http://www-sal.cs.uiuc.edu/~zilles/tm.html

I'm sure that there's a lot more like that out there. However, there is also a lot of stuff out there that is *superficially* similar to what I am talking about, and I want to make the distinction clear. For example, any discussion of concurrency in Python will naturally raise the topic of both IPython and Stackless. However, IPython (from what I understand) is really about distributed computing and not so much about fine-grained concurrency; and Stackless (from what I understand) is really about coroutines or continuations, which is a different kind of concurrency. Unless I am mistaken (and I very well could be), neither of these is immediately applicable to the problem of authoring Python programs for multi-core CPUs, but I think that both of them contain valuable ideas that are worth discussing.

Now, I will be the first to admit that I am not an expert in these matters. Don't let my poor, naive ideas be a limit to what is discussed. What I've primarily tried to do in this posting is to get your attention and convince you that this is important and worth talking about. And I hope to be able to learn something along the way.
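Returning to the 'with'-statement idea above: the following is a toy sketch of the semantics I have in mind, not a proposal for how it would be implemented. The atomic() helper is invented for this example, and a real version would need interpreter-level support to give isolation between threads rather than just rollback on error:

    from contextlib import contextmanager
    import copy

    @contextmanager
    def atomic(*objects):
        # Hypothetical: snapshot the participating objects on entry...
        snapshots = [copy.deepcopy(obj.__dict__) for obj in objects]
        try:
            yield                    # ...run the body of the 'with' block...
        except Exception:
            # ...and restore the snapshots if anything in it raised, so the
            # block's effects are all-or-nothing.
            for obj, snap in zip(objects, snapshots):
                obj.__dict__.clear()
                obj.__dict__.update(snap)
            raise

    class Record:
        def __init__(self, value):
            self.value = value

    r1, r2 = Record(100), Record(100)
    try:
        with atomic(r1, r2):
            r1.value -= 150
            if r1.value < 0:
                raise ValueError("overdrawn")   # triggers the rollback
            r2.value += 150
    except ValueError:
        pass
    print(r1.value, r2.value)   # both still 100: the partial update was undone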
My overall goal here is to be able to continue writing programs in Python 10 years from now, not just as a hobby, but as part of my professional work. If Python is able to leverage the power of the CPUs that are being created at that time, I will be able to make a strong case for doing so. On the other hand, if I have a 128-core CPU on my desk, and Python is only able to utilize 1/128th of that power without resorting to tedious and error-prone reasoning about race conditions and deadlocks, then it's likely that my Python programming will be relegated to the role of a hobby.

-- Talin