Re: [Python-Dev] Making python C-API thread safe (try 2)
Phillip J. Eby wrote:
At 08:47 PM 9/11/03 +0300, Harri Pesonen wrote:
But my basic message is this: Python needs to be made thread safe. Making the individual interpreters thread safe is trivial, and benefits many people, and is a necessary first step;
It's far from trivial - you're talking about invalidating every piece of C code written for Python over a multi-year period by dozens upon dozens of extension authors.
The change is trivial in Python C API. I already said that it would break everything outside the Python distribution, but the change in other applications is also trivial.
It doesn't benefit many people: only those using isolated interpreters embedded in a multithreaded C program.
I don't know how many people are writing threads in Python, either. I guess not that many. In my case I only need a thread-safe interpreter; I don't create threads in Python code. So just having what I described would be enough for me: no need for a global interpreter lock, and Python would be truly multithreaded. It would benefit many people, I'm sure.
making threads within interpreter thread safe is possible as well, at least if you leave something for the developer, as you should, as you do in every other programming language as well.
You misunderstand. Those "critical sections" are for the most part in Python's C code, not in the Python script.
Yes, I'm aware of the None problem at least (only one instance of it). Please enlighten me about the other critical sections? Object allocation/freeing?
I'm guessing you haven't done much writing of C extensions for Python (or Python core C), or else you'd realize why trying to make INCREF/DECREF threadsafe would absolutely decimate performance. Reference count updates happen *way* too often in normal code flow.
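To give a rough sense of how often reference counts change, here is a minimal sketch using `sys.getrefcount` from the standard library. The names `obj` and `holders` are made up for the example, and absolute counts are a CPython implementation detail, so only the differences are compared.

```python
import sys

obj = object()
holders = []

before = sys.getrefcount(obj)
holders.append(obj)      # the list stores a reference: one INCREF at the C level
after_append = sys.getrefcount(obj)
holders.pop()            # the list drops it again: one DECREF
after_pop = sys.getrefcount(obj)

# Only the deltas are meaningful; the absolute numbers include temporaries.
delta_append = after_append - before
delta_pop = after_pop - after_append
print(delta_append, delta_pop)
```

Even these two list operations touch the counter twice, which is why putting a lock or an atomic operation around each update is expensive.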
I also knew that already. But how else can you do it? Of course, changing Python to not have a single None would help a lot. Or, perhaps it could have a single None, but in case of None, the reference count would have no meaning, it would never be deallocated, because it would be checked in code. Maybe it does it already, I don't know. I'm also wondering why this problem has not been addressed before? If I had the power to change Python, this would be the first thing I did.

Harri
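The idea of a None whose reference count "has no meaning" can be sketched as a toy model. This is an illustration only, assuming a made-up `ToyObject` class and made-up `incref`/`decref` helpers; it is not CPython's object layout or its real macros.

```python
class ToyObject:
    """Toy model of a reference-counted object (hypothetical, not CPython's layout)."""

    def __init__(self, name, immortal=False):
        self.name = name
        self.immortal = immortal
        self.refcount = 1
        self.freed = False


def incref(obj):
    # Immortal objects skip the update, so the shared counter is never written.
    if not obj.immortal:
        obj.refcount += 1


def decref(obj):
    if obj.immortal:
        return
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.freed = True  # stand-in for deallocation


toy_none = ToyObject("None", immortal=True)
ordinary = ToyObject("ordinary")

for _ in range(3):
    decref(toy_none)   # no effect: the count is ignored for the immortal object
decref(ordinary)       # count reaches zero, so the object is "freed"
```

The check is not free: it adds a branch to every reference count update, and it only helps for objects that are known never to be deallocated.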
On Thu, 2003-09-11 at 15:16, Harri Pesonen wrote:
I'm also wondering why this problem has not been addressed before? If I had the power to change Python, this would be the first thing I did.
Try coming up with a patch. I expect it would be considered provided that it was maximally backwards compatible with the existing C API and did not reduce performance on benchmarks like pystone. That is to say, sure it would be nice, but at what cost?

Jeremy
Jeremy Hylton wrote:
On Thu, 2003-09-11 at 15:16, Harri Pesonen wrote:
I'm also wondering why this problem has not been addressed before? If I had the power to change Python, this would be the first thing I did.
Try coming up with a patch. I expect it would be considered provided that it was maximally backwards compatible with the existing C API and did not reduce performance on benchmarks like pystone.
It would be totally incompatible with the existing C API. The performance would be better.
That is to say, sure it would be nice, but at what cost?
At the cost of breaking with the past, but eventually it has to be done. If I had time, I would create this multithreading Python myself, call it a different language perhaps. I don't know if the Python licence allows this.

Harri
Harri Pesonen wrote:
If I had time, I would create this multithreading Python myself, call it a different language perhaps. I don't know if the Python licence allows this.
The license is only on the implementation (source code), not on the language itself. You are free to provide alternative implementations of the language, without having to ask anybody for permission. But please don't call such a thing trivial when you know it is inherently complex.

Regards,
Martin
Harri Pesonen wrote:
The change is trivial in Python C API. I already said that it would break everything outside the Python distribution, but the change in other applications is also trivial.
If it is trivial, would you mind posting a patch somewhere?
You misunderstand. Those "critical sections" are for the most part in Python's C code, not in the Python script.
Yes, I'm aware of the None problem at least (only one instance of it). Please enlighten me about the other critical sections? Object allocation/freeing?
Yes, that, plus:
- allocation of/access to small numbers
- access to global variables in extension modules (e.g. cursesmodule.c:PyCursesError)
- type objects, etc.
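The kinds of shared objects listed above can be observed from Python itself. This is a small sketch, assuming CPython; the caching of small integers is an implementation detail rather than a language guarantee, and the names `returns_nothing`, `small_a` and `small_b` are made up for the example.

```python
def returns_nothing():
    pass


# Built at run time so the compiler cannot fold them into one constant.
small_a = int("7")
small_b = int("7")

shared_none = returns_nothing() is None        # every "no value" is the same None
shared_small_int = small_a is small_b          # CPython caches small integers
shared_type = type(small_a) is type(10 ** 30)  # one type object for all ints
print(shared_none, shared_small_int, shared_type)
```

Each of these objects is reachable from every thread, so any code that touches one of them also touches its reference count.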
Of course, changing Python to not have a single None would help a lot. Or, perhaps it could have a single None, but in case of None, the reference count would have no meaning, it would never be deallocated, because it would be checked in code. Maybe it does it already, I don't know.
So how can you know a patch would be trivial?
I'm also wondering why this problem has not been addressed before? If I had the power to change Python, this would be the first thing I did.
Please go ahead and post a patch. You might find that the patch is difficult to write, and, once written, will have many errors. Once those are fixed, you might find that Python becomes painfully slow, and loses a lot of portability.

Regards,
Martin
Yes, I'm aware of the None problem at least (only one instance of it). Please enlighten me about the other critical sections? Object allocation/freeing?
None is only the molecule at the tip of the iceberg. If you're talking about threads within a Python interpreter, doing just about anything at all to any Python object is a critical section. The amount of locking required to deal with that at a fine-grained level would totally kill performance, not to mention being hugely tedious and error-prone to implement.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
I used to work on the compiler for SISAL, a fine-grained parallel functional language. It reference counted its allocated data. Updating the reference count is a critical section. We were spending 40% of our time blocked for reference counting on only 4 threads. We eventually taught the compiler to do reference count optimizations (e.g. put all readers of an object before the writers), but these optimizations are not suitable for the Python interpreter. By the same reasoning, you can infer why C++ smart pointers are also a bad idea for threaded code.

Pat

--
Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller
All you need in this life is ignorance and confidence, and then success is sure. -- Mark Twain
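The point that every reference count update becomes a critical section can be sketched with a lock-protected counter shared by four threads. This is a toy model, assuming a made-up `LockedCounter` class and `worker` function; it does not reproduce the SISAL measurements, it only shows where the contention comes from.

```python
import threading


class LockedCounter:
    """Toy shared reference count guarded by a lock (hypothetical example)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def incref(self):
        with self._lock:   # every update must take the lock
            self.value += 1

    def decref(self):
        with self._lock:
            self.value -= 1


def worker(counter, iterations):
    # Each iteration models taking and releasing one reference.
    for _ in range(iterations):
        counter.incref()
        counter.decref()


counter = LockedCounter()
threads = [
    threading.Thread(target=worker, args=(counter, 10_000)) for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)
```

The lock keeps the count correct, but all four threads queue on the same lock for every update, so time spent waiting grows with the number of threads sharing the object.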
participants (6)
- Martin v. Löwis
- Greg Ewing
- Harri Pesonen
- Jeremy Hylton
- martin@v.loewis.de
- Pat Miller