Re: [Python-Dev] Sandboxed Threads in Python
At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote:
Okay, basic principle first. You start with a sandboxed thread that has access to nothing. No modules, no builtins, *nothing*. This means it can run without the GIL but it can't do any work.
It sure can't. You need at least the threadstate and a builtins dictionary to do any work.
To make it do something useful we need to give it two things: first, immutable types that can be safely accessed without locks,
This is harder than it sounds. Integers, for example, have a custom allocator and a free list, not to mention a small-integer cache. You would somehow need to duplicate all that for each sandbox, or else you have to make those integers immortal using your "magic constant".
Turns out it's quite easy and it doesn't harm performance of existing code or require modification (but a recompile is necessary). The idea is to only use a cyclic garbage collector for cleaning them up,
Um, no, actually. You need a mark-and-sweep GC or something of that ilk. Python's GC only works with objects that *have refcounts*, and it works by clearing objects that are in cycles. The clearing causes DECREF-ing, which then causes objects to be freed. If you have objects without refcounts, they would be immortal and utterly unrecoverable.
which means we need to disable the reference counting. That requires we modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a magic constant (probably a negative value).
And any object with the magic refcount will live *forever*, unless you manually deallocate it.
That's all it takes. Modify Py_INCREF and Py_DECREF to check for a magic constant. Ahh, but the performance? See for yourself.
First, you need to implement a garbage collection scheme that can deal with not having refcounts. Otherwise you're not comparing apples to apples here, and your programs will leak like crazy. Note that implementing a root-based GC for Python is non-trivial, since extension modules can store pointers to PyObjects anywhere they like. Further, many Python objects don't even support being tracked by the current cycle collector. So, changing this would probably require a lot of C extensions to be rewritten to support the needed API changes for the new garbage collection strategy.
So to sum up, by prohibiting mutable objects from being transferred between sandboxes we can achieve scalability on multiple CPUs, making threaded programming easier and more reliable, as a bonus get secure sandboxes[1], and do that all while maintaining single-threaded performance and requiring minimal changes to existing C modules (recompiling).
Unfortunately, you have only succeeded in restating the problem, not reducing its complexity. :) In fact, you may have increased the complexity, since now you need a threadsafe garbage collector, too. Oh, and don't forget - newstyle classes keep weak references to all their subclasses, which means for example that every time you subclass 'dict', you're modifying the "immutable" 'dict' class. So, unless you recreate all the classes in each sandbox, you're back to needing locking. And if you recreate everything in each sandbox, well, I think you've just reinvented "processes". :)
Phillip J. Eby wrote:
Oh, and don't forget - newstyle classes keep weak references to all their subclasses, which means for example that every time you subclass 'dict', you're modifying the "immutable" 'dict' class. So, unless you recreate all the classes in each sandbox, you're back to needing locking. And if you recreate everything in each sandbox, well, I think you've just reinvented "processes". :)
After all, there's a reason Bruce Eckel's recent post about multi-processing attracted a fair amount of interest.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
http://boredomandlaziness.blogspot.com
On 10/7/05, Phillip J. Eby <pje@telecommunity.com> wrote:
At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote:
Okay, basic principle first. You start with a sandboxed thread that has access to nothing. No modules, no builtins, *nothing*. This means it can run without the GIL but it can't do any work.
It sure can't. You need at least the threadstate and a builtins dictionary to do any work.
To make it do something useful we need to give it two things: first, immutable types that can be safely accessed without locks,
This is harder than it sounds. Integers, for example, have a custom allocator and a free list, not to mention a small-integer cache. You would somehow need to duplicate all that for each sandbox, or else you have to make those integers immortal using your "magic constant".
Yes, we'd probably want some per-sandbox allocators. I'm no expert on that but I know it can be done.
Turns out it's quite easy and it doesn't harm performance of existing code or require modification (but a recompile is necessary). The idea is to only use a cyclic garbage collector for cleaning them up,
Um, no, actually. You need a mark-and-sweep GC or something of that ilk. Python's GC only works with objects that *have refcounts*, and it works by clearing objects that are in cycles. The clearing causes DECREF-ing, which then causes objects to be freed. If you have objects without refcounts, they would be immortal and utterly unrecoverable.
Perhaps I wasn't clear enough: I was assuming appropriate changes to the GC would be made. The important thing is that it can be done without changing the interface that the existing modules use.
which means we need to disable the reference counting. That requires we modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a magic constant (probably a negative value).
And any object with the magic refcount will live *forever*, unless you manually deallocate it.
See above.
That's all it takes. Modify Py_INCREF and Py_DECREF to check for a magic constant. Ahh, but the performance? See for yourself.
First, you need to implement a garbage collection scheme that can deal with not having refcounts. Otherwise you're not comparing apples to apples here, and your programs will leak like crazy.
Note that implementing a root-based GC for Python is non-trivial, since extension modules can store pointers to PyObjects anywhere they like. Further, many Python objects don't even support being tracked by the current cycle collector.
So, changing this would probably require a lot of C extensions to be rewritten to support the needed API changes for the new garbage collection strategy.
They only need to be rewritten if you want them to provide an immutable type that can be transferred between sandboxes. Short of that you can make the module object itself immutable, and from it create mutable instances that are private to each sandbox and not sharable. If you make no changes at all the module still works, but is only usable from the main thread. That allows us to transition incrementally.
So to sum up, by prohibiting mutable objects from being transferred between sandboxes we can achieve scalability on multiple CPUs, making threaded programming easier and more reliable, as a bonus get secure sandboxes[1], and do that all while maintaining single-threaded performance and requiring minimal changes to existing C modules (recompiling).
Unfortunately, you have only succeeded in restating the problem, not reducing its complexity. :) In fact, you may have increased the complexity, since now you need a threadsafe garbage collector, too.
Oh, and don't forget - newstyle classes keep weak references to all their subclasses, which means for example that every time you subclass 'dict', you're modifying the "immutable" 'dict' class. So, unless you recreate all the classes in each sandbox, you're back to needing locking. And if you recreate everything in each sandbox, well, I think you've just reinvented "processes". :)
I was aware that weakrefs needed some special handling (I just forgot to mention it), but I didn't know they were used by subclassing. Unfortunately I don't know what purpose that serves, so I can't contemplate how to deal with it.

I need to stress that *only* the new, immutable and "thread-safe mark-and-sweep" types would be affected by these changes. Everything else would continue to exist as it did before, and the benchmark exists to show they can coexist without killing performance.

--
Adam Olsen, aka Rhamphoryncus
Adam Olsen <rhamph@gmail.com> wrote:
I need to stress that *only* the new, immutable and "thread-safe mark-and-sweep" types would be affected by these changes. Everything else would continue to exist as it did before, and the benchmark exists to show they can coexist without killing performance.
All the benchmark showed was that checking for a constant in the refcount during in/decrefing, and not garbage collecting those objects, didn't adversely affect performance.

As an aside, there's also the ugly bit about being able to guarantee that an object is immutable. I personally mutate Python strings in my C code all the time (long story, not to be discussed here), and if I can do it now, then any malicious or "inventive" person can do the same in this "sandboxed thread" Python "of the future".

At least in the case of integers, one could work the tagged integer idea to bypass the freelist issue that Phillip offered, but in general, I don't believe there exists a truly immutable type as long as there are C extensions and/or ctypes. Further, the work to actually implement a new garbage collector for Python in order to handle these 'immutable' types seems to me to be more trouble than it is worth.

- Josiah
On 10/7/05, Josiah Carlson <jcarlson@uci.edu> wrote:
Adam Olsen <rhamph@gmail.com> wrote:
I need to stress that *only* the new, immutable and "thread-safe mark-and-sweep" types would be affected by these changes. Everything else would continue to exist as it did before, and the benchmark exists to show they can coexist without killing performance.
All the benchmark showed was that checking for a constant in the refcount during in/decrefing, and not garbage collecting those objects, didn't adversely affect performance.
As an aside, there's also the ugly bit about being able to guarantee that an object is immutable. I personally mutate Python strings in my C code all the time (long story, not to be discussed here), and if I can do it now, then any malicious or "inventive" person can do the same in this "sandboxed thread" Python "of the future".
Malicious use is hardly a serious concern. Someone using C code could just as well crash the interpreter. Modifying a python string you just created before you expose it to python code should be fine. If that's not what you're doing.. I'm not sure I want to know *wink*
At least in the case of integers, one could work the tagged integer idea to bypass the freelist issue that Phillip offered, but in general, I don't believe there exists a truly immutable type as long as there are C extensions and/or ctypes. Further, the work to actually implement a new garbage collector for Python in order to handle these 'immutable' types seems to me to be more trouble than it is worth.
Maybe.. I'm not convinced. There's a lot of payback IMO. -- Adam Olsen, aka Rhamphoryncus
Adam Olsen <rhamph@gmail.com> wrote:
On 10/7/05, Josiah Carlson <jcarlson@uci.edu> wrote:
Adam Olsen <rhamph@gmail.com> wrote:
I need to stress that *only* the new, immutable and "thread-safe mark-and-sweep" types would be affected by these changes. Everything else would continue to exist as it did before, and the benchmark exists to show they can coexist without killing performance.
All the benchmark showed was that checking for a constant in the refcount during in/decrefing, and not garbage collecting those objects, didn't adversely affect performance.
As an aside, there's also the ugly bit about being able to guarantee that an object is immutable. I personally mutate Python strings in my C code all the time (long story, not to be discussed here), and if I can do it now, then any malicious or "inventive" person can do the same in this "sandboxed thread" Python "of the future".
Malicious use is hardly a serious concern. Someone using C code could just as well crash the interpreter.
Your malicious user is my inventive colleague. Here's one: performing zero-copy inter-thread IPC by modifying shared immutables. Attempting to enforce a policy of "don't do that, it's not supported" is not going to be effective, especially when doing unsupported things increases speed. People have known for decades that having anything beyond the kernel run in kernel space is dangerous, but they still do it because it is faster. I can (but won't) point out examples for days of bad decisions made for the sake of speed, or of policy that has been ignored for the sake of speed (some of these overlap and some don't).
Modifying a python string you just created before you expose it to python code should be fine. If that's not what you're doing.. I'm not sure I want to know *wink*
You really don't want to know.
Maybe.. I'm not convinced. There's a lot of payback IMO.
You've not convinced me either. Good luck in getting a group together to make it happen. - Josiah
On 10/8/05, Josiah Carlson <jcarlson@uci.edu> wrote:
Your malicious user is my inventive colleague. Here's one: performing zero-copy inter-thread IPC by modifying shared immutables. Attempting to enforce a policy of "don't do that, it's not supported" is not going to be effective, especially when doing unsupported things increases speed.
I actually have no problem with that, so long as you use a custom type. It may not technically be immutable but it does provide its own clearly defined semantics for simultaneous modification, and that's enough. Anyway, the idea as I presented it is dead at this point, so I'll leave it at that. -- Adam Olsen, aka Rhamphoryncus
I can (but won't) point out examples for days of bad decisions made for the sake of speed, or policy that has been ignored for the sake of speed (some of these overlap and some don't).
As long as you've entered premature-optimization land, how about decisions made because it's *assumed* that (A) We must have speed here and (B) This will make it happen.

My hope would be that we could find a solution that would by default keep you out of trouble when writing concurrent programs, but provide a back door if you wanted to do something special. If you choose to go in the back door, you have to do it consciously and take responsibility for the outcome.

With Java, in contrast, as soon as you step into the world of concurrency (even if you step in by accident, which is not uncommon), lots of rules change. What was an ordinary method call before is now something risky that can cause great damage. Should I make this variable volatile? Is an operation atomic? You have to learn a lot of things all over again.

I don't want that for Python. I'd like the move into concurrency to be a gentle slope, not a sudden reality-shift. If a novice decides they want to try game programming with concurrency, I want there to be training wheels on by default, so that their first experience will be a successful one, and they can then start learning more features and ideas incrementally, without trying a feature and suddenly having the whole thing get weird and crash down on their heads and cause them to run screaming away ...

I know there have been some technologies that have already been mentioned on this list and I hope that we can continue to experiment with and discuss those and also new ideas until we shake out the fundamental issues and maybe even come up with a list of possible solutions.

Bruce Eckel
http://www.BruceEckel.com
mailto:BruceEckel-Python3234@mailblocks.com
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter: http://www.mindview.net/Newsletter
My schedule can be found at: http://www.mindview.net/Calendar
Bruce Eckel <BruceEckel-Python3234@mailblocks.com> wrote:
I can (but won't) point out examples for days of bad decisions made for the sake of speed, or policy that has been ignored for the sake of speed (some of these overlap and some don't).
As long as you've entered premature-optimization land, how about decisions made because it's *assumed* that (A) We must have speed here and (B) This will make it happen.
A. From what I understand about sandboxing threads, the point was to remove the necessity for the GIL, so that every thread can go out on its own and run on its own processor. B. Shared memory vs. queues vs. pipes vs. ... Concurrency without communication is almost totally worthless. Historically, shared memory has tended to be one of the fastest (if not the fastest) communication methods available. Whether or not mutable shared memory would be faster or slower than queues is unknown, but I'm going to stick with my experience until I am proved wrong by this mythical free threaded system with immutables.
My hope would be that we could find a solution that would by default keep you out of trouble when writing concurrent programs, but provide a back door if you wanted to do something special. If you choose to go in the back door, you have to do it consciously and take responsibility for the outcome.
With Java, in contrast, as soon as you step into the world of concurrency (even if you step in by accident, which is not uncommon), lots of rules change. What was an ordinary method call before is now something risky that can cause great damage. Should I make this variable volatile? Is an operation atomic? You have to learn a lot of things all over again.
I don't want that for Python. I'd like the move into concurrency to be a gentle slope, not a sudden reality-shift. If a novice decides they want to try game programming with concurrency, I want there to be training wheels on by default, so that their first experience will be a successful one, and they can then start learning more features and ideas incrementally, without trying a feature and suddenly having the whole thing get weird and crash down on their heads and cause them to run screaming away ...
I don't want to get into an argument here. While I agree that concurrent programming should be easier, my experience with MPI (and other similar systems) and writing parallel algorithms leads me to believe that even if you have a simple method for communication, and even if you can guarantee that thread/process A won't clobber thread/process B, actually writing software that executes in a way that makes the effort of making it concurrent worthwhile is far from easy. I'd love to be proved wrong (I'm hoping to do it myself with tuple spaces).

I do, however, doubt that free threading approaches will be the future for concurrent programming in CPython.

- Josiah
Josiah Carlson wrote:
I do, however, doubt that free threading approaches will be the future for concurrent programming in CPython.
Hear, hear!

IMO, it's the combination of the GIL with a compiler which never decides to change the code execution order under the covers that makes threading *not* a pain in Python (so long as one remembers to release the GIL around blocking calls to external libraries, and to use threading.Queue to get info between threads wherever possible). The desire to change that seems to be a classic case of wanting to write C/C++/Java/whatever in Python, rather than writing Python in Python.

And thanks to Bruce for starting the recent multi-processing discussion - hopefully one day we will have mechanisms in the standard library that scale relatively smoothly from PEP 342 based logical threads, through threading.Thread based physical threads, to <something> based subprocesses.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
http://boredomandlaziness.blogspot.com
participants (5)
- Adam Olsen
- Bruce Eckel
- Josiah Carlson
- Nick Coghlan
- Phillip J. Eby