[Twisted-Python] multiprocessing capability?
Hi, I looked at the source for the threads module and wondered if the current Twisted supports Python's (2.6) multiprocessing threading? If not, is there a stable package somewhere that patches Twisted to support this? I saw one from last summer but I'm not sure if it's stable. Thanks! Darren
On Sun, Feb 21, 2010 at 1:43 PM, Darren Govoni <darren@ontrenet.com> wrote:
Hi, I looked at the source for the threads module and wondered if the current Twisted supports Python's (2.6) multiprocessing threading? If not, is there a stable package somewhere that patches Twisted to support this? I saw one from last summer but I'm not sure if it's stable.
Hi Darren, I don't think there's any explicit support for multiprocessing, although I have seen some people using multiprocessing to run twisted in multiple processes. This doesn't answer your question, but you might be interested in ampoule as this provides a nice process protocol implemented on twisted or specifically twisted.protocols.amp: https://launchpad.net/ampoule -Drew
Hello everyone, I have done something similar to this, but I used the children IO stream to control them. Maybe I should have done that using some higher level protocol, such as AMP or PB. (I think AMP is more robust than PB, though) The project that uses the children IO and process protocols is Lunch. See http://svn.sat.qc.ca/trac/lunch
a
2010/2/21 Drew Smathers <drew.smathers@gmail.com>:
Hi Darren, I don't think there's any explicit support for multiprocessing, although I have seen some people using multiprocessing to run twisted in multiple processes. This doesn't answer your question, but you might be interested in ampoule as this provides a nice process protocol implemented on twisted or specifically twisted.protocols.amp: https://launchpad.net/ampoule -Drew
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
-- Alexandre Quessy http://alexandre.quessy.net/
On Feb 21, 2010, at 8:00 PM, Alexandre Quessy wrote:
Hello everyone, I have done something similar to this, but I used the children IO stream to control them. Maybe I should have done that using some higher level protocol, such as AMP or PB.
Using a higher-level protocol is generally better, if for no other reason than that it gives you a framework within which to document your design decisions. It's much easier to say "An AMP command with a 'foo' String argument and a 'bar' Integer argument" than to say "The first two bytes of the message are the length of the first argument. The next n bytes are the first argument. The first argument shall be interpreted as... (etc, etc)"
(I think AMP is more robust than PB, though)
Why do you say that? I think AMP is simpler than PB, but PB works pretty well if you need its functionality.
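To make Glyph's contrast concrete, here is a minimal sketch of the hand-rolled framing he describes, using only the standard library. The function names, the big-endian 2-byte length prefix on both arguments, and the ASCII-decimal encoding of 'bar' are assumptions made for illustration; none of this is AMP's wire format or anyone's code from this thread. With AMP the same contract would be a declared command with a String and an Integer argument, and framing like this would be the library's job.

```python
import struct


def encode_args(foo, bar):
    """Pack a bytes argument and an integer, each behind a 2-byte length prefix."""
    bar_bytes = str(bar).encode("ascii")  # assumed encoding: ASCII decimal
    return (struct.pack("!H", len(foo)) + foo +
            struct.pack("!H", len(bar_bytes)) + bar_bytes)


def decode_args(message):
    """Reverse encode_args: read each length prefix, then slice out that many bytes."""
    (foo_len,) = struct.unpack_from("!H", message, 0)
    foo = message[2:2 + foo_len]
    offset = 2 + foo_len
    (bar_len,) = struct.unpack_from("!H", message, offset)
    bar = int(message[offset + 2:offset + 2 + bar_len].decode("ascii"))
    return foo, bar
```

Every detail in those two functions (prefix width, byte order, integer encoding) is a design decision that would otherwise have to be written down somewhere, which is the documentation burden a higher-level protocol removes.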
Glyph Lefkowitz wrote:
Using a higher-level protocol is generally better, if for no other reason than that it gives you a framework within which to document your design decisions. It's much easier to say "An AMP command with a 'foo' String argument and a 'bar' Integer argument" than to say "The first two bytes of the message are the length of the first argument. The next n bytes are the first argument. The first argument shall be interpreted as... (etc, etc)"
I'm working on an interface right now to the spread toolkit, (http://spread.org), which implements virtual synchrony, (http://en.wikipedia.org/wiki/Virtual_synchrony).
For distributed, symmetric, fault tolerant parallelism in small to medium scale with high reliability, this might be an option.
--rich
Looks interesting. I'm going to check out that package.
My original request was more along the lines of using Python's new support for native CPU cores and processes (the multiprocessing package is for this). Python's built-in thread support has global lock constraints that underperform in some situations.
But I ran into a problem using the multiprocessing module with Twisted, one that was pointed out on the Twisted trac: pickling class methods. Apparently Python's CPU threading support attempts to do this in some situations (e.g. when I try to pass a class method to a native thread).
On Wed, 2010-02-24 at 12:04 -0800, K. Richard Pixley wrote:
I'm working on an interface right now to the spread toolkit, (http://spread.org), which implements virtual synchrony, (http://en.wikipedia.org/wiki/Virtual_synchrony).
For distributed, symmetric, fault tolerant parallelism in small to medium scale with high reliability, this might be an option.
--rich
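The pickling problem Darren describes can be sketched as follows. multiprocessing hands work to child processes by pickling the callable and its arguments, and in Python 2.6 bound methods were not picklable (Python 3 can pickle them, provided the instance itself is picklable). A common workaround is a module-level function that receives the object and a method name. `Job` and `call_method` are hypothetical names invented for this example; they come from neither Twisted nor the trac ticket he mentions.

```python
class Job(object):
    """A plain, picklable object carrying some state (hypothetical example class)."""

    def __init__(self, base):
        self.base = base

    def add(self, amount):
        return self.base + amount


def call_method(packed):
    """Module-level trampoline: unpack (instance, method name, args) and call it.

    A module-level function is pickled by name, so it can be handed to
    multiprocessing even where bound methods cannot.
    """
    instance, method_name, args = packed
    return getattr(instance, method_name)(*args)
```

A pool would then be given `call_method` together with tuples such as `(Job(3), "add", (4,))` instead of being given `Job(3).add` directly.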
Single threaded, event loop based code like twisted rocks hard.
Once upon a time, threads were like that too and the distinction between threads and event loops was grey. But with the advent of mandatory preemptive thread scheduling and the ability to run multiple threads on separate shared memory processors, the difference between programming with threads and programming with parallel heavy weight processes that share memory became extremely grey, (aside from the problems debugging threads which don't exist for heavy weight processes).
Threads routinely use shared memory and shared memory (generally) requires a common kernel. OTOH, message passing can use a common kernel but can also extend out to other machines on the network. If you use twisted for highly efficient "single thread/multiple task" heavy weight processes, and something like spread, you end up with the best of all worlds. Highly efficient, symmetric, network based parallelism, with fault tolerance thrown in for free.
My point here is that there are other ways to go about exploiting symmetric multiprocessor machines, even banks of them, that neither require threads, nor the multiprocessing package.
--rich
Darren Govoni wrote:
Looks interesting. I'm going to check out that package.
My original request was more along the lines of using Python's new support for native CPU cores and processes (the multiprocessing package is for this). Python's built-in thread support has global lock constraints that underperform in some situations.
But I ran into a problem using the multiprocessing module with Twisted, one that was pointed out on the Twisted trac: pickling class methods. Apparently Python's CPU threading support attempts to do this in some situations (e.g. when I try to pass a class method to a native thread).
The nice thing about using Python's process support is that you can spawn native Processes that run in separate heaps directly from ONE Python Twisted app. Not many running side-by-side, which adds the complexity of now coordinating among them (however easy with additional protocols like spread).
Inter-Process communication is also supported in Python's new multiprocessing package. And again, it can all be orchestrated from a single service _instance_.
In my code, I need to run "on the metal" for some tasks and not others. Agreed, the event-based reactor threading in Twisted is great. But not for all modes of computation. For those, I offload onto OS processes directly onto CPU cores. Twisted does not provide a way to leverage its API against Python's support for this feature. So I have to find a way to marry the two.
What I ended up doing was using the multiprocessing package to kick off hard Process objects in a Python Process pool executing Python functions. Those functions make calls into Twisted, but for it to work, they had to start their own reactors because a Process has its own, separate OS memory, etc.
Running compute-intensive tasks in processes with their own memory makes a lot of sense for some things that Python cannot do with virtual machine thread contexts.
Darren
On Thu, 2010-02-25 at 10:21 -0800, K. Richard Pixley wrote:
Single threaded, event loop based code like twisted rocks hard.
Once upon a time, threads were like that too and the distinction between threads and event loops was grey. But with the advent of mandatory preemptive thread scheduling and the ability to run multiple threads on separate shared memory processors, the difference between programming with threads and programming with parallel heavy weight processes that share memory became extremely grey, (aside from the problems debugging threads which don't exist for heavy weight processes).
Threads routinely use shared memory and shared memory (generally) requires a common kernel. OTOH, message passing can use a common kernel but can also extend out to other machines on the network. If you use twisted for highly efficient "single thread/multiple task" heavy weight processes, and something like spread, you end up with the best of all worlds. Highly efficient, symmetric, network based parallelism, with fault tolerance thrown in for free.
My point here is that there are other ways to go about exploiting symmetric multiprocessor machines, even banks of them, that neither require threads, nor the multiprocessing package.
--rich
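A minimal sketch of the kind of offloading Darren describes, using only the standard library's `multiprocessing.Pool`. The function names and the sum-of-squares workload are placeholders, and the Twisted side is left out entirely: his workers each started their own reactor, which is not shown here.

```python
import multiprocessing


def sum_of_squares(n):
    """CPU-bound placeholder task; defined at module level so workers can find it."""
    return sum(i * i for i in range(n))


def run_in_pool(inputs, workers=2):
    """Run sum_of_squares over inputs in separate OS processes and collect the results."""
    pool = multiprocessing.Pool(processes=workers)
    try:
        # map() blocks the calling thread until every worker has finished.
        return pool.map(sum_of_squares, inputs)
    finally:
        pool.close()
        pool.join()
```

`run_in_pool([10, 100, 1000])` is best called from under an `if __name__ == "__main__":` guard, so that platforms which start workers by re-importing the main module do not recurse. Because `pool.map` blocks, calling it from the reactor thread would stall the event loop for the duration of the work.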
Darren Govoni wrote:
The nice thing about using Python's process support is that you can spawn native Processes that run in separate heaps directly from ONE Python Twisted app. Not many running side-by-side, which adds the complexity of now coordinating among them (however easy with additional protocols like spread).
Inter-Process communication is also supported in Python's new multiprocessing package. And again, it can all be orchestrated from a single service _instance_.
In my code, I need to run "on the metal" for some tasks and not others. Agreed, the event-based reactor threading in Twisted is great. But not for all modes of computation. For those, I offload onto OS processes directly onto CPU cores. Twisted does not provide a way to leverage its API against Python's support for this feature.
But twisted provides this feature *itself*, and has done so long before the multiprocessing module existed - look at Process Protocol http://twistedmatrix.com/documents/current/core/howto/process.html and the stdio stuff: http://twistedmatrix.com/documents/current/core/examples/stdiodemo.py, or, as has been mentioned before, ampoule. Of course, if you get multiprocessing to work with twisted, that's fine, but you're probably unnecessarily adding complexity to your application while subtracting compatibility with python versions before 2.6 for no good reason, at least none you mentioned so far.
regards, Johann
What you refer to is different than what I need. The real 'Process' implementation is new to Python 2.6 http://docs.python.org/library/multiprocessing.html and is not supported in Twisted at the moment. The Process or threads in Twisted now, use Python threading/process constructs outside of the new multiprocessing module, will suffer from the Python GIL limitations - which hinders higher performance computing. It works, sure. But it's not what I'm asking about. However, I found a workaround for now.
Ideally, what I wanted to do was use something like threads.deferToPool() or similar, that _would_ use Python's new support for OS processes. Someone wrote a wrapper to this on the net, but I was curious if this will be supported in Twisted.
Darren
On Fri, 2010-02-26 at 01:13 +0000, Johann Borck wrote:
But twisted provides this feature *itself*, and has done so long before the multiprocessing module existed - look at Process Protocol http://twistedmatrix.com/documents/current/core/howto/process.html and the stdio stuff: http://twistedmatrix.com/documents/current/core/examples/stdiodemo.py, or, as has been mentioned before, ampoule. Of course, if you get multiprocessing to work with twisted, that's fine, but you're probably unnecessarily adding complexity to your application while subtracting compatibility with python versions before 2.6 for no good reason, at least none you mentioned so far.
regards, Johann
On Thu, Feb 25, 2010 at 7:33 PM, Darren Govoni <darren@ontrenet.com> wrote:
What you refer to is different than what I need. The real 'Process' implementation is new to Python 2.6 http://docs.python.org/library/multiprocessing.html and is not supported in Twisted at the moment. The Process or threads in Twisted now, use Python threading/process constructs outside of the new multiprocessing module, will suffer from the Python GIL limitations - which hinders higher performance computing. It works, sure. But it's not what I'm asking about.
I think you're a bit confused. You said "use Python threading/process constructs outside of the new multiprocessing module, will suffer from the Python GIL limitations".
The "GIL limitations" *only* apply to threads, not processes. You can take advantage of multiple CPUs in Python by running multiple processes, no matter what technology you use to start those processes.
-- Christopher Armstrong http://radix.twistedmatrix.com/ http://planet-if.com/
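Christopher's point can be checked with nothing but the standard library: a child interpreter started by any mechanism is a separate OS process with its own interpreter lock. This sketch deliberately uses `subprocess` rather than Twisted's process support or the multiprocessing module; `run_in_child` and `child_pid` are hypothetical helper names, and `capture_output` assumes Python 3.7 or later.

```python
import os
import subprocess
import sys


def run_in_child(expression):
    """Evaluate a Python expression in a fresh interpreter process; return what it printed."""
    completed = subprocess.run(
        [sys.executable, "-c", "print({})".format(expression)],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


def child_pid():
    """Ask a child interpreter for its own process id."""
    return int(run_in_child("__import__('os').getpid()"))
```

The value returned by `child_pid()` differs from the parent's `os.getpid()`, which is all it takes for the child to run on another core without contending for the parent's GIL.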
Thank you for that clarification.
On Thu, 2010-02-25 at 19:54 -0600, Christopher Armstrong wrote:
I think you're a bit confused. You said "use Python threading/process constructs outside of the new multiprocessing module, will suffer from the Python GIL limitations".
The "GIL limitations" *only* apply to threads, not processes. You can take advantage of multiple CPUs in Python by running multiple processes, no matter what technology you use to start those processes.
Darren Govoni wrote:
What you refer to is different than what I need. The real 'Process' implementation is new to Python 2.6 http://docs.python.org/library/multiprocessing.html and is not supported in Twisted at the moment. The Process or threads in Twisted now, use Python threading/process constructs outside of the new multiprocessing module, will suffer from the Python GIL limitations - which hinders higher performance computing.
Hi Darren,
Sometimes "that's wrong" is good news: Python processes spawned using the twisted API don't suffer any more from GIL limitations than any other Python process. That's only true for threads. You seem to think the multiprocessing module does something that wasn't possible in Python before, but that's not the case: the twisted version just has a different API, but does essentially the same thing.
regards, Johann
Hello!
2010/2/25 Johann Borck <johann.borck@densedata.com>:
But twisted provides this feature *itself*, and has done so long before the multiprocessing module existed - look at Process Protocol http://twistedmatrix.com/documents/current/core/howto/process.html and the stdio stuff: http://twistedmatrix.com/documents/current/core/examples/stdiodemo.py,
That's exactly what I have done with my application Lunch, and I am very satisfied with the results. It can be distributed on many hosts via SSH! That comes free with standard IO and process management. You can get the source code from http://svn.sat.qc.ca/trac/lunch
Cheers, Alexandre
-- Alexandre Quessy http://alexandre.quessy.net/
Weird, I was looking into doing something similar for the Paxos algorithm (http://en.wikipedia.org/wiki/Paxos_algorithm), but I decided I didn't have the time right now. If you haven't, I recommend that you check out Paxos Made Live: http://labs.google.com/papers/paxos_made_live.html
That paper has some nice details about Google's experience implementing a production-quality Paxos library. Most importantly, don't let the simplicity of the algorithm mislead you into thinking that a real implementation will also be simple. I found it a little depressing...
On Wed, Feb 24, 2010 at 2:04 PM, K. Richard Pixley <rich@noir.com> wrote:
I'm working on an interface right now to the spread toolkit, (http://spread.org), which implements virtual synchrony, (http://en.wikipedia.org/wiki/Virtual_synchrony).
For distributed, symmetric, fault tolerant parallelism in small to medium scale with high reliability, this might be an option.
--rich
-- Mark Wright markscottwright@gmail.com
On Wed, Feb 24, 2010 at 3:04 PM, K. Richard Pixley <rich@noir.com> wrote:
I'm working on an interface right now to the spread toolkit, (http://spread.org), which implements virtual synchrony, (http://en.wikipedia.org/wiki/Virtual_synchrony).
For distributed, symmetric, fault tolerant parallelism in small to medium scale with high reliability, this might be an option.
Are you working on a Twisted interface to Spread, or on a Python-Spread interface like http://zope.org/Members/tim_one/spread ?
Regards,
-- Mikhail Terekhov
Mikhail Terekhov wrote:
Are you working on a Twisted interface to Spread, or on a Python-Spread interface like http://zope.org/Members/tim_one/spread ?
I'm using http://zope.org/Members/tim_one/spread (aka the Debian package python-spread, aka the Python package SpreadModule) and twisted to produce a new, extended interface to twisted. I'm not sure how long SpreadModule will work for me, but aside from some minor questions (unicode group names, nonblocking IO, copying/subclassing its message types), it's holding up so far.
--rich
participants (9)
- Alexandre Quessy
- Christopher Armstrong
- Darren Govoni
- Drew Smathers
- Glyph Lefkowitz
- Johann Borck
- K. Richard Pixley
- Mark Wright
- Mikhail Terekhov