[Twisted-Python] epoll keep sharing state between process even after fork.
Hi everybody,

I came across a surprising problem when using the epoll-based reactor (ticket here: https://twistedmatrix.com/trac/ticket/6796).

As you can see on the ticket, the epoll object seems to share some state even after fork. This means that even after the process has forked, changes made to the inherited epoll object in one process may affect the one in another process!

This problem is specific to the epoll-based reactor; poll and select behave correctly. I'm aware that some may say this is not a Twisted problem (but an epoll issue), but I'm asking here to figure out what the best workaround would be in a Twisted-based project.

For information, this is what I have already tried:

- I thought about using the poll or select reactor, but this is not possible: because I'm writing a library, I do not decide which reactor will be installed by the person who wants to use it. (Besides, as epoll is now the default reactor installed, I certainly want my library to work with it!)
- Using spawnProcess, as mentioned in the ticket comment, is not an option either; in my case, I need to share some state between the main process and the subprocesses. (I have an object in the main process space that I want all subprocesses to inherit.)
- I (desperately) tried to re-initialize the reactor._poller object after each fork to set it to a new object, but that was just a very bad idea! :)

Thank you in advance for any possible clue!

(For information, my project is here: https://github.com/Grindizer/scaletix)
On 23/10/13 16:46, Flint wrote:
Hi everybody
I came across a surprising problem when using the epoll-based reactor (ticket here: https://twistedmatrix.com/trac/ticket/6796).
As you can see on the ticket, the epoll object seems to share some state even after fork. This means that even after the process has forked, changes made to the inherited epoll object in one process may affect the one in another process!
This problem is specific to the epoll-based reactor; poll and select behave correctly. I'm aware that some may say this is not a Twisted problem (but an epoll issue), but I'm asking here to figure out what the best workaround would be in a Twisted-based project.
The problem is that you're sharing the epoll object (and thus the underlying epoll FD, and associated state) with the multiprocessing child processes. Either:

1. Don't use multiprocessing
2. Arrange for the epoll object (or FD) to be closed after fork, but before exec, so that the child process can't fiddle with it
3. Create the epoll object after starting the multiprocessing children

IMHO Twisted and multiprocessing are not good together.
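The sharing described above can be reproduced without Twisted. What follows is only a minimal sketch, assuming Linux (where `select.epoll` is available) and Python 3; the function name `events_after_child_unregisters` is made up for illustration and is not taken from the ticket. The child unregisters a descriptor from the inherited epoll object, and the parent then stops receiving events for it.

```python
import os
import select


def events_after_child_unregisters():
    """Return what the parent's epoll reports after a forked child
    unregisters the pipe from the inherited epoll object."""
    read_fd, write_fd = os.pipe()
    poller = select.epoll()
    poller.register(read_fd, select.EPOLLIN)
    try:
        pid = os.fork()
        if pid == 0:
            # Child: the epoll FD refers to the same kernel object as the
            # parent's, so this also removes the parent's registration.
            try:
                poller.unregister(read_fd)
            finally:
                os._exit(0)
        os.waitpid(pid, 0)
        # Parent: make the pipe readable, then ask epoll about it.
        os.write(write_fd, b"x")
        return poller.poll(0.1)
    finally:
        poller.close()
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(events_after_child_unregisters())
```

On Linux the parent's poll is expected to come back empty even though the pipe is readable, which is the cross-process interference reported in the ticket. Closing the epoll object in the child right after fork (option 2) avoids touching the shared interest list at all.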
On 23/10/13 17:39, Phil Mayers wrote:
2. Arrange for the epoll object (or FD) to be closed after fork, but before exec, so that the child process can't fiddle with it
See also: http://bugs.python.org/issue8713 ...which suggests Python 3.4 added fork+exec support to multiprocessing. On Unix and older Python versions, you're stuck with plain fork and all the attendant horribleness. This is a multiprocessing bug IMHO.
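For readers on Python 3.4 or later, the fork+exec style start method can be requested explicitly. This is a rough sketch only: `multiprocessing.get_context("spawn")` is the standard-library call, while `square`, `spawn_context` and `run_demo` are names invented here for illustration.

```python
import multiprocessing


def square(n):
    # Must be a module-level function so a spawned child can import it.
    return n * n


def spawn_context():
    # "spawn" starts a fresh interpreter instead of a bare fork(), so the
    # child does not inherit the parent's epoll object or reactor state.
    return multiprocessing.get_context("spawn")


def run_demo():
    ctx = spawn_context()
    with ctx.Pool(2) as pool:
        return pool.map(square, [1, 2, 3])
```

With the spawn method, `run_demo()` should be called from under an `if __name__ == "__main__":` guard in a real script, because each child re-imports the main module.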
On 10/23/2013 12:50 PM, Phil Mayers wrote:
This is a multiprocessing bug IMHO.
This issue with multiprocessing appears in other places too. E.g. if you're using stdlib logging, child processes will try to rotate the parent process logs. Basically multiprocessing on Unix is utterly broken and should never be used (except in the fork+exec form in Python 3.4). -Itamar
On 11:19 am, itamar@itamarst.org wrote:
On 10/23/2013 12:50 PM, Phil Mayers wrote:
This is a multiprocessing bug IMHO.
This issue with multiprocessing appears in other places too. E.g. if you're using stdlib logging, child processes will try to rotate the parent process logs.
Basically multiprocessing on Unix is utterly broken and should never be used (except in the fork+exec form in Python 3.4).
To expand on that just a bit, the form of sharing that you get when you fork() but you don't exec() is very difficult to use correctly (I think it's an open question whether it's *possible* to use correctly in a Python program).
The argument here is similar to the argument against shared-everything multithreading. While memory (and some other per-process state) is no longer shared after fork(), *some* per-process state is still shared. And all of the state that isn't shared is still a potential source of bugs since it's almost certainly the case that none of it cooperated with the fork() call - a call which happened at some arbitrary time and captured a snapshot of all the state in memory at an arbitrary point.
Consider a simple implementation of a lock file, used to prevent multiple instances of a program from starting. There are several ways fork() could break such code. Perhaps it is partway through acquiring a lock on the lock file when the fork() occurs. Perhaps the result is that the file ends up locked but no process thinks it is holding the lock. Now no instances of the program are running. Or perhaps the lock is held when fork() happens and the problem only surfaces at unlock time. Perhaps one of the processes exits and releases the lock. Now the program is still running but the lock isn't held.
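One concrete variant of the unlock-time failure can be sketched as follows. It rests on assumptions that are not in the message above: the lock file is protected with `flock()` on a Unix system, and the forked child runs cleanup code that unlocks explicitly. The function name is hypothetical.

```python
import fcntl
import os
import tempfile


def parent_still_holds_lock_after_child_unlock():
    """Parent takes an flock() lock, forks, and the child unlocks.
    Returns True if the lock is still held afterwards."""
    fd, path = tempfile.mkstemp()
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        pid = os.fork()
        if pid == 0:
            # Child: inherited fd shares the parent's open file description,
            # so unlocking here drops the parent's lock as well.
            try:
                fcntl.flock(fd, fcntl.LOCK_UN)
            finally:
                os._exit(0)
        os.waitpid(pid, 0)
        # Probe through an independent open of the same file, as a second
        # instance of the program would.
        probe = os.open(path, os.O_RDWR)
        try:
            fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        else:
            return False
        finally:
            os.close(probe)
    finally:
        os.close(fd)
        os.unlink(path)
```

Under these assumptions the function returns False: the parent is still running and believes it holds the lock, yet a second instance could now acquire it.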
And that's just one of the simplest possible examples of how things can go wrong.
The nearly uncountable different ways for failures to creep in and the resulting impracticality (if not impossibility) of being able to test that Twisted (or any Python library) actually works when fork() is used means that it's not likely Twisted will ever be declared compatible with any fork()-without-exec() usage.
You can find some examples of Twisted-using applications that run multiple processes, though. Apple CalendarServer does it by passing file descriptors to worker processes and sends them the location of a configuration file describing how they should behave. Divmod Mantissa does it by inserting self-describing work into a SQLite3 database. When the worker process finds one of these, it knows what code to load and run by looking at the fields in the row. These are variations on a theme - RPC, not shared (or duplicated) memory.
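The "self-describing work in a database" idea might look roughly like the sketch below. This is not Mantissa's actual schema or code; the table layout, the `TASKS` registry and all function names are assumptions made for illustration, using only the standard-library `sqlite3` and `json` modules.

```python
import json
import sqlite3

# Hypothetical registry mapping a task name stored in the row to the code
# a worker should run for it.
TASKS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}


def open_queue(path=":memory:"):
    # In real use, path would be a file shared by the main process and workers.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS work ("
        "id INTEGER PRIMARY KEY, task TEXT NOT NULL, "
        "args TEXT NOT NULL, result TEXT)"
    )
    return conn


def enqueue(conn, task, *args):
    # The row itself says what to run and with which arguments.
    with conn:
        cur = conn.execute(
            "INSERT INTO work (task, args) VALUES (?, ?)",
            (task, json.dumps(args)),
        )
    return cur.lastrowid


def run_pending(conn):
    # Worker side: look at each unfinished row and dispatch on its fields.
    rows = conn.execute(
        "SELECT id, task, args FROM work WHERE result IS NULL ORDER BY id"
    ).fetchall()
    results = {}
    with conn:
        for row_id, task, args in rows:
            value = TASKS[task](*json.loads(args))
            conn.execute(
                "UPDATE work SET result = ? WHERE id = ?",
                (json.dumps(value), row_id),
            )
            results[row_id] = value
    return results
```

Nothing in memory is shared here: the main process and the worker only agree on the row format, which is what makes this a form of RPC.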
Hope this helps, Jean-Paul
Thanks a lot. I'll probably rethink everything in my project, hehe, but I'm glad I asked!
Hi, this is funny, I am also working in this area currently. Though I use spawnProcess for various reasons, not multiprocessing - which is also the recommendation on the ticket. And it makes sense. Nevertheless I'd be interested what happens if you try that on a kqueue-reactor OS .. ideally FreeBSD. OSX kqueue isn't the greatest. /Tobias
On 03:46 pm, grindizer@gmail.com wrote:
Hi everybody
I came across a surprising problem when using the epoll-based reactor (ticket here: https://twistedmatrix.com/trac/ticket/6796).
As you can see on the ticket, the epoll object seems to share some state even after fork.
[snip]
- Using spawnProcess, as mentioned in the ticket comment, is not an option either; in my case, I need to share some state between the main process and the subprocesses. (I have an object in the main process space that I want all subprocesses to inherit.)
This doesn't really explain why you can't use `spawnProcess`. There are other ways to share state between processes. Perhaps if you describe the object you have someone can suggest a way to share it that will still satisfy your requirements without requiring that you use the `multiprocessing` module. Jean-Paul
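As one illustration of sharing state without inheriting memory, the object can be serialized and handed to a freshly executed child. The sketch below uses the standard-library `subprocess` module as a stand-in for `spawnProcess` so that it runs without a reactor; the JSON format, `CHILD_SOURCE` and `run_child_with_state` are assumptions for illustration, and it only fits state that can be serialized.

```python
import json
import subprocess
import sys

# Source of a hypothetical child program: it reads the state from stdin,
# marks it, and writes it back on stdout.
CHILD_SOURCE = """
import json, sys
state = json.load(sys.stdin)
state["seen_by_child"] = True
json.dump(state, sys.stdout)
"""


def run_child_with_state(state):
    # fork+exec of a new interpreter: the child gets a copy of the state
    # over a pipe and nothing else from the parent's memory.
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps(state),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

In a Twisted program the same exchange would go through `reactor.spawnProcess` with a `ProcessProtocol` writing to the child's stdin, so the parent never blocks while waiting for the child.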
participants (5): exarkun@twistedmatrix.com, Flint, Itamar Turner-Trauring, Phil Mayers, Tobias Oberstein