[Closing the loop for future contextual searches]

 

This solved the problem.  Moving all Twisted reactor-related imports inside the overridden multiprocessing.Process.run() methods allows a single controller process to manage many Twisted reactors running in child processes.
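For anyone landing here from a search, the shape of the fix is roughly the following.  This is a minimal sketch rather than my actual code: the ReactorWorker class, the start_workers() helper, the echo protocol, and the port numbers are all made up for illustration.  The only part that matters is that the Twisted imports live inside run(), so they execute in the child and never in the parent.

```python
import multiprocessing


class ReactorWorker(multiprocessing.Process):
    """Child process that imports and runs its own Twisted reactor."""

    def __init__(self, port):
        super().__init__()
        self.port = port  # hypothetical: TCP port this child listens on

    def run(self):
        # Imported here, inside the child, so the parent process never
        # loads (and therefore never instantiates) a reactor.
        from twisted.internet import protocol, reactor

        class Echo(protocol.Protocol):
            def dataReceived(self, data):
                self.transport.write(data)

        reactor.listenTCP(self.port, protocol.Factory.forProtocol(Echo))
        reactor.run()


def start_workers(ports):
    """Start one reactor-owning child per port and return the processes."""
    workers = [ReactorWorker(port) for port in ports]
    for worker in workers:
        worker.start()
    return workers
```

The controller would call something like start_workers([8001, 8002]) from under an `if __name__ == "__main__":` guard (needed where multiprocessing spawns rather than forks, as on Windows) and later join() or terminate() the children.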

 

Thanks again.

-Chris

 

 

From: Twisted-Python <twisted-python-bounces@twistedmatrix.com> On Behalf Of chris@cmsconstruct.com
Sent: Saturday, September 12, 2020 12:07 PM
To: 'Twisted general discussion' <twisted-python@twistedmatrix.com>
Subject: Re: [Twisted-Python] doWrite on twisted.internet.tcp.Port

 

Hi Jean-Paul,

 

Thank you very much for the detailed answer.  And my apologies for not providing OS details; I’ve tested on CentOS and Red Hat EL variants, not FreeBSD as the ticket discussed.  It looks like Red Hat (EL 7.6) is using the epoll reactor, and the Windows side is using the select reactor.

 

Thanks for the direction on checking out sys.modules.  To avoid the reactor being loaded in the parent process, I can presumably move the Twisted imports within the multiprocessing child modules (from the top of the module down into the run() functions).  I will see how far I need to go (e.g. whether I can continue using Twisted’s JSON logging, or whether absolutely everything should be isolated until after child process startup).  But knowing that I need to head in that direction, for epoll or other potential reactor conflicts, is very helpful.

 

Reminds me of the GI Joe cartoon in the early 1980s that would end with, “knowing is half the battle.”

 

-Chris

 

 

From: Twisted-Python <twisted-python-bounces@twistedmatrix.com> On Behalf Of Jean-Paul Calderone
Sent: Friday, September 11, 2020 1:28 PM
To: Twisted general discussion <twisted-python@twistedmatrix.com>
Subject: Re: [Twisted-Python] doWrite on twisted.internet.tcp.Port

 

On Fri, Sep 11, 2020 at 1:34 PM <chris@cmsconstruct.com> wrote:

Hey guys,

 

Last year I hit a condition discussed in this ticket: https://twistedmatrix.com/trac/ticket/4759 for doWrite called on a twisted.internet.tcp.Port. 

 

I ignored it at the time since it was just on Linux, and my main platform was Windows.  Now I’m coming back to it.  I’ll add context on the problem below, but first I want to ask a high-level, design-type question with multiprocessing and Twisted:

 

Referencing Jean-Paul’s comment at the end of ticket 4759, I read that you shouldn’t fork a process (multiprocessing module) that already has a Twisted reactor.  Understood.  But what about a parent process (not doing anything Twisted) forking child processes, where each child process starts its own Twisted reactor?  Is that intended to work from the Twisted perspective?

 

To answer the asked question, I don't think there is rigorous (or even casual) testing of very much of Twisted in the context of "some Twisted code has been loaded into memory and then the process forked".  So while it seems like a reasonable thing, I wouldn't say there's currently much effort being put towards making it a supported usage of Twisted.  Of course this can change at almost any moment if someone decides to commit the effort.

 

To dig a bit further into the specific problem, even if you only import the reactor in the parent process and then fork a child and try to start the reactor in the child, I strongly suspect epollreactor will break.  This is because the epoll object is created by reactor instantiation (as opposed to being delayed until the reactor is run).  epoll objects have a lot of weird behavior.  See the Questions and Answers section of the epoll(7) man page.

 

I don't know if this is the cause of your particular expression of these symptoms (it certainly doesn't apply to the original bug report which is on FreeBSD where there is no epoll) but it's at least a possible cause.

 

This could probably be fixed in Twisted by only creating the epoll object when run is called.  There's nothing particularly difficult about that change but it does involve touching a lot of the book-keeping logic since that all assumes it can register file descriptors before the reactor is started (think reactor.listenTCP(...); reactor.run()).

 

I'm not sure, but it may be the case that delaying only the creation of the waker until the reactor starts would also fix this.  This is because, as long as the epoll object remains empty, a lot of the weird behavior is avoided, and the waker is probably the only thing that actually gets added to it if you're just importing the reactor but not running it before forking.

 

Alternatively, your application should be able to fix it by studiously avoiding the import of twisted.internet.reactor (directly or transitively, of course).  You could add some kind of assertion about the state of sys.modules immediately before your forking code to develop some confidence you've managed this.
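A sketch of what such an assertion might look like follows.  The helper name assert_reactor_not_loaded() and its optional modules argument are invented for illustration; the check itself relies only on the fact that importing the reactor leaves an entry named "twisted.internet.reactor" in sys.modules.

```python
import sys

REACTOR_MODULE = "twisted.internet.reactor"


def assert_reactor_not_loaded(modules=None):
    """Raise AssertionError if the Twisted reactor has already been imported.

    `modules` defaults to sys.modules; passing a dict is a hypothetical
    convenience so the check can be exercised without Twisted installed.
    """
    modules = sys.modules if modules is None else modules
    if REACTOR_MODULE in modules:
        raise AssertionError(
            REACTOR_MODULE + " was imported before forking; "
            "look for a direct or transitive import in the parent process"
        )
```

Calling assert_reactor_not_loaded() immediately before each Process.start() (or other forking code) would catch a transitive import as soon as it is introduced.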

 

And if this is really an epoll problem then switching to poll or select reactor would also presumably get rid of the issue.

 

Jean-Paul