[Twisted-Python] Windows spawnProcess - Child process inherits files and sockets (POSIX does not)
![](https://secure.gravatar.com/avatar/3d38cd80c2089b2b28ce28ae48ad48f8.jpg?s=120&d=mm&r=g)
Hello all, I filed https://twistedmatrix.com/trac/ticket/7970 last summer while focused on a particular project that was hit by it. Since then I had to attend to other things and I'm now back to refocusing on that project. The subject says most of it, but the ticket has more information, including why this isn't an issue on Python >= 3.4 ( my project is on 2.7...) :/ Before submitting a patch for review, I'm looking for preliminary feedback, assuming you agree that the Windows vs POSIX semantics should be the same (if not, why?). My patch calls a few Windows APIs via ctypes, however, as far as I can tell, Twisted on Windows requires pywin32 and, recently, there has been some discussion around dropping that dependency and moving towards something based on cffi. What would you say the way forward is? Should I submit the patch for review anyway? Is there any other work that needs to be done first that I may contribute to? Thanks in advance for Twisted and for your feedback, -- exvito
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Feb 21, 2016, at 7:14 PM, exvito here <ex.vitorino@gmail.com> wrote:
Hello all,
I filed https://twistedmatrix.com/trac/ticket/7970 <https://twistedmatrix.com/trac/ticket/7970> last summer while focused on a particular project that was hit by it. Since then I had to attend to other things and I'm now back to refocusing on that project.
Thanks for following up. Sorry I've taken so long to respond; this is a highly complex issue that is tempting to just punt on thinking about. But we really should address it, and thank you for writing a great bug report.
Before submitting a patch for review, I'm looking for preliminary feedback, assuming you agree that the Windows vs POSIX semantics should be the same (if not, why?).
After much thought: Yes. They should be the same. The reason they're not is largely ignorance of the relevant APIs and abstractions on Windows, not any desire that they differ. The one place they have to differ a little bit is handle inheritance: we need to figure out some way to express the 'childFDs' mapping in terms of both file descriptors and handles.
My patch calls a few Windows APIs via ctypes, however, as far as I can tell, Twisted on Windows requires pywin32 and, recently, there has been some discussion around dropping that dependency and moving towards something based on cffi.
ctypes is dangerous and error-prone. If you screw up the bit-width of a type, or the type in a header changes on some future version, ctypes gives you no way of finding out until some poor user's process segfaults, and usually not at the relevant call site. So we'd prefer not to maintain more ctypes-using code. The APIs in pywin32 very closely mirror the underlying Windows API, so for addressing this issue, please just go ahead and use pywin32 APIs; porting them to a new API along with everything else should be relatively straightforward. If we do move forward with that change, we will probably use https://pypi.python.org/pypi/pywincffi and not move anything within Twisted.
What would you say the way forward is? Should I submit the patch for review anyway? Is there any other work that needs to be done first that I may contribute to?
Yes, just go ahead and write the patch.
Thanks in advance for Twisted and for your feedback,
Thanks for using Twisted, thanks doubly for trying to contribute, and very sorry for the slow response. -glyph
![](https://secure.gravatar.com/avatar/3d38cd80c2089b2b28ce28ae48ad48f8.jpg?s=120&d=mm&r=g)
On Thu, Mar 31, 2016 at 10:27 PM, Glyph <glyph@twistedmatrix.com> wrote:
Agreed. My code does not go into 'childFDs' mapping territory, though. All it does is prevent all file- and socket-handles from being inherited by the child process -- other than the pipes used for STDIO communication. My patch calls a few Windows APIs via ctypes, however, as far as I can
Agreed with your ctypes comment -- I've been hit by such faults which "magically" went away using cffi when coding against Windows TAPI. pywin32, unfortunatelly, does not include two Windows APIs (out of four) my code requires -- I just grepped the source for latest release I could find on SourceForge, 220. For completeness, the missing APIs are NtQuerySystemInformation [1] and NtQueryObject [2]. The others are GetHandleInformation [3] and SetHandleInformation [4]. What would you say the way forward is? Should I submit the patch for review
Given that pywin32 does not provide two of the required APIs, maybe this issue is somewhat blocked. Adding to that is the fact that one particular API call in my code -- NtQuerySystemInformation [1] -- is being used with what seems to be an undocumented option -- SystemHandleInformation (enum = 16) -- and returning, again, an apparently undocumented data structure -- SYSTEM_HANDLE_INFORMATION. I downloaded and installed the available SDKs and WDKs (driver dev kits) from Microsoft and could not find any reference to those particular options or data structures. My code was created after much investigation on how to obtain the list of open handles for the current process. The gist of it is: - Call NtQuerySystemInformation with the SystemInformationClass arg set to SystemHandleInformation. - This returns all (!!!) of the handles in the system (no need for special privileges). - Filter those out by the current process PID and type, such that only files and sockets are left. - Use the GetHandleInformation to get the inheritance flag and clear it with SetHandleInformation if needed. It is mostly based on SysInternals information at http://forum.sysinternals.com/howto-enumerate-handles_topic18892.html. There are many other references across the web to those undocumented options and data structures. But none of those come from formal Microsoft documents, that I could find. An alternative approach, which I also tried, for completeness sake, was to try and Get/SetHandleInformation on all possible handles -- this is completely unfeasible given that handles are at least 32 bit numbers. After all of this -- including some frustration, I confess -- I decided to go ahead and create a cffi ABI-mode variation of my original patch, anyway: it passes the same tests and, much like the ctypes approach, works nicely on my environment: Win 7 32, Win 2008R2 64, Win XP 32 (!!!), Python 2.7.11 32, Twisted 16.1, cffi 1.5.2. Just for kicks I compared the performance of the ctypes vs cffi implementation: - The ctypes code runs in 0.014s - 0.016s. - The cffi code runs in 0.03s - 0.04s. This makes sense given that the code is mostly calling out to DLLs and, AFAICT, cffi does the nice extra work of validating/converting types back and forth. Wrapping up: I'm really not sure how to more forward with this: not only does pywin32 not provide the needed APIs, but also one of those APIs -- documented -- is being used in an undocumented fashion. Even though I'd love to submit a patch, I don't think we're at that point yet. However, for posterity's sake and if anyone wants to take a look at the code, it is avalable at https://github.com/exvito/twisted in branches win32-fix-handle-inherit-cffi and win32-fix-handle-inherit-ctypes. They add two tests to twisted/test/test_process.py, one line to twisted/internet/_dumbwin32proc.py and one module named twisted/internet/_win32handleinherit.py I'd love to hear feedback or ideas on this. Thanks again [1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms724509(v=vs.85).a... [2] https://msdn.microsoft.com/en-us/library/bb432383(v=vs.85).aspx [3] https://msdn.microsoft.com/en-us/library/windows/desktop/ms724329(v=vs.85).a... [4] https://msdn.microsoft.com/en-us/library/windows/desktop/ms724935(v=vs.85).a... -- exvito
![](https://secure.gravatar.com/avatar/f921b2aa2b18f93626e92d94bf5d6b40.jpg?s=120&d=mm&r=g)
Owner of pywincffi here, I'd certainly welcome a PR or two for pywincffi with the necessary functions/constants/etc so you don't have to manage that code and I'd be happy to help write it too. I think the consensus is Twisted is going to eventually replace calls into pywin32 with calls into pywincffi rather than implement all of that code inside of Twisted itself. I've already started doing this in a couple of places, twisted.python.lockfile <https://github.com/twisted/twisted/compare/trunk...opalmer:windows-cffi> being the one I'm actively working on because it's simpler to start with. But the code inside of dumbwin32proc.py and _win32handleinherit.py are both high on my list to convert too so it probably makes sense that we work on this together if you're open to it. Anyway as for some of your questions and points I've tried to reply to some of them below. I will admit, it's slanted towards making as much of this a possible a part of pywincffi having plugged away at the 'what to do about pywin32' and 'how to make Windows API calls more pythonic' quite a bit now so keep that in mind.
Looking at your code, some of it could be put into pywincffi already. It would need more tests and some additional code so the API calls are closer to what's already in the project (type checking, default arguments, documentation, etc) but overall it seems like you've already done the major work of understanding how it all fits together. The other advantages of putting this code into pywincffi is testing and releases are easier because the project is using AppVeyor to test all PRs and build the wheel files for most major Python versions including both 32 and 64 bit variants. From Twisted's perspective, it's just a dependency on another library.
Just for kicks I compared the performance of the ctypes vs cffi implementation: ...
Have you tried a comparison between out-of-line modules and those using dlopen? I imagine they'd end up being pretty similar in the end performance wise but I am a little curious. In pywincffi I started out using dlopen but moved away from it because I needed to write some extra code which couldn't be included from a header. The other advantage I saw is that you don't have to rely on the DLL being present and/or Windows being able to locate it so you can include code which might only be available if you have some extra library installed.
It usually does handle type conversion nicely but it can also do odd things depending on the Python version (like bytes vs. unicode vs. strings) if you're not careful. The other issue is user input which can sometimes result in odd error messages at the C-level that are difficult to understand or correct without specific knowledge of the underlying code. In non-public APIs this is not as big of an issue because the code is task specific but in something like pywincffi or Twisted where people can use APIs in unexpected ways the automatic conversion/validation can be somewhat lacking (again, something I wanted to centralize and improve in pywincffi). options and data structures. But none of those come from formal Microsoft documents, that I could find. Some of the people on that site either have contacts within Microsoft or have worked for Microsoft at one point so I usually trust what's there if it's the only source. The other place I often look is the ReactOS project where they've managed to reverse engineer quite a bit of the Windows kernel which can either hint at the info you need or validate what you already know.
IMHO (again, with some bias), I think implementing the calls you need in pywincffi is the first step. If the calls are undocumented it would also be a good place to do the necessary research, testing and development I think in isolation from Twisted itself so it's clear we're going in the right direction. Once that's done a patch set for Twisted, which calls into pywincffi, can be opened and tested across the supported platforms. This makes the patch set smaller but also makes it easier to understand what if anything the new code breaks. Regardless, even if you don't want to go the route of putting this into pywincffi thanks for working on this because it helps in some of the work I'm doing too. ---Oliver On Tue, Apr 5, 2016 at 5:11 PM, exvito here <ex.vitorino@gmail.com> wrote:
![](https://secure.gravatar.com/avatar/3d38cd80c2089b2b28ce28ae48ad48f8.jpg?s=120&d=mm&r=g)
On Thu, Apr 7, 2016 at 4:57 AM, Oliver Palmer <oliverpalmer@opalmer.com> wrote:
Certainly. Thanks for reaching out.
Agreed. I'll be happy to follow your guidance in that.
I did not and I confess that I haven't fully grasped (yet) the different cffi "approaches", if I may call them so.
AFAICT, SysInternals was bought by / integrated with Microsoft: tteir tools are now available under Microsoft domains (example: Process Explorer at https://technet.microsoft.com/en-us/sysinternals/processexplorer.aspx).
Sounds like a sane and safe approach.
Thank you to, again. I will issue PRs against pywincffi such that the APIs are available. They most probably won't be up to pywincffi's standards/requirements but I expect we can cooperate under that context and go from there. -- exvito
![](https://secure.gravatar.com/avatar/3d38cd80c2089b2b28ce28ae48ad48f8.jpg?s=120&d=mm&r=g)
For completeness, if anyone wants to follow-along or contribute, first PR is https://github.com/opalmer/pywincffi/pull/66. Work will continue there. Once it becomes "Twisted ready" we'll bring the discussion back here. Thanks to all. -- exvito
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Feb 21, 2016, at 7:14 PM, exvito here <ex.vitorino@gmail.com> wrote:
Hello all,
I filed https://twistedmatrix.com/trac/ticket/7970 <https://twistedmatrix.com/trac/ticket/7970> last summer while focused on a particular project that was hit by it. Since then I had to attend to other things and I'm now back to refocusing on that project.
Thanks for following up. Sorry I've taken so long to respond; this is a highly complex issue that is tempting to just punt on thinking about. But we really should address it, and thank you for writing a great bug report.
Before submitting a patch for review, I'm looking for preliminary feedback, assuming you agree that the Windows vs POSIX semantics should be the same (if not, why?).
After much thought: Yes. They should be the same. The reason they're not is largely ignorance of the relevant APIs and abstractions on Windows, not any desire that they differ. The one place they have to differ a little bit is handle inheritance: we need to figure out some way to express the 'childFDs' mapping in terms of both file descriptors and handles.
My patch calls a few Windows APIs via ctypes, however, as far as I can tell, Twisted on Windows requires pywin32 and, recently, there has been some discussion around dropping that dependency and moving towards something based on cffi.
ctypes is dangerous and error-prone. If you screw up the bit-width of a type, or the type in a header changes on some future version, ctypes gives you no way of finding out until some poor user's process segfaults, and usually not at the relevant call site. So we'd prefer not to maintain more ctypes-using code. The APIs in pywin32 very closely mirror the underlying Windows API, so for addressing this issue, please just go ahead and use pywin32 APIs; porting them to a new API along with everything else should be relatively straightforward. If we do move forward with that change, we will probably use https://pypi.python.org/pypi/pywincffi and not move anything within Twisted.
What would you say the way forward is? Should I submit the patch for review anyway? Is there any other work that needs to be done first that I may contribute to?
Yes, just go ahead and write the patch.
Thanks in advance for Twisted and for your feedback,
Thanks for using Twisted, thanks doubly for trying to contribute, and very sorry for the slow response. -glyph
![](https://secure.gravatar.com/avatar/3d38cd80c2089b2b28ce28ae48ad48f8.jpg?s=120&d=mm&r=g)
On Thu, Mar 31, 2016 at 10:27 PM, Glyph <glyph@twistedmatrix.com> wrote:
Agreed. My code does not go into 'childFDs' mapping territory, though. All it does is prevent all file- and socket-handles from being inherited by the child process -- other than the pipes used for STDIO communication. My patch calls a few Windows APIs via ctypes, however, as far as I can
Agreed with your ctypes comment -- I've been hit by such faults which "magically" went away using cffi when coding against Windows TAPI. pywin32, unfortunatelly, does not include two Windows APIs (out of four) my code requires -- I just grepped the source for latest release I could find on SourceForge, 220. For completeness, the missing APIs are NtQuerySystemInformation [1] and NtQueryObject [2]. The others are GetHandleInformation [3] and SetHandleInformation [4]. What would you say the way forward is? Should I submit the patch for review
Given that pywin32 does not provide two of the required APIs, maybe this issue is somewhat blocked. Adding to that is the fact that one particular API call in my code -- NtQuerySystemInformation [1] -- is being used with what seems to be an undocumented option -- SystemHandleInformation (enum = 16) -- and returning, again, an apparently undocumented data structure -- SYSTEM_HANDLE_INFORMATION. I downloaded and installed the available SDKs and WDKs (driver dev kits) from Microsoft and could not find any reference to those particular options or data structures. My code was created after much investigation on how to obtain the list of open handles for the current process. The gist of it is: - Call NtQuerySystemInformation with the SystemInformationClass arg set to SystemHandleInformation. - This returns all (!!!) of the handles in the system (no need for special privileges). - Filter those out by the current process PID and type, such that only files and sockets are left. - Use the GetHandleInformation to get the inheritance flag and clear it with SetHandleInformation if needed. It is mostly based on SysInternals information at http://forum.sysinternals.com/howto-enumerate-handles_topic18892.html. There are many other references across the web to those undocumented options and data structures. But none of those come from formal Microsoft documents, that I could find. An alternative approach, which I also tried, for completeness sake, was to try and Get/SetHandleInformation on all possible handles -- this is completely unfeasible given that handles are at least 32 bit numbers. After all of this -- including some frustration, I confess -- I decided to go ahead and create a cffi ABI-mode variation of my original patch, anyway: it passes the same tests and, much like the ctypes approach, works nicely on my environment: Win 7 32, Win 2008R2 64, Win XP 32 (!!!), Python 2.7.11 32, Twisted 16.1, cffi 1.5.2. Just for kicks I compared the performance of the ctypes vs cffi implementation: - The ctypes code runs in 0.014s - 0.016s. - The cffi code runs in 0.03s - 0.04s. This makes sense given that the code is mostly calling out to DLLs and, AFAICT, cffi does the nice extra work of validating/converting types back and forth. Wrapping up: I'm really not sure how to more forward with this: not only does pywin32 not provide the needed APIs, but also one of those APIs -- documented -- is being used in an undocumented fashion. Even though I'd love to submit a patch, I don't think we're at that point yet. However, for posterity's sake and if anyone wants to take a look at the code, it is avalable at https://github.com/exvito/twisted in branches win32-fix-handle-inherit-cffi and win32-fix-handle-inherit-ctypes. They add two tests to twisted/test/test_process.py, one line to twisted/internet/_dumbwin32proc.py and one module named twisted/internet/_win32handleinherit.py I'd love to hear feedback or ideas on this. Thanks again [1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms724509(v=vs.85).a... [2] https://msdn.microsoft.com/en-us/library/bb432383(v=vs.85).aspx [3] https://msdn.microsoft.com/en-us/library/windows/desktop/ms724329(v=vs.85).a... [4] https://msdn.microsoft.com/en-us/library/windows/desktop/ms724935(v=vs.85).a... -- exvito
![](https://secure.gravatar.com/avatar/f921b2aa2b18f93626e92d94bf5d6b40.jpg?s=120&d=mm&r=g)
Owner of pywincffi here, I'd certainly welcome a PR or two for pywincffi with the necessary functions/constants/etc so you don't have to manage that code and I'd be happy to help write it too. I think the consensus is Twisted is going to eventually replace calls into pywin32 with calls into pywincffi rather than implement all of that code inside of Twisted itself. I've already started doing this in a couple of places, twisted.python.lockfile <https://github.com/twisted/twisted/compare/trunk...opalmer:windows-cffi> being the one I'm actively working on because it's simpler to start with. But the code inside of dumbwin32proc.py and _win32handleinherit.py are both high on my list to convert too so it probably makes sense that we work on this together if you're open to it. Anyway as for some of your questions and points I've tried to reply to some of them below. I will admit, it's slanted towards making as much of this a possible a part of pywincffi having plugged away at the 'what to do about pywin32' and 'how to make Windows API calls more pythonic' quite a bit now so keep that in mind.
Looking at your code, some of it could be put into pywincffi already. It would need more tests and some additional code so the API calls are closer to what's already in the project (type checking, default arguments, documentation, etc) but overall it seems like you've already done the major work of understanding how it all fits together. The other advantages of putting this code into pywincffi is testing and releases are easier because the project is using AppVeyor to test all PRs and build the wheel files for most major Python versions including both 32 and 64 bit variants. From Twisted's perspective, it's just a dependency on another library.
Just for kicks I compared the performance of the ctypes vs cffi implementation: ...
Have you tried a comparison between out-of-line modules and those using dlopen? I imagine they'd end up being pretty similar in the end performance wise but I am a little curious. In pywincffi I started out using dlopen but moved away from it because I needed to write some extra code which couldn't be included from a header. The other advantage I saw is that you don't have to rely on the DLL being present and/or Windows being able to locate it so you can include code which might only be available if you have some extra library installed.
It usually does handle type conversion nicely but it can also do odd things depending on the Python version (like bytes vs. unicode vs. strings) if you're not careful. The other issue is user input which can sometimes result in odd error messages at the C-level that are difficult to understand or correct without specific knowledge of the underlying code. In non-public APIs this is not as big of an issue because the code is task specific but in something like pywincffi or Twisted where people can use APIs in unexpected ways the automatic conversion/validation can be somewhat lacking (again, something I wanted to centralize and improve in pywincffi). options and data structures. But none of those come from formal Microsoft documents, that I could find. Some of the people on that site either have contacts within Microsoft or have worked for Microsoft at one point so I usually trust what's there if it's the only source. The other place I often look is the ReactOS project where they've managed to reverse engineer quite a bit of the Windows kernel which can either hint at the info you need or validate what you already know.
IMHO (again, with some bias), I think implementing the calls you need in pywincffi is the first step. If the calls are undocumented it would also be a good place to do the necessary research, testing and development I think in isolation from Twisted itself so it's clear we're going in the right direction. Once that's done a patch set for Twisted, which calls into pywincffi, can be opened and tested across the supported platforms. This makes the patch set smaller but also makes it easier to understand what if anything the new code breaks. Regardless, even if you don't want to go the route of putting this into pywincffi thanks for working on this because it helps in some of the work I'm doing too. ---Oliver On Tue, Apr 5, 2016 at 5:11 PM, exvito here <ex.vitorino@gmail.com> wrote:
![](https://secure.gravatar.com/avatar/3d38cd80c2089b2b28ce28ae48ad48f8.jpg?s=120&d=mm&r=g)
On Thu, Apr 7, 2016 at 4:57 AM, Oliver Palmer <oliverpalmer@opalmer.com> wrote:
Certainly. Thanks for reaching out.
Agreed. I'll be happy to follow your guidance in that.
I did not and I confess that I haven't fully grasped (yet) the different cffi "approaches", if I may call them so.
AFAICT, SysInternals was bought by / integrated with Microsoft: tteir tools are now available under Microsoft domains (example: Process Explorer at https://technet.microsoft.com/en-us/sysinternals/processexplorer.aspx).
Sounds like a sane and safe approach.
Thank you to, again. I will issue PRs against pywincffi such that the APIs are available. They most probably won't be up to pywincffi's standards/requirements but I expect we can cooperate under that context and go from there. -- exvito
![](https://secure.gravatar.com/avatar/3d38cd80c2089b2b28ce28ae48ad48f8.jpg?s=120&d=mm&r=g)
For completeness, if anyone wants to follow-along or contribute, first PR is https://github.com/opalmer/pywincffi/pull/66. Work will continue there. Once it becomes "Twisted ready" we'll bring the discussion back here. Thanks to all. -- exvito
participants (3)
-
exvito here
-
Glyph
-
Oliver Palmer