Re: [Python-Dev] Issue 13524: subprocess on Windows

I tripped over this one trying to make one of our Python at work Windows compatible. We had no idea that a magic 'SystemRoot' environment variable would be required, and it was causing issues for pyzmq.
It might be nice to reflect the findings of this email thread on the subprocess documentation page:
http://docs.python.org/library/subprocess.html
Currently the docs mention this:
"Note If specified, env must provide any variables required for the program to execute. On Windows, in order to run a side-by-side assembly the specified env must include a valid SystemRoot."
How about rewording that to:
"Note If specified, env must provide any variables required for the program to execute. On Windows, a valid SystemRoot environment variable is required for some Python libraries such as the 'random' module. Also, in order to run a side-by-side assembly the specified env must include a valid SystemRoot."

On Mar 21, 2012, at 4:38 PM, Brad Allen wrote:
> I tripped over this one trying to make one of our Python at work Windows compatible. We had no idea that a magic 'SystemRoot' environment variable would be required, and it was causing issues for pyzmq.
> It might be nice to reflect the findings of this email thread on the subprocess documentation page:
> http://docs.python.org/library/subprocess.html
> Currently the docs mention this:
> "Note If specified, env must provide any variables required for the program to execute. On Windows, in order to run a side-by-side assembly the specified env must include a valid SystemRoot."
> How about rewording that to:
> "Note If specified, env must provide any variables required for the program to execute. On Windows, a valid SystemRoot environment variable is required for some Python libraries such as the 'random' module. Also, in order to run a side-by-side assembly the specified env must include a valid SystemRoot."
Also, in order to execute in any installation environment where libraries are found in non-default locations, you will need to set LD_LIBRARY_PATH. Oh, and you will also need to set $PATH on UNIX so that libraries can find their helper programs and %PATH% on Windows so that any compiled dynamically-loadable modules and/or DLLs can be loaded. And by the way you will also need to relay DYLD_LIBRARY_PATH if you did a UNIX-style build on OS X, not LD_LIBRARY_PATH. Don't forget that you probably also need PYTHONPATH to make sure any subprocess environments can import the same modules as their parent. Not to mention SSH_AUTH_SOCK if your application requires access to _remote_ process spawning, rather than just local. Oh and DISPLAY in case your subprocesses need GUI support from an X11 program (which sometimes you need just to initialize certain libraries which don't actually do anything with a GUI). Oh and __CF_USER_TEXT_ENCODING is important sometimes too, don't forget that. And if your subprocess is in Perl or Ruby or Java you may need a couple dozen other variables which your deployment environment has set for you too. Did I mention CFLAGS or LC_ALL yet? Let me tell you a story about this one HP/UX machine...
Ahem.
Bottom line: it seems like screwing with the process spawning environment to make it minimal is a good idea for simplicity, for security, and for modularity. But take it from me, it isn't. I guarantee you that you don't actually know what is in your operating system's environment, and initializing it is a complicated many-step dance which some vendor or sysadmin or product integrator figured out how to do much better than your hapless Python program can.
%SystemRoot% is just the tip of a very big, very nasty iceberg. Better not to keep refining why exactly it's required, or someone will eventually be adding a new variable (starting with %APPDATA% and %HOMEPATH%) that can magically cause your subprocess not to spawn properly to this page every six months for eternity. If you're spawning processes as a regular user, you should just take the environment you're given, perhaps with a few specific light additions whose meaning you understand. If you're spawning a process as an administrator or root, you should probably initialize the environment for the user you want to spawn that process as using an OS-specific mechanism like login(1). (Sorry that I don't know the Windows equivalent.)
-glyph
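The "take the environment you're given, perhaps with a few specific light additions" advice can be sketched roughly as follows. This is only an illustration, not anything proposed in the thread: the variable name MYAPP_MODE is made up, and subprocess.run with capture_output/text assumes Python 3.7 or later.

```python
import os
import subprocess
import sys

# Start from a copy of the parent's full environment rather than an empty dict.
child_env = os.environ.copy()

# Add only the variables whose meaning we understand.
# MYAPP_MODE is a hypothetical name used for illustration.
child_env["MYAPP_MODE"] = "test"

# The child sees everything the parent had, plus the addition.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['MYAPP_MODE'])"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Because os.environ.copy() returns a plain dict, the addition affects only the child and leaves the parent's own environment untouched.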

On Thu, Mar 22, 2012 at 2:35 PM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:
> Also, in order to execute in any installation environment where libraries are found in non-default locations, you will need to set LD_LIBRARY_PATH. Oh, and you will also need to set $PATH on UNIX so that libraries can find their helper programs and %PATH% on Windows so that any compiled dynamically-loadable modules and/or DLLs can be loaded. And by the way you will also need to relay DYLD_LIBRARY_PATH if you did a UNIX-style build on OS X, not LD_LIBRARY_PATH. Don't forget that you probably also need PYTHONPATH to make sure any subprocess environments can import the same modules as their parent. Not to mention SSH_AUTH_SOCK if your application requires access to _remote_ process spawning, rather than just local. Oh and DISPLAY in case your subprocesses need GUI support from an X11 program (which sometimes you need just to initialize certain libraries which don't actually do anything with a GUI). Oh and __CF_USER_TEXT_ENCODING is important sometimes too, don't forget that. And if your subprocess is in Perl or Ruby or Java you may need a couple dozen other variables which your deployment environment has set for you too. Did I mention CFLAGS or LC_ALL yet? Let me tell you a story about this one HP/UX machine...
> Ahem.
> Bottom line: it seems like screwing with the process spawning environment to make it minimal is a good idea for simplicity, for security, and for modularity. But take it from me, it isn't. I guarantee you that you don't actually know what is in your operating system's environment, and initializing it is a complicated many-step dance which some vendor or sysadmin or product integrator figured out how to do much better than your hapless Python program can.
> %SystemRoot% is just the tip of a very big, very nasty iceberg. Better not to keep refining why exactly it's required, or someone will eventually be adding a new variable (starting with %APPDATA% and %HOMEPATH%) that can magically cause your subprocess not to spawn properly to this page every six months for eternity. If you're spawning processes as a regular user, you should just take the environment you're given, perhaps with a few specific light additions whose meaning you understand. If you're spawning a process as an administrator or root, you should probably initialize the environment for the user you want to spawn that process as using an OS-specific mechanism like login(1). (Sorry that I don't know the Windows equivalent.)
Thanks, Glyph. In that case maybe the Python subprocess docs need not single out SystemRoot, but instead plaster a big warning around the use of the 'env' parameter.
Here is what the docs currently state for the Popen constructor 'env' parameter:
If env is not None, it must be a mapping that defines the environment variables for the new process; these are used instead of inheriting the current process’ environment, which is the default behavior.
Note: If specified, env must provide any variables required for the program to execute. On Windows, in order to run a side-by-side assembly the specified env must include a valid SystemRoot.
The "Note" section could instead state something like: "In most cases, the child process will need many of the same environment variables as the current process. Usually the safest course of action is to build the env dict to contain all the same keys and values from os.environ. For example... <insert Glyph's examples here>"
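To make the "used instead of inheriting" wording in the quoted docs concrete, here is a small sketch of how an explicit env replaces the parent's environment rather than merging with it. The marker name DEMO_MARKER and the helper run_child are hypothetical, and the code assumes Python 3.7 or later for capture_output and text.

```python
import os
import subprocess
import sys

# Child program: report whether a marker variable is visible.
CHILD = "import os; print(os.environ.get('DEMO_MARKER', 'missing'))"


def run_child(env=None):
    """Run the child with the given env and return its stripped stdout."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


# DEMO_MARKER is a made-up name, set in the parent for the demonstration.
os.environ["DEMO_MARKER"] = "from-parent"

# env=None (the default): the child inherits the parent's environment.
inherited = run_child()

# An explicit env replaces the inherited one entirely, so anything
# left out of the mapping is simply absent in the child.
trimmed = {k: v for k, v in os.environ.items() if k != "DEMO_MARKER"}
replaced = run_child(env=trimmed)

print(inherited, replaced)
```

The sketch drops only one known variable so that the child still starts reliably. Dropping everything else as well is where the platform-specific failures discussed in this thread tend to appear.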

On Mar 23, 2012, at 1:26 PM, Brad Allen wrote:
> Thanks, Glyph. In that case maybe the Python subprocess docs need not single out SystemRoot, but instead plaster a big warning around the use of the 'env' parameter.
I agree. I'm glad that my bitter experience here might be useful to someone in the future - all those late nights trying desperately to get my unit tests to run on some newly configured, slightly weird buildbot didn't go to waste :).
> The "Note" section could instead state something like: "In most cases, the child process will need many of the same environment variables as the current process. Usually the safest course of action is to build the env dict to contain all the same keys and values from os.environ. For example... <insert Glyph's examples here>"
I think including all the examples might be overstating the case. It is probably best to say that other operating systems, vendors, and integration tools may set necessary environment variables that there is no way for you to be aware of in advance, unless you are an expert sysadmin on every platform where you expect your code to run, and that many of these variables are required for libraries to function properly, both libraries bundled with python and those from third parties.
-glyph

On Fri, Mar 23, 2012 at 3:46 PM, Glyph <glyph@twistedmatrix.com> wrote:
> On Mar 23, 2012, at 1:26 PM, Brad Allen wrote:
>> Thanks, Glyph. In that case maybe the Python subprocess docs need not single out SystemRoot, but instead plaster a big warning around the use of the 'env' parameter.
> I agree. I'm glad that my bitter experience here might be useful to someone in the future - all those late nights trying desperately to get my unit tests to run on some newly configured, slightly weird buildbot didn't go to waste :).
Ok, I'll open a ticket on the bugtracker for this over the weekend.