Portable "spawn" module for core?

Hi all --

it recently occurred to me that the 'spawn' module I wrote for the Distutils (and which Perry Stoll extended to handle NT) could fit nicely in the core library. On Unix, it's just a front-end to fork-and-exec; on NT, it's a front-end to spawnv(). In either case, it's just enough code (and just tricky enough code) that not everybody should have to duplicate it for their own uses.

The basic idea is this:

    from spawn import spawn
    ...
    spawn (['cmd', 'arg1', 'arg2'])
    # or
    spawn (['cmd'] + args)

you get the idea: it takes a *list* representing the command to spawn: no strings to parse, no shells to get in the way, no sneaky meta-characters ruining your day, draining your efficiency, or compromising your security. (Conversely, no pipelines, redirection, etc.)

The 'spawn()' function just calls '_spawn_posix()' or '_spawn_nt()' depending on os.name. Additionally, it takes a couple of optional keyword arguments (all booleans): 'search_path', 'verbose', and 'dry_run', which do pretty much what you'd expect.

The module as it's currently in the Distutils code is attached. Let me know what you think...

        Greg
-- 
Greg Ward - software developer                    gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913
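To illustrate the "no shells to get in the way" point with today's standard library, here is a minimal sketch. It uses `subprocess.run`, which is not part of the proposal above, and `echo_argument` is a hypothetical helper name. The child simply writes back its first argument, so an argument full of shell meta-characters should arrive unchanged because no shell ever parses the command list.

```python
import subprocess
import sys


def echo_argument(arg):
    """Run a child Python process that writes its first argument back.

    The command is passed as a list, so no shell parses 'arg'.
    (Hypothetical helper, for illustration only.)
    """
    cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.argv[1])",
        arg,
    ]
    # check=True raises CalledProcessError if the child exits non-zero.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # The semicolon, pipe and redirection are just characters here.
    print(echo_argument("$HOME; echo pwned | cat > /dev/null"))
```

The flip side is the one noted above: with a plain argument list there are no pipelines or redirections either, unless the caller sets them up explicitly.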

    Greg> it recently occurred to me that the 'spawn' module I wrote for the
    Greg> Distutils (and which Perry Stoll extended to handle NT), could fit
    Greg> nicely in the core library.

How's spawn.spawn semantically different from the Windows-dependent os.spawn? How are stdout/stdin/stderr connected to the child process - just like fork+exec or something slightly higher level like os.popen?

If it's semantically like os.spawn and a little bit higher level abstraction than fork+exec, I'd vote for having the os module simply import it:

    from spawn import spawn

and thus make that function more widely available...

    Greg> The module as it's currently in the Distutils code is attached.

Not in the message I saw...

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/~skip/
847-971-7098   | Python: Programming the way Guido indented...

On 30 August 1999, Skip Montanaro said:
My understanding (purely from reading Perry's code!) is that the Windows spawnv() and spawnve() calls require the full path of the executable, and there is no spawnvp(). Hence, the bulk of Perry's '_spawn_nt()' function is code to search the system path if the 'search_path' flag is true. In '_spawn_posix()', I just use either 'execv()' or 'execvp()' for this. The bulk of my code is the complicated dance required to wait for a fork'ed child process to finish.
> How are stdout/stdin/stderr connected to the child process - just like fork+exec or something slightly higher level like os.popen?
Just like fork 'n exec -- '_spawn_posix()' is just a front end to fork and exec (either execv or execvp). In a previous life, I *did* implement a spawning module for a certain other popular scripting language that handles redirection and capturing (backticks in the shell and that other scripting language). It was a lot of fun, but pretty hairy. Took three attempts gradually developed over two years to get it right in the end. In fact, it does all the easy stuff that a Unix shell does in spawning commands, ie. search the path, fork 'n exec, and redirection and capturing. Doesn't handle the tricky stuff, ie. pipelines and job control. The documentation for this module is 22 pages long; the code is 600+ lines of somewhat tricky Perl (1300 lines if you leave in comments and blank lines). That's why the Distutils spawn module doesn't do anything with std{out,err,in}.
So os.spawnv and os.spawnve would be Windows-specific, but os.spawn portable? Could be confusing. And despite the recent extended discussion of the os module, I'm not sure if this fits the model.

BTW, is there anything like this on the Mac? On what other OSs does it even make sense to talk about programs spawning other programs? (Surely those GUI user interfaces have to do *something*...)

        Greg
-- 
Greg Ward - software developer                    gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913

    Greg> BTW, is there anything like this on the Mac?

There will be, once Jack Jansen contributes _spawn_mac... ;-)

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/~skip/
847-971-7098   | Python: Programming the way Guido indented...

[Greg Ward]
Note that win/tclWinPipe.c -- which contains the Windows-specific support for Tcl's "exec" cmd -- is about 3,200 lines of C. It does handle pipelines and redirection, and even fakes pipes as needed with temp files when it can identify a pipeline component as belonging to the 16-bit subsystem. Even so, the Tcl help page for "exec" bristles with hilarious caveats under the Windows subsection; e.g.,

    When redirecting from NUL:, some applications may hang, others will get an infinite stream of "0x01" bytes, and some will actually correctly get an immediate end-of-file; the behavior seems to depend upon something compiled into the application itself.

    When redirecting greater than 4K or so to NUL:, some applications will hang. The above problems do not happen with 32-bit applications.

Still, people seem very happy with Tcl's exec, and I'm certain no language tries harder to provide a portable way to "do command lines". Two points to that:

1) If Python ever wants to do something similar, let's steal the Tcl code (& unlike stealing Perl's code, stealing Tcl's code actually looks possible -- it's very much better organized and written).

2) For all its heroic efforts to hide platform limitations,

    int
    Tcl_ExecObjCmd(dummy, interp, objc, objv)
        ClientData dummy;           /* Not used. */
        Tcl_Interp *interp;         /* Current interpreter. */
        int objc;                   /* Number of arguments. */
        Tcl_Obj *CONST objv[];      /* Argument objects. */
    {
    #ifdef MAC_TCL
        Tcl_AppendResult(interp, "exec not implemented under Mac OS",
            (char *)NULL);
        return TCL_ERROR;
    #else
    ...

a-generalized-spawn-is-a-good-start-ly y'rs - tim

On 30 August 1999, To python-dev@python.org said:
> The module as it's currently in the Distutils code is attached. Let me know what you think...
New definition of "attached": I'll just reply to my own message with the code I meant to attach. D'oh!

------------------------------------------------------------------------
"""distutils.spawn

Provides the 'spawn()' function, a front-end to various platform-
specific functions for launching another program in a sub-process."""

# created 1999/07/24, Greg Ward

__rcsid__ = "$Id: spawn.py,v 1.2 1999/08/29 18:20:56 gward Exp $"

import sys, os, string
from distutils.errors import *


def spawn (cmd, search_path=1, verbose=0, dry_run=0):

    """Run another program, specified as a command list 'cmd', in a new
       process.  'cmd' is just the argument list for the new process, ie.
       cmd[0] is the program to run and cmd[1:] are the rest of its
       arguments.  There is no way to run a program with a name different
       from that of its executable.

       If 'search_path' is true (the default), the system's executable
       search path will be used to find the program; otherwise, cmd[0] must
       be the exact path to the executable.  If 'verbose' is true, a
       one-line summary of the command will be printed before it is run.
       If 'dry_run' is true, the command will not actually be run.

       Raise DistutilsExecError if running the program fails in any way;
       just return on success."""

    if os.name == 'posix':
        _spawn_posix (cmd, search_path, verbose, dry_run)
    elif os.name in ( 'nt', 'windows' ):    # ???
        _spawn_nt (cmd, search_path, verbose, dry_run)
    else:
        raise DistutilsPlatformError, \
              "don't know how to spawn programs on platform '%s'" % os.name

# spawn ()


def _spawn_nt ( cmd, search_path=1, verbose=0, dry_run=0):
    import string
    executable = cmd[0]
    if search_path:
        paths = string.split( os.environ['PATH'], os.pathsep)
        base,ext = os.path.splitext(executable)
        if (ext != '.exe'):
            executable = executable + '.exe'
        if not os.path.isfile(executable):
            paths.reverse()         # go over the paths and keep the last one
            for p in paths:
                f = os.path.join( p, executable )
                if os.path.isfile ( f ):
                    # the file exists, we have a shot at spawn working
                    executable = f
    if verbose:
        print string.join ( [executable] + cmd[1:], ' ')
    if not dry_run:
        # spawn for NT requires a full path to the .exe
        rc = os.spawnv (os.P_WAIT, executable, cmd)
        if rc != 0:
            raise DistutilsExecError("command failed: %d" % rc)


def _spawn_posix (cmd, search_path=1, verbose=0, dry_run=0):

    if verbose:
        print string.join (cmd, ' ')
    if dry_run:
        return
    exec_fn = search_path and os.execvp or os.execv

    pid = os.fork ()

    if pid == 0:                        # in the child
        try:
            #print "cmd[0] =", cmd[0]
            #print "cmd =", cmd
            exec_fn (cmd[0], cmd)
        except OSError, e:
            sys.stderr.write ("unable to execute %s: %s\n" %
                              (cmd[0], e.strerror))
            os._exit (1)

        sys.stderr.write ("unable to execute %s for unknown reasons" % cmd[0])
        os._exit (1)

    else:                               # in the parent
        # Loop until the child either exits or is terminated by a signal
        # (ie. keep waiting if it's merely stopped)
        while 1:
            (pid, status) = os.waitpid (pid, 0)
            if os.WIFSIGNALED (status):
                raise DistutilsExecError, \
                      "command %s terminated by signal %d" % \
                      (cmd[0], os.WTERMSIG (status))

            elif os.WIFEXITED (status):
                exit_status = os.WEXITSTATUS (status)
                if exit_status == 0:
                    return              # hey, it succeeded!
                else:
                    raise DistutilsExecError, \
                          "command %s failed with exit status %d" % \
                          (cmd[0], exit_status)

            elif os.WIFSTOPPED (status):
                continue

            else:
                raise DistutilsExecError, \
                      "unknown error executing %s: termination status %d" % \
                      (cmd[0], status)
# _spawn_posix ()
------------------------------------------------------------------------

-- 
Greg Ward - software developer                    gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913
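For readers more used to current syntax, the fork, exec and wait "dance" in '_spawn_posix()' above can be sketched roughly as follows in Python 3. This is an illustrative rendering under assumptions, not the Distutils code: `ExecError` and `spawn_posix` are hypothetical names standing in for DistutilsExecError and '_spawn_posix()', the 'verbose' and 'dry_run' flags are left out, and it only runs on POSIX systems where `os.fork` exists.

```python
import os
import sys


class ExecError(Exception):
    """Hypothetical stand-in for DistutilsExecError."""


def spawn_posix(cmd, search_path=True):
    """Fork, exec 'cmd' in the child, and wait for it in the parent.

    Returns None on success; raises ExecError otherwise.
    """
    exec_fn = os.execvp if search_path else os.execv
    pid = os.fork()

    if pid == 0:  # in the child
        try:
            exec_fn(cmd[0], cmd)  # does not return if it succeeds
        except OSError as e:
            sys.stderr.write("unable to execute %s: %s\n" % (cmd[0], e.strerror))
            sys.stderr.flush()
        finally:
            # Reached only if exec failed: never fall back into the
            # parent's code path from inside the child.
            os._exit(1)

    # In the parent: loop until the child exits or is killed by a signal
    # (keep waiting if it is merely stopped).
    while True:
        _, status = os.waitpid(pid, 0)
        if os.WIFSIGNALED(status):
            raise ExecError("command %s terminated by signal %d"
                            % (cmd[0], os.WTERMSIG(status)))
        if os.WIFEXITED(status):
            exit_status = os.WEXITSTATUS(status)
            if exit_status == 0:
                return
            raise ExecError("command %s failed with exit status %d"
                            % (cmd[0], exit_status))
        if os.WIFSTOPPED(status):
            continue
        raise ExecError("unknown error executing %s: termination status %d"
                        % (cmd[0], status))


if __name__ == "__main__":
    spawn_posix([sys.executable, "-c", "print('hello from the child')"])
```

Putting `os._exit(1)` in a `finally` clause is a choice made for this sketch: it guarantees that a child whose exec call fails for any reason terminates instead of continuing to run a copy of the parent's program.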

I'm not sure that the verbose and dry_run options belong in the standard library. When both are given, this does something semi-useful; for Posix that's basically just printing the arguments, while for NT it prints the exact command that will be executed. Not sure if that's significant though.

Perhaps it's better to extract the code that runs the path to find the right executable and make that into a separate routine. (Also, rather than reversing the path, I would break out of the loop at the first hit.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
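The separate path-searching routine suggested here might look something like the sketch below, written in current Python. The name `find_executable` and its signature are assumptions made for illustration; it deliberately does not append '.exe' the way '_spawn_nt()' does, and it returns at the first hit instead of reversing the path. (Modern Python also ships `shutil.which` for a similar job.)

```python
import os


def find_executable(name, path=None):
    """Return the first file called 'name' found on the search path.

    'path' is a string of directories separated by os.pathsep and
    defaults to the PATH environment variable. Returns None if nothing
    is found. (Hypothetical helper, for illustration only.)
    """
    if path is None:
        path = os.environ.get("PATH", os.defpath)

    # A name that already has a directory part is not searched for.
    if os.path.dirname(name):
        return name if os.path.isfile(name) else None

    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate  # stop at the first hit
    return None


if __name__ == "__main__":
    print(find_executable("python3"))
```

Stopping at the first match gives the same precedence a shell uses: earlier PATH entries win over later ones.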

Greg Ward <gward@cnri.reston.va.us> wrote:
any reason this couldn't go into the os module instead? just add parts of it to os.py, and change the docs to say that spawn* are supported on Windows and Unix... (supporting the full set of spawn* primitives would of course be nice, btw. just like os.py provides all exec variants...) </F>
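One way the "full set of spawn* primitives" could be layered over fork and exec on Unix is sketched below, in current Python. Everything here is an assumption made for illustration: the helper name `_spawnvef`, the numeric values of `P_WAIT` and `P_NOWAIT`, the exit code 127 for a failed exec, and the return convention (exit status on normal exit, negative signal number if the child was killed). It only runs where `os.fork` is available.

```python
import os
import sys

# Mode constants named after the C runtime's spawn modes; the values
# used here are assumptions of this sketch.
P_WAIT = 0
P_NOWAIT = 1


def _spawnvef(mode, file, args, env, func):
    """Fork, call the exec variant 'func' in the child, then wait or return."""
    pid = os.fork()
    if pid == 0:  # in the child
        try:
            if env is None:
                func(file, args)
            else:
                func(file, args, env)
        finally:
            os._exit(127)  # only reached if the exec call failed

    if mode == P_NOWAIT:
        return pid  # the caller is responsible for waiting on the child

    while True:
        _, status = os.waitpid(pid, 0)
        if os.WIFSTOPPED(status):
            continue
        if os.WIFSIGNALED(status):
            return -os.WTERMSIG(status)
        if os.WIFEXITED(status):
            return os.WEXITSTATUS(status)
        raise OSError("unexpected wait status %d" % status)


def spawnv(mode, file, args):
    return _spawnvef(mode, file, args, None, os.execv)


def spawnve(mode, file, args, env):
    return _spawnvef(mode, file, args, env, os.execve)


def spawnvp(mode, file, args):
    return _spawnvef(mode, file, args, None, os.execvp)


def spawnvpe(mode, file, args, env):
    return _spawnvef(mode, file, args, env, os.execvpe)


if __name__ == "__main__":
    rc = spawnv(P_WAIT, sys.executable, [sys.executable, "-c", "print('spawned')"])
    print("exit status:", rc)
```

Each public function differs only in which exec variant it hands to the shared helper, which mirrors how the exec family itself varies (path search or not, explicit environment or not).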
