Re: subprocess - updated popen5 module

[From Mail to python-announce list]
I'd like to give you an updated status report of the popen5 project. Since my last post at 2004-06-04, this has happened: ... With these changes, the subprocess module now feels quite complete. In the near future, I will focus on getting the module included in the standard library.
I've received very positive feedback on my module. Many users are also asking me if this module will be included in the standard library. That is, of course, my wish as well. So, can the subprocess module be accepted? If not, what needs to be done? /Peter Åstrand <astrand@lysator.liu.se>

Peter Astrand <astrand@lysator.liu.se> writes:
[From Mail to python-announce list]
I'd like to give you an updated status report of the popen5 project. Since my last post at 2004-06-04, this has happened: ... With these changes, the subprocess module now feels quite complete. In the near future, I will focus on getting the module included in the standard library.
I've received very positive feedback on my module. Many users are also asking me if this module will be included in the standard library. That is, of course, my wish as well.
So, can the subprocess module be accepted? If not, what needs to be done?
I would suggest to replace the _subprocess extension module by a Python implementation based on ctypes (and include ctypes in 2.4, at least for Windows). How many _xxxx.pyd windows specific modules do we need? Thomas

Thomas Heller wrote:
How many _xxxx.pyd windows specific modules do we need?
as many as it takes to make Python an excellent tool. it's not like one more entry in the makefile, and 20k in the binary distribution will cause problem for anyone. if you want to stop contributions to Python's standard library unless they use your tools, you have a serious problem. </F>

"Fredrik Lundh" <fredrik@pythonware.com> writes:
Thomas Heller wrote:
How many _xxxx.pyd windows specific modules do we need?
as many as it takes to make Python an excellent tool. it's not like one more entry in the makefile, and 20k in the binary distribution will cause problem for anyone.
if you want to stop contributions to Python's standard library unless they use your tools, you have a serious problem.
Sorry if you misunderstood me, I didn't want to stop this contribution. Here is what I wrote, again:
I would suggest to replace the _subprocess extension module by a Python implementation based on ctypes (and include ctypes in 2.4, at least for Windows).
Thomas

Thomas Heller wrote:
if you want to stop contributions to Python's standard library unless they use your tools, you have a serious problem.
Sorry if you misunderstood me, I didn't want to stop this contribution. Here is what I wrote, again:
I would suggest to replace the _subprocess extension module by a Python implementation based on ctypes (and include ctypes in 2.4, at least for Windows).
if you're suggesting a rewrite for political reasons, you're effectively blocking this contribution. don't do that; python-dev have enough problems getting non- core contributions as it is. </F>

astrand@lysator.liu.se said:
I've received very positive feedback on my module. Many users are also asking me if this module will be included in the standard library. That is, of course, my wish as well.
So, can the subprocess module be accepted? If not, what needs to be done?
I've been watching the progression of subprocess with some interest. It looks encouraging, and is exactly the sort of thing I need for my work. One small nit I've noticed: aren't the names of subprocess.call() and subprocess.callv() reversed? If you look at unix execl() and execv(), execl() takes a variable-length argument list and execv() takes a list (vector?) of arguments. But it's the opposite for subprocess -- callv() takes a variable-length arg list and call() takes a list of args. Am I missing something? Can these be renamed now before it gets standardized? Jason
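
The l/v convention Jason refers to is visible in Python's own os module as well: the "l" variants take the arguments inline, and the "v" variants take one sequence. A minimal sketch using the spawn family (exec would replace the running interpreter, so spawn is used here instead); the child command is only an illustrative choice:

```python
import os
import sys

# "l" variant: arguments are passed inline, one parameter each.
# The first argument after the path becomes the child's argv[0].
status_l = os.spawnl(os.P_WAIT, sys.executable, sys.executable, "-c", "pass")

# "v" variant: the same arguments are passed as a single sequence (a vector).
argv = [sys.executable, "-c", "pass"]
status_v = os.spawnv(os.P_WAIT, sys.executable, argv)

print(status_l, status_v)
```

By that convention, a function taking `("ls", "-l")` inline would be the "l" form and one taking `["ls", "-l"]` would be the "v" form, which is the opposite of how call() and callv() are named.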

I've been watching the progression of subprocess with some interest. It looks encouraging, and is exactly the sort of thing I need for my work.
One small nit I've noticed: aren't the names of subprocess.call() and subprocess.callv() reversed? If you look at unix execl() and execv(), execl() takes a variable-length argument list and execv() takes a list (vector?) of arguments. But it's the opposite for subprocess -- callv() takes a variable-length arg list and call() takes a list of args.
Oh. Yes, you are right.
Am I missing something? Can these be renamed now before it gets standardized?
I'd prefer not to rename the call() function. The name is short and simple, and the function is very much used. I'm positive to renaming the callv() function, though. One obvious name would be "calll", but that's quite ugly. How about "lcall"? Then we can keep the "callv" name for backwards compatibility. Or, we could just keep the "callv" name, and pretend that "v" stands for "variable number of arguments". /Peter Åstrand <astrand@lysator.liu.se>

Peter Astrand wrote:
Am I missing something? Can these be renamed now before it gets standardized?
I'd prefer not to rename the call() function. The name is short and simple, and the function is very much used. I'm positive to renaming the callv() function, though. One obvious name would be "calll", but that's quite ugly. How about "lcall"? Then we can keep the "callv" name for backwards compatibility.
do we need both? you could rename "callv" to "call", and let people type an extra "*" if they want to pass in a list of arguments:
    subprocess.call("stty", "sane", "-F", device)
    subprocess.call(*["stty", "sane", "-F", device])
or, more likely:
    args = ["somecommand"]
    # several lines of code to add options and parameters
    # to the args list
    subprocess.call(*args)
Or, we could just keep the "callv" name, and pretend that "v" stands for "variable number of arguments".
I have no problem with that... </F>

On Sat, 9 Oct 2004, Fredrik Lundh wrote:
I'd prefer not to rename the call() function. The name is short and simple, and the function is very much used. I'm positive to renaming the callv() function, though. One obvious name would be "calll", but that's quite ugly. How about "lcall"? Then we can keep the "callv" name for backwards compatibility.
do we need both? you could rename "callv" to "call", and let people type an extra "*" if they want to pass in a list of arguments:
    subprocess.call("stty", "sane", "-F", device)
    subprocess.call(*["stty", "sane", "-F", device])
I'd like "call" to stay as it is. Its interface is pretty clean, and maps directly to the Popen constructor. "callv" is only syntactic sugar. It's for people who think that:
    subprocess.call(["ls", "-l", "/foo/bar"])
...is too ugly compared to:
    os.system("ls -l /foo/bar")
With callv, it is:
    subprocess.callv("ls", "-l", "/foo/bar")
...which is slightly nicer. The drawback with callv is that it does not allow specifying the program and its arguments as a whitespace-separated string: the entire (first) string would be interpreted as the executable. So, you cannot do:
    subprocess.callv("somewindowsprog.exe some strange command line")
because then the system would try to execute a program called "somewindowsprog.exe some strange command line", which doesn't exist. You cannot do this either:
    subprocess.callv("somewindowsprog.exe", "some", "strange", "command", "line")
...if somewindowsprog.exe doesn't use the MS C runtime argument rules.
To summarize: call() must stay as it is for completeness. callv() is just syntactic sugar, but probably deserves to stay as well.
/Peter Åstrand <astrand@lysator.liu.se>

astrand@lysator.liu.se said:
...which is slightly nicer. The drawback with callv is that it does not allow specifying the program and its arguments as a whitespace-separated string: the entire (first) string would be interpreted as the executable. So, you cannot do:
subprocess.callv("somewindowsprog.exe some strange command line")
because then the system would try to execute a program called "somewindowsprog.exe some strange command line", which doesn't exist. You cannot do this either:
subprocess.callv("somewindowsprog.exe", "some", "strange", "command", "line")
...if somewindowsprog.exe doesn't use the MS C runtime argument rules.
I'm not sure I understand what the MSC runtime has to do with the naming of call/callv. Your examples don't work with call either, right? Their call() equivalents:
    subprocess.call(["somewindowsprog.exe some strange command line"])
    subprocess.call(["somewindowsprog.exe", "some", "strange", "command", "line"])
are just as broken, no?
Overall, I agree that callv() is superfluous. In my programming, I always end up using the "v" variants of exec functions, because there's always _something_ you do to the command line first, and it's easier to handle arguments as a list.
[The above paragraph makes my point: "I always use execv(), so we should drop subprocess.callv()?" The naming hurts my poor brain.]
Jason

subprocess.callv("somewindowsprog.exe", "some", "strange", "command", "line")
...if somewindowsprog.exe doesn't use the MS C runtime argument rules.
I'm not sure I understand what the MSC runtime has to do with the naming of call/callv.
In that case, my explanation wasn't good enough :) It's somewhat complicated. Most people will never have any problems with these issues, but I've taken care so that the API should support all cornercases.
Your examples don't work with call either, right?
They work with call if you use a string argument. That's the core of the problem: The callv function doesn't support passing a string-type args argument to the Popen constructor.
Their call() equivalents:
    subprocess.call(["somewindowsprog.exe some strange command line"])
    subprocess.call(["somewindowsprog.exe", "some", "strange", "command", "line"])
are just as broken, no?
Yes. You'll need to do:
    subprocess.call("somewindowsprog.exe some strange command line")
/Peter Åstrand <astrand@lysator.liu.se>

Quoting Peter Astrand <astrand@lysator.liu.se>:
Your examples don't work with call either, right?
They work with call if you use a string argument. That's the core of the problem: The callv function doesn't support passing a string-type args argument to the Popen constructor.
Their call() equivalents:
    subprocess.call(["somewindowsprog.exe some strange command line"])
    subprocess.call(["somewindowsprog.exe", "some", "strange", "command", "line"])
are just as broken, no?
Yes. You'll need to do:
subprocess.call("somewindowsprog.exe some strange command line")
Given that call is only a shortcut function, wouldn't the following suffice?:
    def call(*args, **kwds):
        if len(args) <= 1:
            return Popen(*args, **kwds)
        else:
            return Popen(args, **kwds)
With that implementation, a single string, a list, a sequence of strings and Popen keywords only would all work as arguments to call. That is:
    call("ls -l") -> Popen("ls -l")
    call("ls", "-l") -> Popen(["ls", "-l"])
    call(["ls", "-l"]) -> Popen(["ls", "-l"])
    call(args="ls -l") -> Popen(args="ls -l")
All it would mean is that if you want to use the optional arguments to Popen, you either don't use call, or you use keywords.
Cheers, Nick.
-- Nick Coghlan Brisbane, Australia
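
The dispatch Nick describes can be checked without starting any process. The sketch below is a hypothetical helper (the name `normalize_call_args` is not from the thread) that returns what his proposed call() would forward to Popen. It converts the multi-argument case to a list so the result matches his table, whereas his code as written would pass a tuple:

```python
def normalize_call_args(*args, **kwds):
    """Return the (positional, keyword) arguments the proposed call() would
    hand to Popen. Hypothetical helper for illustration; it spawns nothing.
    """
    if len(args) <= 1:
        # Zero or one positional argument: pass it through unchanged,
        # so a bare string stays a string and a list stays a list.
        return args, kwds
    # Several positional arguments: treat them as one argument vector.
    return (list(args),), kwds


print(normalize_call_args("ls -l"))
print(normalize_call_args("ls", "-l"))
print(normalize_call_args(["ls", "-l"]))
print(normalize_call_args(args="ls -l"))
```

The cost Nick mentions shows up here too: once extra positional arguments are read as part of the command, Popen options such as bufsize can only be given by keyword.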

astrand@lysator.liu.se said:
They work with call if you use a string argument. That's the core of the problem: The callv function doesn't support passing a string-type args argument to the Popen constructor.
OK, I read the code again and I see what you mean. So yes, this argues even more against the existence of callv(). I would have expected that it would always be possible to translate a call() invocation to its callv() equivalent and vice-versa. That turns out not to be true in the case of call() users who want MSC-style command-line parsing done on a bare string (whether by CreateProcess() on windows or cmdline2list() on unix), because the _execute_child() implementations need to know whether args was originally a string or a list, and this information is hidden by callv()'s list encapsulation of the args.
This all makes me think there could be a better approach to cross-platform handling of command-line arguments. When is anyone ever going to want the braindamaged MS cmdline expansion done while they're trying to invoke a subprocess on unix? I don't see that getting used a lot.
I think I need to understand better the division of labor between parent and child on Windows when it comes to passing the command line during process creation. I'd like to think that unix execv() is a superset of the interface offered by CreateProcess(), in that you can initialize your child's argv however you like without regard to whitespace or quoting. So it would be best if it were possible to offer the execv() interface on both platforms if that's possible.
Jason
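
The point about callv() hiding the original type of args can be shown with stand-in functions. This is only a sketch: `describe_args` is a hypothetical substitute for the string-versus-list check that _execute_child() makes, and nothing is executed:

```python
def describe_args(args):
    """Hypothetical stand-in for the check _execute_child() needs to make."""
    if isinstance(args, str):
        return "command line string"
    return "argument list of length %d" % len(args)


def call(args):
    # call() forwards args untouched, so the string/list distinction survives.
    return describe_args(args)


def callv(*args):
    # callv() always wraps its parameters in a sequence first, so a bare
    # string can never reach the lower layer as a string.
    return describe_args(list(args))


print(call("prog.exe some strange command line"))
print(callv("prog.exe some strange command line"))
print(callv("prog.exe", "some", "strange", "command", "line"))
```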

Jason Lunz <lunz@falooley.org> writes:
I think I need to understand better the division of labor between parent and child on Windows when it comes to passing the command line during process creation.
It's simple. The parent passes a *command line*, and the child retrieves it using the GetCommandLine API. There is no concept of argv at the Windows level - it is implemented by the C runtime parsing the return value of GetCommandLine() and passing the resulting arguments to main(). Hence on Windows, a command line is the fundamental unit, whereas on Unix an argument list is fundamental. The biggest problem on Windows is that not all executables use the Microsoft C runtime. Some use other C runtimes, others parse the command line directly and don't use argv at all.
I'd like to think that unix execv() is a superset of the interface offered by CreateProcess(), in that you can initialize your child's argv however you like without regard to whitespace or quoting.
It would be nice to think that, but it ain't true <wink>.
So it would be best if it were possible to offer the execv() interface on both platforms if that's possible.
The unix execv is just *different*. Both the Windows and the Unix interfaces have capabilities the other doesn't offer. And as a result, it's not possible to offer an execv interface on Windows, at least not without running the risk of garbling some commands. Luckily, the oddball cases are rare. So 99% of the time, either interface will do OK (os.system on Unix, os.spawnv on Windows). But the 1% can be crucial - shell-exploit style security holes on Unix, garbled commands on Windows. I think Peter's approach of supporting both forms - a single string as a command line, and list of strings as an argv list, and converting both to the more natural OS-native form as needed, is sensible (I would, I argued for it when he was developing it!) Paul -- Ooh, how Gothic. Barring the milk.
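
The list-to-command-line conversion Paul mentions is still present in today's subprocess module as list2cmdline(), which follows the MS C runtime quoting rules. A small sketch, with made-up argument values, of what a child built on that runtime would be handed:

```python
import subprocess

# An argv-style list in which one argument contains a space.
argv = ["prog.exe", "two words", "plain"]

# Build the single command-line string that Windows would pass to the child.
command_line = subprocess.list2cmdline(argv)
print(command_line)  # prog.exe "two words" plain
```

A child that parses its command line by some other rule may split this string differently, which is the 1% case described above.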

pf_moore@yahoo.co.uk said:
Hence on Windows, a command line is the fundamental unit, whereas on Unix an argument list is fundamental.
Yes, you're right. I read up on CreateProcess(), GetCommandLine(), and CommandLineToArgvW() after posting. Interestingly, MS's sample code for CommandLineToArgvW is buggy because of confusion between the two interfaces. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/bas... Also, it can fail.
The biggest problem on Windows is that not all executables use the Microsoft C runtime. Some use other C runtimes, others parse the command line directly and don't use argv at all.
So why does subprocess use cmdline2list() in the parent on unix to emulate the way a child subprocess might parse the string on windows? (But only if it's written in C, uses the MSC runtime, and uses the argv/argc handed to main() rather than calling GetCommandLine() itself). Why not emulate CommandLineToArgvW()? or something else entirely? I think it would be cleaner not to emulate anything at all.
The unix execv is just *different*. Both the Windows and the Unix interfaces have capabilities the other doesn't offer.
Well, the windows interface is a subset of the unix one. The length of argv on windows is limited to 1.
I think Peter's approach of supporting both forms - a single string as a command line, and list of strings as an argv list, and converting both to the more natural OS-native form as needed, is sensible (I would, I argued for it when he was developing it!)
I can see that it's trying to be symmetric and orthogonal, but I don't think that the result is worth it in this case. In what scenario is the use of cmdline2list() really useful? Jason

On Sun, 10 Oct 2004, Jason Lunz wrote:
The biggest problem on Windows is that not all executables use the Microsoft C runtime. Some use other C runtimes, others parse the command line directly and don't use argv at all.
So why does subprocess use cmdline2list() in the parent on unix to emulate the way a child subprocess might parse the string on windows? (But only if it's written in C, uses the MSC runtime, and uses the argv/argc handed to main() rather than calling GetCommandLine() itself). Why not emulate CommandLineToArgvW()? or something else entirely? I think it would be cleaner not to emulate anything at all.
One goal with subprocess is being able to write cross-platform applications. For example, it should be possible to open up www.python.org in Mozilla. The best way to write this is:
    subprocess.call(["mozilla", "http://www.python.org"])
In this case, the list form is translated to the string form when running on Windows. Why allow the string form on UNIX? Answer: symmetry, plus that some users who have been using os.system() for a long time think it's nice to be able to do:
    subprocess.call("ls -l /foo/bar")
There's a risk that UNIX users might expect UNIX shell-like quoting support rather than the MSC one, though.
The unix execv is just *different*. Both the Windows and the Unix interfaces have capabilities the other doesn't offer.
Well, the windows interface is a subset of the unix one. The length of argv on windows is limited to 1.
True, if we are talking about the UNIX exec functions. When executing through a UNIX shell, the native interface is a string.
I think Peter's approach of supporting both forms - a single string as a command line, and list of strings as an argv list, and converting both to the more natural OS-native form as needed, is sensible (I would, I argued for it when he was developing it!)
I can see that it's trying to be symmetric and orthogonal, but I don't think that the result is worth it in this case. In what scenario is the use of cmdline2list() really useful?
I don't really have a good example. If we should remove the cmdline2list stuff, what should happen if the user passes a string on UNIX? Do you prefer:
1) Run through the shell (automatic shell=True). or
2) A ValueError raised.
I guess alternative 1 is most intuitive. That would line up with popen2 as well. Anyone objecting to this change?
/Peter Åstrand <astrand@lysator.liu.se>

astrand@lysator.liu.se said:
One goal with subprocess is being able to write cross-platform applications. For example, it should be possible to open up www.python.org in Mozilla. The best way to write this is:
subprocess.call(["mozilla", "http://www.python.org"])
In this case, the list form is translated to the string form when running on Windows.
understood. I personally have a definite need for this in my programs, so I know what's involved.
There's a risk that UNIX users might expect UNIX shell-like quoting support rather than the MSC one, though.
exactly. I just see it confusing people. If someone wants simple string handling on unix, then shell=True. problem solved. If they need the type of portability referred to with your mozilla example, then they should use the list form and let windows do list2cmdline() on it. The converse case of using a string for windows and using cmdline2list() on it for unix is a good try, but there's no guarantee that the actual windows subprocess will even do the MSVCRT-compatible-cmdline2list() conversion. Which leaves you in a very confusing situation indeed.
I can see that it's trying to be symmetric and orthogonal, but I don't think that the result is worth it in this case. In what scenario is the use of cmdline2list() really useful?
I don't really have a good example.
If we should remove the cmdline2list stuff, what should happen if the user passes a string on UNIX? Do you prefer:
1) Run through the shell (automatic shell=True). or 2) A ValueError raised.
I guess alternative 1 is most intuitive. That would line up with popen2 as well.
Use of the shell should be explicit, not automatic, because of the usual shell metacharacter security concerns. unix programmers used to doing os.system('ls -l') will quickly learn that the subprocess way of doing the same is subprocess.call('ls -l', shell=True). This has the added benefit of making it obvious exactly what's happening.
I don't think that the only alternative to number 1) is to raise a ValueError. What do you think of the below patch? Just listify bare strings on unix. This does exactly what it should when the string actually references a binary, and gives a meaningful error when it doesn't. Even if the filename has strange characters of some kind.
    $ echo '#! /bin/sh' > 'ls -l'
    $ echo 'echo foo' >> 'ls -l'
    $ cat 'ls -l'
    #! /bin/sh
    echo foo
    $ python
    Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
    >>> from subprocess import call
    >>> call('./ls -l')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "subprocess.py", line 441, in call
        return Popen(*args, **kwargs).wait()
      File "subprocess.py", line 524, in __init__
        errread, errwrite)
      File "subprocess.py", line 942, in _execute_child
        raise child_exception
    OSError: [Errno 13] Permission denied
    $ chmod +x 'ls -l'
    $ python
    Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
    >>> from subprocess import call
    >>> call('./ls -l')
    foo
    0
    >>> call('./ls -blah')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "subprocess.py", line 441, in call
        return Popen(*args, **kwargs).wait()
      File "subprocess.py", line 524, in __init__
        errread, errwrite)
      File "subprocess.py", line 942, in _execute_child
        raise child_exception
    OSError: [Errno 2] No such file or directory
Nice, straightforward, easy to understand. And look how much code is removed -- I haven't even cleaned up the corresponding docs yet.
Index: subprocess.py
===================================================================
RCS file: /cvsroot/python-popen5/popen5/subprocess.py,v
retrieving revision 1.15
diff -u -p -r1.15 subprocess.py
--- subprocess.py 9 Oct 2004 10:11:06 -0000 1.15
+++ subprocess.py 11 Oct 2004 14:56:26 -0000
@@ -854,11 +854,11 @@ class Popen:
                        errread, errwrite):
             """Execute program (POSIX version)"""
 
+            if isinstance(args, types.StringTypes):
+                args = [args]
+
             if shell:
                 args = ["/bin/sh", "-c"] + args
-            else:
-                if isinstance(args, types.StringTypes):
-                    args = self.cmdline2list(args)
 
             if executable == None:
                 executable = args[0]
@@ -1051,93 +1051,6 @@ class Popen:
             self.wait()
             return (stdout, stderr)
 
-
-
-    def cmdline2list(cmdline):
-        """
-        Translate a command line string into a list of arguments, using
-        using the same rules as the MS C runtime:
-
-        1) Arguments are delimited by white space, which is either a
-           space or a tab.
-
-        2) A string surrounded by double quotation marks is
-           interpreted as a single argument, regardless of white space
-           contained within. A quoted string can be embedded in an
-           argument.
-
-        3) A double quotation mark preceded by a backslash is
-           interpreted as a literal double quotation mark.
-
-        4) Backslashes are interpreted literally, unless they
-           immediately precede a double quotation mark.
-
-        5) If backslashes immediately precede a double quotation mark,
-           every pair of backslashes is interpreted as a literal
-           backslash. If the number of backslashes is odd, the last
-           backslash escapes the next double quotation mark as
-           described in rule 3.
-        """
-
-        # See
-        # http://msdn.microsoft.com/library/en-us/vccelng/htm/progs_12.asp
-
-        # Step 1: Translate all literal quotes into QUOTE. Justify number
-        # of backspaces before quotes.
-        tokens = []
-        bs_buf = ""
-        QUOTE = 1 # \", literal quote
-        for c in cmdline:
-            if c == '\\':
-                bs_buf += c
-            elif c == '"' and bs_buf:
-                # A quote preceded by some number of backslashes.
-                num_bs = len(bs_buf)
-                tokens.extend(["\\"] * (num_bs//2))
-                bs_buf = ""
-                if num_bs % 2:
-                    # Odd. Quote should be placed literally in array
-                    tokens.append(QUOTE)
-                else:
-                    # Even. This quote serves as a string delimiter
-                    tokens.append('"')
-
-            else:
-                # Normal character (or quote without any preceding
-                # backslashes)
-                if bs_buf:
-                    # We have backspaces in buffer. Output these.
-                    tokens.extend(list(bs_buf))
-                    bs_buf = ""
-
-                tokens.append(c)
-
-        # Step 2: split into arguments
-        result = [] # Array of strings
-        quoted = False
-        arg = [] # Current argument
-        tokens.append(" ")
-        for c in tokens:
-            if c == '"':
-                # Toggle quote status
-                quoted = not quoted
-            elif c == QUOTE:
-                arg.append('"')
-            elif c in (' ', '\t'):
-                if quoted:
-                    arg.append(c)
-                else:
-                    # End of argument. Output, if anything.
-                    if arg:
-                        result.append(''.join(arg))
-                        arg = []
-            else:
-                # Normal character
-                arg.append(c)
-
-        return result
-
-    cmdline2list = staticmethod(cmdline2list)
 
 
     def list2cmdline(seq):
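
The argument handling that remains after the patch is small enough to restate on its own. The sketch below mirrors the added lines as a standalone function. The name `posix_args` is hypothetical, and `str` stands in for the Python 2 `types.StringTypes` check:

```python
def posix_args(args, shell=False):
    """Mirror of the patched POSIX argument handling (illustrative only)."""
    if isinstance(args, str):
        # A bare string becomes a one-element argument list: the whole
        # string is the program name, with no splitting or quoting rules.
        args = [args]
    if shell:
        # Shell use is explicit: the command is handed to /bin/sh -c.
        args = ["/bin/sh", "-c"] + list(args)
    return args


print(posix_args("./ls -l"))
print(posix_args("ls -l", shell=True))
print(posix_args(["ls", "-l"]))
```

This matches the session above: './ls -l' is looked up as a single file name, spaces included.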

On Mon, 11 Oct 2004, Jason Lunz wrote:
If we should remove the cmdline2list stuff, what should happen if the user passes a string on UNIX? Do you prefer:
1) Run through the shell (automatic shell=True). or 2) A ValueError raised.
I guess alternative 1 is most intuitive. That would line up with popen2 as well.
Use of the shell should be explicit, not automatic, because of the usual shell metacharacter security concerns. unix programmers used to doing os.system('ls -l') will quickly learn that the subprocess way of doing the same is subprocess.call('ls -l', shell=True). This has the added benefit of making it obvious exactly what's happening.
Good points.
I don't think that the only alternative to number 1) is to raise a ValueError.
What do you think of the below patch? Just listify bare strings on unix. This does exactly what it should when the string actually references a binary, and gives a meaningful error when it doesn't. Even if the filename has strange characters of some kind.
I like it. I've submitted your patch (but with documentation updates as well). /Peter Åstrand <astrand@lysator.liu.se>
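
The metacharacter concern behind keeping the shell explicit can be seen with the module as it exists in current Python. In this sketch the child is simply the running interpreter printing its own arguments, and the "hostile" value is made up. Passed in list form with no shell, the semicolon has no special meaning:

```python
import subprocess
import sys

# A child that just reports the arguments it received.
CHILD = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]


def child_sees(extra_args):
    """Run the child with extra_args in list form (no shell) and return its report."""
    result = subprocess.run(CHILD + list(extra_args),
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


# With a shell, "; echo injected" would start a second command.
# In list form it arrives as part of one ordinary argument.
hostile = "file.txt; echo injected"
print(child_sees([hostile]))
```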

astrand@lysator.liu.se said:
I'd prefer not to rename the call() function. The name is short and simple, and the function is very much used.
That's understandable. Though if people are going to go through the pain of changing it, it's better that it happen before it becomes a standard part of python.
I'm positive to renaming the callv() function, though. One obvious name would be "calll", but that's quite ugly. How about "lcall"? Then we can keep the "callv" name for backwards compatibility.
How recently was callv added? I'd prefer not to have a callv at all than to have a call/callv pair that don't map naturally to execl/execv.
Or, we could just keep the "callv" name, and pretend that "v" stands for "variable number of arguments".
I really don't want to do this. I can tell already I'll be forever forgetting which one I need, and probably anyone else with C/unix experience will be in the same boat. It's the kind of irritant I'd like to wipe out now while there's still the opportunity. Jason

On Sat, 9 Oct 2004, Jason Lunz wrote:
I'm positive to renaming the callv() function, though. One obvious name would be "calll", but that's quite ugly. How about "lcall"? Then we can keep the "callv" name for backwards compatibility.
How recently was callv added? I'd prefer not to have a callv at all than to have a call/callv pair that don't map naturally to execl/execv.
callv has been around even longer than call actually, although callv was earlier called "run".
Or, we could just keep the "callv" name, and pretend that "v" stands for "variable number of arguments".
I really don't want to do this. I can tell already I'll be forever forgetting which one I need, and probably anyone else with C/unix experience will be in the same boat. It's the kind of irritant I'd like to wipe out now while there's still the opportunity.
I don't have a very strong opinion about callv, so if the general opinion wants to remove it, that's OK with me. /Peter Åstrand <astrand@lysator.liu.se>

Haven't really read the PEP/tried the module, so I can't comment on it specifically, but +1 on a proper subprocess module. Implemented something like this myself some time ago. Peter Astrand wrote:
Am I missing something? Can these be renamed now before it gets standardized?
I'd prefer not to rename the call() function. The name is short and simple, and the function is very much used. I'm positive to renaming the callv() function, though. One obvious name would be "calll", but that's quite ugly. How about "lcall"? Then we can keep the "callv" name for backwards compatibility.
Don't think backwards compatibility is that much of an issue. Since you're renaming it subprocess (+1 on the name) old code will have to be updated anyway. -1 on function names conflicting with the exec/spawn way of naming things. Erik

Peter Astrand wrote:
I've received very positive feedback on my module. Many users are also asking me if this module will be included in the standard library. That is, of course, my wish as well.
So, can the subprocess module be accepted? If not, what needs to be done?
just check it in, already. (can os.popen and popen2 be refactored to use subprocess, btw?) </F>

I've received very positive feedback on my module. Many users are also asking me if this module will be included in the standard library. That is, of course, my wish as well.
So, can the subprocess module be accepted? If not, what needs to be done?
FWIW, I've given Peter some feedback, but in general I think this code is ready for inclusion into 2.4. I hope it's not too late. (My recommendation on the call/callv issue is to get rid of callv(), and instead let call() interpret multiple string args as such; a single string arg will be interpreted as a whole command line, per Popen(); if you want a single file arg you have to enclose it in [...].) -- --Guido van Rossum (home page: http://www.python.org/~guido/)

[snip Guido's mention of subprocess being included in Python 2.4, if at all possible] Speaking of inclusion, we seemed to have come upon a stalemate with the 'gG' format code additions. Some users (without commit access to Python CVS) have stated they would use it (and currently write their own packing and unpacking functions), but some developers (with commit access to Python CVS) have stated they would never use it. At that point, the discussion pretty much ended. I know that the request is coming in a bit late in the 2.4 development cycle, but waiting until Python 2.5 seems like a bit much, considering that the functionality is so simple it can be proven correct quite easily (if it is to be included). A pronouncement would be nice, but I understand that I am in no position to ask/beg for a (yes, we'll add it) response. <.5 wink> Thank you, - Josiah P.S. Note that everyone involved believes that binascii.from/to_long should be added (I think it was even pronounced by Guido).

[snip Guido's mention of subprocess being included in Python 2.4, if at all possible]
Speaking of inclusion, we seemed to have come upon a stalemate with the 'gG' format code additions. Some users (without commit access to Python CVS) have stated they would use it (and currently write their own packing and unpacking functions), but some developers (with commit access to Python CVS) have stated they would never use it.
At that point, the discussion pretty much ended.
I know that the request is coming in a bit late in the 2.4 development cycle, but waiting until Python 2.5 seems like a bit much, considering that the functionality is so simple it can be proven correct quite easily (if it is to be included).
A pronouncement would be nice, but I understand that I am in no position to ask/beg for a (yes, we'll add it) response. <.5 wink>
Thank you, - Josiah
P.S. Note that everyone involved believes that binascii.from/to_long should be added (I think it was even pronounced by Guido).
I'm in favor of adding the latter only. Is there a patch? -- --Guido van Rossum (home page: http://www.python.org/~guido/)

[Josiah Carlson]
Note that everyone involved believes that binascii.from/to_long should be added (I think it was even pronounced by Guido).
[Guido]
I'm in favor of adding the latter only. Is there a patch?
No. It would be easy enough, but nobody had time for it. AFAICT, not even the signatures for such functions have been fully worked out yet. Since it would be a small and self-contained addition, I wouldn't worry about adding it after the beta. I wouldn't worry much more about adding g/G after either, for that matter.

Josiah Carlson wrote:
A pronouncement would be nice, but I understand that I am in no position to ask/beg for a (yes, we'll add it) response. <.5 wink>
Well, I can declare, in public :-), that I personally won't add that feature to Python 2.4. There are so many other things to do, and I don't consider this feature (in whatever form) important, as this can be easily done in pure Python for those who need it. Regards, Martin

Martin v. Löwis wrote:
Josiah Carlson wrote:
A pronouncement would be nice, but I understand that I am in no position to ask/beg for a (yes, we'll add it) response. <.5 wink>
Well, I can declare, in public :-), that I personally won't add that feature to Python 2.4.
Scratch chin. Then I suppose it is a good thing that you aren't the only person with commit access then <full wink and a nose wiggle>.
There are so many other things to do, and I don't consider this feature (in whatever form) important, as this can be easily done in pure Python for those who need it.
The same thing can be said for some syntactic changes to Python over the years; yet that didn't stop list comprehensions, rich comparisons, iterators (generators being the language change that made them easier), etc., from changing the language. I'm not even asking to change the language, I'm offering a patch that implements a functionality desirable to a group of users by making a module more flexible. Since Tim has said that the change is so minor as to be all right to wait until after the beta, I'll keep out of everyone's hair until then. - Josiah

Josiah Carlson wrote:
Since Tim has said that the change is so minor as to be all right to wait until after the beta, I'll keep out of everyone's hair until then.
I guess this _is_ pretty minor, but please don't assume that this means I'll be letting all sorts of new things in past b1 (at least, not without a lot of complaining on my part ;) Anthony -- Anthony Baxter <anthony@interlink.com.au> It's never too late to have a happy childhood.

Josiah Carlson wrote:
There are so many other things to do, and I don't consider this feature (in whatever form) important, as this can be easily done in pure Python for those who need it.
The same thing can be said for some syntactic changes to Python over the years; yet that didn't stop list comprehensions, rich comparisons, iterators (generators being the language change that made them easier), etc., from changing the language.
Indeed. I wish some of the changes would not have been made (although the changes on your list are all fine). For those features, the potential user community is *much* larger than for the feature under discussion, and I feel I would waste my time for adding a feature that only few users ever need. If you would like to debate the size of the user community: any significantly large user community should have come up with a standard, pure-Python solution to the problem by now which would just wait for integration. This is IMO the process that should be followed for all library extensions: the library should be developed elsewhere, and wait some time for API and implementation stabilization. Only *then* does it become a candidate for inclusion into Python. Regards, Martin

For those features, the potential user community is *much* larger than for the feature under discussion, and I feel I would waste my time for adding a feature that only few users ever need.
And yet you continue the discussion? I talk about it because /I/ find the functionality useful, I believe /others/ would find it useful if they were put in similar situations as I, and I believe it adds flexibility to a module (which I think is positive).
If you would like to debate the size of the user community: any significantly large user community should have come up with a standard, pure-Python solution to the problem by now which would just wait for integration. This is IMO the process that should be followed for all library extensions: the library should be developed elsewhere, and wait some time for API and implementation stabilization. Only *then* does it become a candidate for inclusion into Python.
If every modification to Python required a community of people, who all joined together to advocate something, then nothing would ever get done. People would be spending all their time trying to gather groups of people to agree with them that functionality K is necessary. Thankfully there have been cases in the past where someone has said, "hey, this thing could really use X", some people agree, some people disagree, a sample implementation is done, and sometimes it is accepted. I'm not even saying that we should add a new module. I'm not even saying we should add a new function to a module. Heck, I'm not even asking for a new argument to a function that previously existed. I am, quite literally, asking for the equivalent of less than one bit in a flag (there are currently 22 format/endian characters, which are all orthogonal, which would require 5 bits, if they were flags). The API already exists. The framework already exists. I'm not asking for Python to interpret something that was valid before as something different now. I'm asking for the equivalent of a previously invalid flag, to become valid, in order to expose previously existing translation mechanisms, whose use can be found in databases, network protocols, encryption, etc. Try to read the above paragraph outside of the context of struct and the RFE. Does it make sense to include the change now? If every request to interpret a new flag required significant community involvement, goodness, would it take an act of Guido to get a commit done? Have a good day Martin, - Josiah

[Josiah]
I'm asking for the equivalent of a previously invalid flag, to become valid, ...
IMO, there was an early transition from asking to demanding. Negative comments by some of the top developers did not dissuade you in the least.
If every request to interpret a new flag required significant community involvement, goodness, would it take an act of Guido to get a commit done?
Yes, if a proposal doesn't win developer support, then having Guido as a champion is the best bet. If you are opinion shopping, then the best so far is Tim's comment which seems to equate to a +0. That would be sufficient only if he cares enough to review the patch, write the docs, test it, maintain it, etc. Perhaps he will, perhaps he won't. FWIW, last year I had dropped one of my own proposals (relating to a non-unicode use for encode/decode) simply based on respect for Martin's -1. For me, that was more important than the +0 to +1 comments from others, more important than my own use cases, and more important than feeling like I was right. In the end, there was an easy pure python equivalent and life went on. Looking at the struct feature request, I think it would be harmless to add it. OTOH, it is easily handled in python and will be easier still when the binascii functions get put in. Arguably, the feature is somewhat trivial and won't change anyone's life for good or ill. So, you have to ask yourself whether it is worth jamming down everyone's throats to get it in after the beta goes out. For my money, the feature request has already consumed more developer resources than it could ever save. Raymond

Looking at the struct feature request, I think it would be harmless to add it. OTOH, it is easily handled in python and will be easier still when the binascii functions get put in. Arguably, the feature is somewhat trivial and won't change anyone's life for good or ill.
It makes processing one's data a 1-pass affair rather than a 2-pass affair. Not a big deal for most, but it gets me about a 20% speedup on a few formats, and saves me from writing custom translations every time I want some nonstandard sized integer.
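The two-pass approach mentioned above might look roughly like the following. This is only a minimal sketch: the record layout (a 2-byte tag followed by a 3-byte big-endian unsigned integer) is a hypothetical example, and it uses `int.from_bytes` from modern Python 3, which was not available in the 2.4 era under discussion. Because struct has no code for a 3-byte integer, the field is first unpacked as raw bytes and then converted in a second step.

```python
import struct

# Hypothetical record layout, for illustration only:
# a 2-byte unsigned tag followed by a 3-byte big-endian unsigned integer.
RECORD = struct.Struct(">H3s")


def parse_record(data):
    # Pass 1: struct splits the record but leaves the 3-byte field as bytes.
    tag, raw = RECORD.unpack(data)
    # Pass 2: convert the odd-sized field to an integer by hand.
    value = int.from_bytes(raw, "big")
    return tag, value


tag, value = parse_record(b"\x00\x01\x01\x02\x03")
```

A dedicated format code for odd-sized integers would presumably fold the second step into the `unpack` call itself, which is the one-pass behaviour being asked for.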
So, you have to ask yourself whether it is worth jamming down everyone's throats to get it in after the beta goes out. For my money, the feature request has already consumed more developer resources than it could ever save.
Good points Raymond. If someone decides they want to add it, great, I'll even write some documentation and field sf.net bug reports and such in regards to the struct module. If not, I'll just use binascii and bite my tongue. My apologies for raising a stink. - Josiah

Josiah Carlson wrote:
For those features, the potential user community is *much* larger than for the feature under discussion, and I feel I would waste my time for adding a feature that only few users ever need.
And yet you continue the discussion?
This tends to get off-topic, but I typically respond when I'm asked a question. As you keep asking questions, I keep responding :-)
If every modification to Python required a community of people, who all joined together to advocate something, then nothing would ever get done.
No. Modifications to fix bugs don't need debating. Only new features do. I do wish that people would focus more on fixing bugs than on introducing new features.
People would be spending all their time trying to gather groups of people to agree with them that functionality K is necessary.
And that would be a good thing, at least in the specific case. Even if it is undebated that the feature (convert numbers into odd-sized byte strings) is desirable, the specific API needs careful consideration, since it cannot be changed easily after it has been added. So all these new features are a serious legacy if they later turn out to be ill-designed.
I'm not even saying that we should add a new module. I'm not even saying we should add a new function to a module. Heck, I'm not even asking for a new argument to a function that previously existed. I am, quite literally, asking for the equivalent of less than one bit in a flag (there are currently 22 format/endian characters, which are all orthogonal, which would require 5 bits, if they were flags).
Correct. Yet I would feel more comfortable if you had proposed a new module, or a new function to an existing module. The former would allow distribution independent of Python, and the latter would not put a burden on users of the existing functions.
The API already exists. The framework already exists. I'm not asking for Python to interpret something that was valid before as something different now.
Yes. The extension you propose probably does not cause backward compatibility problems. Strictly speaking, there are programs that break under this extension, but this is the case for any extension, and such programs likely don't occur in real life.
Try to read the above paragraph outside of the context of struct and the RFE. Does it make sense to include the change now?
No. For such a change, we need to study whether there are alternative APIs which achieve the same effect (which there are), and whether the new flag puts an additional learning burden on users of the existing API (which it does).
If every request to interpret a new flag required significant community involvement, goodness, would it take an act of Guido to get a commit done?
No, I have committed new features myself in the past. See the CVS log for details. It is this specific change I'm opposed to, at this point in time. Regards, Martin

People would be spending all their time trying to gather groups of people to agree with them that functionality K is necessary.
And that would be a good thing, at least in the specific case. Even if it is undebated that the feature (convert numbers into odd-sized byte strings) is desirable, the specific API needs careful consideration, since it cannot be changed easily after it has been added. So all these new features are a serious legacy if they later turn out to be ill-designed.
If it turns out that the feature is ill-designed, then there is a standard method of dealing with it:
1. Declare that it will be deprecated in the future.
2. Start raising a DeprecationWarning for at least one major version of Python.
3. Remove the functionality.
In the case of struct and this addition, if it is the case that this modification is catastrophic to Python (very unlikely), I would hope that someone would have said, "hey, this breaks applications A, B, C, D, ..., and its inclusion was the biggest mistake Python has ever made, even more so than anyone ever responding to Josiah* in the first place."
* Me trying to be funny.
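The middle step of the deprecation cycle listed above can be sketched with the standard `warnings` module. This is an illustration only: `old_feature` is a hypothetical function standing in for whatever is being retired, not anything from struct.

```python
import warnings


def old_feature(x):
    # Hypothetical function being retired. It keeps working during the
    # deprecation period, but warns callers on every use.
    warnings.warn(
        "old_feature() is deprecated and will be removed in a future release",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return x * 2


# Capture the warning so its effect can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_feature(3)
```

Once the warning has shipped for a release or more, the final step is simply deleting the function.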
I'm not even saying that we should add a new module. I'm not even saying we should add a new function to a module. Heck, I'm not even asking for a new argument to a function that previously existed. I am, quite literally, asking for the equivalent of less than one bit in a flag (there are currently 22 format/endian characters, which are all orthogonal, which would require 5 bits, if they were flags).
Correct. Yet I would feel more comfortable if you had proposed a new module, or a new function to an existing module. The former would allow distribution independent of Python, and the latter would not put a burden on users of the existing functions.
For those who use struct as it is now, the proposed additional format codes raise a struct.error exception in older versions of Python. If people are relying on pack/unpack with characters 'g' or 'G' raising a struct.error exception, then something is definitely wrong with the way they have been using struct. If people want to use the modifications with a previous version of Python, the changes are easily backported (I think one can even take structmodule.c from CVS and compile it with an older source tree).
Yes. The extension you propose probably does not cause backward compatibility problems. Strictly speaking, there are programs that break under this extension, but this is the case for any extension, and such programs likely don't occur in real life.
Agreed.
No. For such a change, we need to study whether there are alternative APIs which achieve the same effect (which there are), and whether the new flag puts an additional learning burden on users of the existing API (which it does).
Tim Peters' alternative long2bytes and bytes2long looked quite usable, and indeed produces strikingly similar results, for single item packing and unpacking scenarios. The learning curve of struct was claimed to be steep in the case of native byte alignments, or for those not familiar with C types or structs. Since these codes are not native to any platform, it doesn't make the native byte alignment stuff any more difficult. If people are having trouble understanding the concept of C structs, I don't believe that two new format characters are going to be significantly more difficult to swallow than the 17 others that already exist. Especially when the standard C type codes 'cbhilqspBHILQ' are likely sufficient for their needs. In the case of them not being sufficient, well hey, they would have another two options that may do it for them.
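For readers who have not seen that proposal, a pair of helpers in the spirit of long2bytes and bytes2long might look like this. It is a sketch under assumptions: the thread notes that the real signatures had not been worked out, so the parameters here (`size`, `byteorder`, `signed`) are hypothetical, and the bodies rely on `int.to_bytes` and `int.from_bytes` from modern Python 3 rather than anything available in 2.4.

```python
def long2bytes(n, size, byteorder="big", signed=False):
    # Hypothetical signature: pack integer n into exactly `size` bytes.
    # Raises OverflowError if n does not fit.
    return n.to_bytes(size, byteorder, signed=signed)


def bytes2long(data, byteorder="big", signed=False):
    # Hypothetical signature: interpret a byte string of any length
    # as a single integer.
    return int.from_bytes(data, byteorder, signed=signed)


packed = long2bytes(66051, 3)        # a 3-byte, non-standard width
restored = bytes2long(packed)
```

These cover the single-item case; packing several fields of mixed widths in one call is where a struct format code would differ.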
If every request to interpret a new flag required significant community involvement, goodness, would it take an act of Guido to get a commit done?
No, I have committed new features myself in the past. See the CVS log for details. It is this specific change I'm opposed to, at this point in time.
I was trying to be funny, to perhaps lighten the mood. Hence "act of Guido" rather than "act of God" (where the latter is a cliche). - Josiah

On Mon, 11 Oct 2004, Guido van Rossum wrote:
So, can the subprocess module be accepted? If not, what needs to be done?
FWIW, I've given Peter some feedback, but in general I think this code is ready for inclusion into 2.4. I hope it's not too late.
Sounds great.
(My recommendation on the call/callv issue is to get rid of callv(),
callv() has now been removed.
and instead let call() interpret multiple string args as such; a single string arg will be interpreted as a whole command line, per Popen(); if you want a single file arg you have to enclose it in [...].)
I haven't done any changes to call(): it passes the arguments directly to Popen(), just like before. I would prefer if we could keep it that way; it's so nice to have a short and simple description of the function:
Run command with arguments. Wait for command to complete, then return the returncode attribute.
If we should do any more changes to handling of multiple args/strings/sequences, we should probably do it to the Popen class itself. But I think we have found a good API now. Here's the documentation:
"
args should be a string, or a sequence of program arguments. The program to execute is normally the first item in the args sequence or string, but can be explicitly set by using the executable argument.
On UNIX, with shell=False (default): In this case, the Popen class uses os.execvp() to execute the child program. args should normally be a sequence. A string will be treated as a sequence with the string as the only item (the program to execute).
On UNIX, with shell=True: If args is a string, it specifies the command string to execute through the shell. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional shell arguments.
On Windows: the Popen class uses CreateProcess() to execute the child program, which operates on strings. If args is a sequence, it will be converted to a string using the list2cmdline method. Please note that not all MS Windows applications interpret the command line the same way: The list2cmdline is designed for applications using the same rules as the MS C runtime.
"
/Peter Åstrand <astrand@lysator.liu.se>
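The behaviour described in that documentation can be tried with a short sketch. It assumes the module as it ships in current Python 3, and uses `sys.executable` as the child program only so the example does not depend on any particular external command; the program name `prog` is a placeholder.

```python
import subprocess
import sys

# A sequence of arguments: no shell is involved, and each item is passed
# to the child as one argument. call() waits and returns the returncode.
returncode = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])

# On Windows a sequence is joined into a single command line with
# list2cmdline, which quotes arguments following the MS C runtime rules.
# The function can be called directly on any platform to see the result.
cmdline = subprocess.list2cmdline(["prog", "arg with spaces", "plain"])
```

Passing a sequence rather than a single string avoids having to quote arguments by hand, which is one reason the documentation says args "should normally be a sequence".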

Quick thought: The subprocess class is still called Popen. Any reason not to drop this last reference to the unix clib function and rename it Subprocess? Erik

On Wed, 13 Oct 2004, Erik Heneryd wrote:
Quick thought: The subprocess class is still called Popen. Any reason not to drop this last reference to the unix clib function and rename it Subprocess?
It's shorter... :) Also, I think it's good that the name hints about "process open"; that you can read and write to the process. /Peter Åstrand <astrand@lysator.liu.se>

Guido van Rossum wrote:
FWIW, I've given Peter some feedback, but in general I think this code is ready for inclusion into 2.4. I hope it's not too late. (My recommendation on the call/callv issue is to get rid of callv(), and instead let call() interpret multiple string args as such; a single string arg will be interpreted as a whole command line, per Popen(); if you want a single file arg you have to enclose it in [...].)
2.4b1 is planned for this Thursday. I'm taking Wednesday off work to commit a backlog of patches I want to see in, then plan to cut the release Thursday. I can probably move this to Friday, if it would help this get in. I _really_ don't want this landing after b1, tho - that kinda makes the entire release process pointless if we're landing stuff that late. I'm off to sleep now - could Peter or whoever else is likely to do the work let me know? If it's just a matter of doing the checkin, and making sure all the fiddly bits are done, you could also point me at a patch and I'll do it.

Anthony Baxter wrote:
I'm off to sleep now - could Peter or whoever else is likely to do the work let me know? If it's just a matter of doing the checkin, and making sure all the fiddly bits are done, you could also point me at a patch and I'll do it.
I can check in the source files; someone with a proper Windows setup will have to help us out with the makefile and installer issues. Stay tuned. </F>

Anthony Baxter wrote:
I'm off to sleep now - could Peter or whoever else is likely to do the work let me know? If it's just a matter of doing the checkin, and making sure all the fiddly bits are done, you could also point me at a patch and I'll do it.
I'll check in the source files shortly. Someone with a proper Windows setup will have to help out with makefile and installer tweaks for the PC/_subprocess.c module; drop me a line if you need info. <F>

Fredrik Lundh wrote:
I'll check in the source files shortly. Someone with a proper Windows setup will have to help out with makefile and installer tweaks for the PC/_subprocess.c module; drop me a line if you need info.
Tim fixed the Windows issues (and got rid of the banana), but I just realized that we could use some help with converting the documentation from PEP/docstring format to LaTeX... Any volunteers? (the PEP in nondist/peps/pep-0324.txt contains extensive documentation, but it can also be good to check the Lib/subprocess.py docstring for additional examples) (Now, if only someone could convert the docs to "html body + semantic classes" so people with ordinary skills can work on it, without having to learn any new syntaxes or install any new tools, besides python and a web browser...) </F>

can bug categories be associated with developers? if so, would it perhaps make sense to add Peter as a developer, and add a category for the subprocess system? </F>

[Fredrik Lundh]
can bug categories be associated with developers?
Yes. For example, I see that the "Regular Expressions" category is auto-assigned to effbot.
if so, would it perhaps make sense to add Peter as a developer,
That would be fine by me, provided he asks for it (it's a commitment, despite that developers routinely vanish for years at a time <wink>).
and add a category for the subprocess system?
That one's harder to call. Once a category is added, it can never be removed. So I'd wait on that to see whether enough reports came in against subprocess to justify it. Since it's unlikely that an original reporter will assign "the right" category to begin with, and the original submission is the only time at which auto-assignment has an effect, the major value in all this would be achieved simply by having a subprocess expert with commit privileges.

On Wed, 13 Oct 2004, Tim Peters wrote:
if so, would it perhaps make sense to add Peter as a developer,
That would be fine by me, provided he asks for it (it's a commitment, despite that developers routinely vanish for years at a time <wink>).
I'm planning to be around for some time :) Developer access would be good, if I should maintain the subprocess module. (My SF account is "astrand".)
and add a category for the subprocess system?
That one's harder to call. Once a category is added, it can never be removed. So I'd wait on that to see whether enough reports came in against subprocess to justify it. Since it's unlikely that an original reporter will assign "the right" category to begin with, and the original submission is the only time at which auto-assignment has an effect, the major value in all this would be achieved simply by having a subprocess expert with commit privileges.
Sounds good to me. /Peter Åstrand <astrand@lysator.liu.se>

[/F]
if so, would it perhaps make sense to add Peter as a developer,
[Uncle Timmy]
That would be fine by me, provided he asks for it (it's a commitment, despite that developers routinely vanish for years at a time <wink>).
[Peter Astrand]
I'm planning to be around for some time :) Developer access would be good, if I should maintain the subprocess module. (My SF account is "astrand".)
If there are no objections before I wake up again, and nobody else has done it, I'll add Peter as a Python developer.

[/F]
if so, would it perhaps make sense to add Peter as a developer,
[Uncle Timmy]
That would be fine by me, provided he asks for it (it's a commitment, despite that developers routinely vanish for years at a time <wink>).
[Peter Astrand]
I'm planning to be around for some time :) Developer access would be good, if I should maintain the subprocess module. (My SF account is "astrand".)
If there are no objections before I wake up again, and nobody else has done it, I'll add Peter as a Python developer.
Peter, your permissions are enabled. Welcome. Raymond

Raymond Hettinger wrote:
If there are no objections before I wake up again, and nobody else has done it, I'll add Peter as a Python developer.
Peter, your permissions are enabled. Welcome.
peter's user (astrand) isn't included in the "Assigned to" list in the bug tracker. can this perhaps be fixed? </F>

On Wednesday 20 October 2004 07:17 am, Fredrik Lundh wrote:
peter's user (astrand) isn't included in the "Assigned to" list in the bug tracker. can this perhaps be fixed?
This should be fine now. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org>
participants (14): "Martin v. Löwis", Anthony Baxter, Erik Heneryd, Fred L. Drake, Jr., Fredrik Lundh, Guido van Rossum, Jason Lunz, Josiah Carlson, Nick Coghlan, Paul Moore, Peter Astrand, Raymond Hettinger, Thomas Heller, Tim Peters