[Python-bugs-list] [ python-Bugs-436259 ] [Windows] exec*/spawn* problem with spaces in args
noreply@sourceforge.net
Wed, 11 Jul 2001 21:30:24 -0700
Bugs item #436259, was opened at 2001-06-25 20:17
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=436259&group_id=5470
Category: Distutils
Group: Platform-specific
Status: Open
Resolution: None
Priority: 5
Submitted By: Ben Hutchings (wom-work)
Assigned to: Greg Ward (gward)
Summary: [Windows] exec*/spawn* problem with spaces in args
Initial Comment:
DOS and Windows processes are not given an argument
vector, as Unix processes are; instead they are given a
command line and are expected to perform any necessary
argument parsing themselves. Each C run-time library
must convert command lines into argument vectors for
the main() function, and if it includes exec* and
spawn* functions then those must convert argument
vectors into a command line. Naturally, the various
implementations differ in interesting ways.
The Visual C++ run-time library (MSVCRT) implementation
of the exec* and spawn* functions is particularly awful
in that it simply concatenates the strings with spaces
in-between (see source file cenvarg.c), which means
that arguments with embedded spaces are likely to turn
into multiple arguments in the new process. Obviously,
when Python is built using Visual C++, its os.exec* and
os.spawn* functions behave in this way too. MS prefers
to work around this bug (see Knowledge Base article
Q145937) rather than to fix it. Therefore I think
Python must work around it too when built with Visual C++.
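To illustrate the ambiguity described above, here is a minimal sketch. The helper name naive_join is hypothetical; it only mimics the space-concatenation behaviour described in this report and is not the actual code from cenvarg.c.

```python
def naive_join(argv):
    # Assumption: models the behaviour described in the report, i.e. the
    # arguments are joined with single spaces and nothing is quoted.
    return ' '.join(argv)


# One argument containing a space...
intended = ['print_args', 'my file.txt']
# ...and the two separate arguments the child is likely to see instead.
split_up = ['print_args', 'my', 'file.txt']

# Both vectors produce the same command line, so the child process
# cannot tell them apart.
print(naive_join(intended))
print(naive_join(intended) == naive_join(split_up))
```

Because the two argument vectors collapse to the same string, no parser on the receiving side can recover the original vector.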
I experimented with MSVCRT and Cygwin (using the
attached program print_args.c) and could not find a way
to convert an argument vector into a command line that
they would both convert back to the same argument
vector, but I got close.
MSVCRT's parser requires spaces that are part of an
argument to be enclosed in double-quotes. The
double-quotes do not have to enclose the whole
argument. Literal double-quotes must be escaped by
preceding them with a backslash. If an argument
contains literal backslashes before a literal or
delimiting double-quote, those backslashes must be
escaped by doubling them. If there is an unmatched
enclosing double-quote then the parser behaves as if
there was another double-quote at the end of the line.
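The MSVCRT rules listed above can be sketched as a small parser. This is an illustration only: the function name split_msvcrt_style is hypothetical, it treats the program name like any other argument (an assumption), and it implements just the rules stated in this report, not every special case of the real run-time library.

```python
def split_msvcrt_style(command_line):
    """Split a command line following the MSVCRT rules described above."""
    args = []
    current = []
    in_quotes = False
    have_arg = False  # True once the current argument has started
    i = 0
    n = len(command_line)
    while i < n:
        ch = command_line[i]
        if ch == '\\':
            # Measure the run of consecutive backslashes.
            start = i
            while i < n and command_line[i] == '\\':
                i += 1
            count = i - start
            if i < n and command_line[i] == '"':
                # Backslashes before a double-quote were doubled: halve them.
                current.append('\\' * (count // 2))
                if count % 2:
                    # Odd count: the last backslash escapes a literal quote.
                    current.append('"')
                else:
                    # Even count: the quote is an enclosing delimiter.
                    in_quotes = not in_quotes
                i += 1
            else:
                # Backslashes not followed by a double-quote are literal.
                current.append('\\' * count)
            have_arg = True
        elif ch == '"':
            # An enclosing quote; it need not surround the whole argument.
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in ' \t' and not in_quotes:
            # Unquoted whitespace ends the current argument, if any.
            if have_arg:
                args.append(''.join(current))
                current = []
                have_arg = False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    # An unmatched quote behaves as if it were closed at the end of the line.
    if have_arg:
        args.append(''.join(current))
    return args


print(split_msvcrt_style('print_args "two words" pre"a b"post'))
print(split_msvcrt_style('print_args "unmatched quote'))
```

The second call shows the unmatched-quote rule: the rest of the line becomes a single argument.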
Cygwin's parser requires spaces that are part of an
argument to be enclosed in double-quotes. The
double-quotes do not have to enclose the whole
argument. Literal double-quotes may be escaped by
preceding them with a backslash, but then they count as
an enclosing double-quote as well, which appears to be a
bug. They may also be escaped by doubling them, in
which case they must be enclosed in double-quotes;
since MSVCRT does not accept this, it's useless. As far
as I can see, literal backslashes before a literal
double-quote must not be escaped and literal
backslashes before an enclosing double-quote *cannot*
be escaped. It's really quite hard to understand what
its rules are for backslashes and double-quotes, and I
think it's broken. If there is an unmatched enclosing
double-quote then the parser behaves as if there was
another double-quote at the end of the line.
Here's a Python version of a partial fix for use in
nt.exec* and nt.spawn*. This function modifies
argument strings so that the resulting command line
will satisfy programs that use MSVCRT, and programs
that use Cygwin if that's possible.
    def escape(arg):
        import re
        # If arg contains no space or double-quote then
        # no escaping is needed.
        if not re.search(r'[ "]', arg):
            return arg
        # Otherwise the argument must be quoted and all
        # double-quotes, preceding backslashes, and
        # trailing backslashes, must be escaped.
        def repl(match):
            if match.group(2):
                # Double the backslashes and escape the quote.
                return match.group(1) * 2 + '\\"'
            else:
                # Trailing backslashes: double them only.
                return match.group(1) * 2
        return '"' + re.sub(r'(\\*)("|$)', repl, arg) + '"'
This could perhaps be used as a workaround for the
problem. Unfortunately it would conflict with
workarounds implemented at the Python level (which I
have been using for a while).
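As a usage sketch, the escape function above could be applied to each argument before the command line is assembled. The helper build_command_line and the sample arguments are hypothetical, and the copy of escape here assumes the regular expression is meant to match runs of backslashes before a double-quote or the end of the string.

```python
import re


def escape(arg):
    # Arguments without a space or double-quote are passed through as-is.
    if not re.search(r'[ "]', arg):
        return arg

    def repl(match):
        # Double any backslashes before a double-quote or the end of the
        # argument; a double-quote itself also gets a backslash prefix.
        backslashes = match.group(1) * 2
        if match.group(2):
            return backslashes + '\\"'
        return backslashes

    return '"' + re.sub(r'(\\*)("|$)', repl, arg) + '"'


def build_command_line(argv):
    # Hypothetical helper: escape each argument, then join with spaces.
    return ' '.join(escape(arg) for arg in argv)


argv = ['print_args', 'plain', 'two words', 'say "hi"', 'C:\\My Dir\\']
print(build_command_line(argv))
```

With these inputs the result is print_args plain "two words" "say \"hi\"" "C:\My Dir\\", which the MSVCRT rules described earlier should turn back into the original five arguments.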
----------------------------------------------------------------------
>Comment By: Ben Hutchings (wom-work)
Date: 2001-07-11 21:30
Message:
Logged In: YES
user_id=203860
"Note that processes using WinMain can get at argc and argv
under MSVC via including stdlib.h and using __argc and
__argv instead."
This is irrelevant. The OS passes the command line into a
process as a single string, which it makes accessible
through the GetCommandLine() function. The argument vector
received by main() or accessible as __argv is generated
from this by the C run-time library.
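For anyone who wants to see the single string for themselves, a minimal sketch using ctypes follows. The function name raw_command_line is hypothetical; the sketch calls the wide-character variant GetCommandLineW and simply returns None on non-Windows platforms.

```python
import ctypes
import sys


def raw_command_line():
    """Return the unparsed command line of this process, or None off Windows."""
    if sys.platform != 'win32':
        # ctypes.windll only exists on Windows.
        return None
    get_command_line = ctypes.windll.kernel32.GetCommandLineW
    get_command_line.argtypes = []
    get_command_line.restype = ctypes.c_wchar_p
    return get_command_line()


# Compare the raw string with the vector the run-time library derived from it.
print(raw_command_line())
print(sys.argv)
```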
"The right way to address this is to add more smarts to
spawn.py in distutils"
I disagree. The right thing to do is to make these
functions behave in the same way across platforms, as far
as possible. Perhaps this could be done in two stages - in
the first release, make the fix optional, and in the
second, use it all the time.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-07-11 19:32
Message:
Logged In: YES
user_id=31435
Note that processes using WinMain can get at argc and argv
under MSVC via including stdlib.h and using __argc and
__argv instead.
I agree the space behavior sucks regardless. However, as
you've discovered, there's nothing magical we can do about
it without breaking the workarounds people have already
developed on their own -- including distutils.
The right way to address this is to add more smarts to
spawn.py in distutils, then press to adopt that in the std
library (distutils already does *some* magical arg quoting
on win32 systems, and could use your help to do a better
job of it).
Accordingly, I added [Windows] to the summary line, changed
the category to distutils, and reassigned to Greg Ward for
consideration.
----------------------------------------------------------------------