[issue1521950] shlex.split() does not tokenize like the shell
Dan Christian
report at bugs.python.org
Sat Nov 26 15:40:24 CET 2011
Dan Christian <robodan at users.sourceforge.net> added the comment:
On Sat, Nov 26, 2011 at 7:12 AM, Éric Araujo <report at bugs.python.org> wrote:
> Your script passes with dash, which is probably the most POSIX-compliant shell we can find. (bash has extensions, zsh/csh don’t use the POSIX shell language, so I think the behavior of dash should be our reference, not the bash man page.)
I was just looking for a reference where I didn't have to sift through
tons of documentation. Most systems have bash. Before that I was
just working from experience (I've done a lot of shell scripting).
> there is code out there that depends on the current behavior of shlex and does not need to support && || ; ( ), if we add support for these tokens we should not break the existing code.
Here's a thought on how that might work (just brainstorming). shlex
uses a series of character strings to drive its parsing: whitespace,
escape, quotes. Add another one: control = '();<>|&'. If it is unset
(the default?), the behavior is as before. If it is set, shlex
outputs every character in control as a separate token.
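
A minimal sketch of that idea, written as a standalone function rather
than a patch to shlex. The name split_with_control and its parameters
are hypothetical, and the quote and escape handling is deliberately
simplified (it is not a full POSIX tokenizer):

```python
def split_with_control(text, control="", whitespace=" \t\r\n",
                       quotes="'\"", escape="\\"):
    """Split text into tokens; each character in `control` is its own token.

    Hypothetical helper, not part of shlex. With control unset (empty),
    operators such as '&&' stay glued together, as shlex.split() leaves them.
    """
    tokens = []
    current = []       # characters of the token being built
    in_token = False   # True once a token has started (even an empty '')
    quote = None       # the active quote character, if any
    i = 0
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == quote:
                quote = None
            elif ch in escape and quote == '"' and i + 1 < len(text):
                # Simplification: inside double quotes, a backslash
                # escapes whatever character follows it.
                i += 1
                current.append(text[i])
            else:
                current.append(ch)
        elif ch in quotes:
            quote = ch
            in_token = True
        elif ch in escape and i + 1 < len(text):
            i += 1
            current.append(text[i])
            in_token = True
        elif ch in whitespace or ch in control:
            # Both whitespace and control characters end the current token.
            if in_token:
                tokens.append("".join(current))
                current, in_token = [], False
            if ch in control:
                tokens.append(ch)
        else:
            current.append(ch)
            in_token = True
        i += 1
    if quote:
        raise ValueError("No closing quotation")
    if in_token:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    print(split_with_control("a && b"))
    print(split_with_control("a && b; (c | d) > 'x;y'", control="();<>|&"))
```

Quoted control characters stay inside their token, so 'x;y' comes out
as a single token even when ';' is in control.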
There might be a shell-specific script (or maybe it's left to the
user) that decides which tokens can be recombined: '&&', '||',
'|&', '>>', etc. This code is pretty simple: walk the token
sequence and, whenever two adjacent tokens form a known pair, pop
the second and combine it into the first.
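
The recombination pass could look roughly like this. The function name
recombine and the default pair list are assumptions for illustration;
a real version would take its pairs from whichever shell it targets:

```python
def recombine(tokens, pairs=("&&", "||", "|&", ">>")):
    """Merge adjacent single-character tokens that form a known pair.

    Hypothetical helper. It builds a new list instead of popping in
    place, which has the same effect as "pop the second and combine it
    into the first".
    """
    out = []
    for tok in tokens:
        if out and out[-1] + tok in pairs:
            out[-1] += tok   # fold the second token into the first
        else:
            out.append(tok)
    return out


if __name__ == "__main__":
    print(recombine(["a", "&", "&", "b", "|", "|", "c", ">", ">", "f"]))
```

One limitation of working on the plain token list: it can no longer
tell '& &' (separated by whitespace) or a quoted '&' from an adjacent
unquoted pair, so those would be merged as well. Avoiding that would
need the tokenizer to pass along position or quoting information.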
-Dan
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1521950>
_______________________________________