[issue1521950] shlex.split() does not tokenize like the shell

Dan Christian report at bugs.python.org
Fri Sep 3 20:42:28 CEST 2010


Dan Christian <robodan at users.sourceforge.net> added the comment:

It's been a while since I looked at this.  I'm not really in a
position to contribute code/tests right now; but I can comment.

I don't think POSIX mode existed when I first reported this, but
that's where it makes sense.  I think all POSIX shells (Bourne, C,
Korn) will behave the same way for the issues mentioned.

There are really two cases in one bug.

The first part is that the shell will split tokens at characters that
shlex doesn't.  The handling of &, |, ;, >, and < could be done by
adjusting the definition of shlex.wordchars.  The shell may also
understand things like: &&, ||, |&, and >&.  The exact definition of
these depends on the shell, so maybe it's best to just split them out
as separate tokens and let the user figure out the compound meanings.
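
A rough sketch of the difference, for illustration only.  The command
line and the extra word characters are made up for the example.  The
last variant uses the punctuation_chars argument, which only exists in
Python 3.6 and later; it is shown as one possible way to get runs like
&& back as single tokens, not as the fix for this issue.

```python
import shlex

# Hypothetical command line, chosen only to show the splitting behavior.
text = "ls -l /tmp|grep foo.txt>out && echo done"

# shlex.split() splits on whitespace only, so | and > stay glued to words.
plain = shlex.split(text)

# Tokenizing with a shlex instance and a wider wordchars set keeps
# options and paths whole and returns each operator character on its own.
lexer = shlex.shlex(text, posix=True)
lexer.wordchars += "-./"      # assumption: enough for this example
by_char = list(lexer)

# Python 3.6+ only: punctuation_chars=True groups runs of ();<>|& together.
by_run = list(shlex.shlex(text, posix=True, punctuation_chars=True))

print(plain)
print(by_char)
print(by_run)
```

With the second form the caller still has to decide what "&", "&"
means; with the third, "&&" arrives as one token.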

The second part is the proper handling of quotes/escapes, which
requires some kind of new interface.  You need to distinguish between
tokens that were modified by the quote/escape rules and those that
were not.  One suggestion is to add a new method such as:

shlex.get_token2()
   Return a tuple of the token and the original text of the token
   (including quotes and escapes).  Otherwise, this is the same as
   shlex.get_token().

Comparing the two values for equality (or maybe identity) would tell
you if something special was going on.  You can always pass the second
value to a reconstructed command line without losing any of the
original parsing information.
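
A minimal sketch of the idea, outside of shlex itself.  There is no
get_token2() in shlex; tokens_with_raw() below is a hypothetical helper
that approximates it for whitespace-separated words, using a simple
regular expression that ignores unterminated quotes and comments.

```python
import re
import shlex

# One raw shell word: plain characters, backslash escapes, or quoted runs.
_RAW_WORD = re.compile(r"""(?:[^\s'"\\]|\\.|'[^']*'|"(?:[^"\\]|\\.)*")+""")

def tokens_with_raw(line):
    """Yield (token, raw_text) pairs; hypothetical stand-in for get_token2()."""
    for match in _RAW_WORD.finditer(line):
        raw = match.group(0)
        # Let shlex apply the POSIX quote/escape rules to this single word.
        token = shlex.split(raw)[0]
        yield token, raw

pairs = list(tokens_with_raw(r'grep "a|b" x\|y | wc'))
for token, raw in pairs:
    marker = "" if token == raw else "  (quoted or escaped)"
    print(repr(token), repr(raw), marker)
```

Here only the bare | has token == raw, so it can be treated as a pipe,
while "a|b" and x\|y are recognizably literal text.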

-Dan

On Fri, Sep 3, 2010 at 10:27 AM, Éric Araujo <report at bugs.python.org> wrote:
>
> Éric Araujo <merwok at netwok.org> added the comment:
>
> Thanks for the report. Would you like to work on a patch, or translate your examples into unit tests?
>
> The docs do not mention “&” at all, and platform discrepancies have to be taken into account too, so I really don’t know if this is a bug fix for the normal mode, the POSIX mode, or a feature request requiring a new argument to the shlex function to preserve compatibility.
>
> ----------
> nosy: +eric.araujo, eric.smith
>
> _______________________________________
> Python tracker <report at bugs.python.org>
> <http://bugs.python.org/issue1521950>
> _______________________________________
>


