ANN: 0.1

Hartmut Goebel hartmut.goebel at
Wed Sep 11 12:08:09 EDT 2002


                          Version 0.1

Parse a string into words like a POSIX shell does.

Why this module?

Out in the wild there are quite a few modules for executing commands
in a sub-process. Most of them take a string (the command-line) as
input and use os.exec* for executing the program. This requires
splitting the string into words _exactly_like_the_shell_ does.

If this parsing/splitting is incorrect, you can have quite a fun
time debugging. ;-)
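To illustrate the problem, here is a minimal sketch of what goes wrong
when a command line is split naively at whitespace. The command line
and file name are made up for this example.

```python
# A made-up command line with one quoted argument.
command_line = 'grep "hello world" notes.txt'

# Naive approach: split at whitespace only, ignoring shell quoting.
naive_argv = command_line.split()

# The quoted argument is torn apart and the quote characters are kept,
# so the program would receive four arguments instead of three.
print(naive_argv)  # ['grep', '"hello', 'world"', 'notes.txt']
```

A shell would pass three words here: grep, hello world, and notes.txt.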

Since I didn't find any other module like that, I decided to
develop this module.

Benefits of using this module

Using this module has the following benefits:

- Enables your application or module to mimic word splitting like a
  POSIX shell without effort.

- Saves you debugging time when doing word splitting.

- Avoids confusing the users of your application/module when splitting
  shell command lines into words, since this module behaves _exactly_
  like a POSIX shell does.

- The unit test suite checks the word splitting for correctness.
  Currently 75 command lines are used, each testing a special pattern.
  The input data for this test suite consists of command lines which
  are split by the shell on the fly. You can add your own test
  patterns without any hassle.
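One way to let the shell produce the reference result for such a test
pattern could look like the sketch below. This is not the module's
actual test code: the helper name split_with_shell is hypothetical, it
is written for a current Python 3 (subprocess.run), and it assumes a
POSIX 'sh' on the PATH. Since the command line is executed as shell
code, it must only be fed trusted test patterns, and patterns should
avoid expansions ($, *, backticks) and words containing newlines.

```python
import subprocess


def split_with_shell(command_line):
    """Return the words the shell sees for a trusted command line.

    Hypothetical helper: the command line is pasted after a printf
    call, so the shell itself performs the word splitting and printf
    prints each resulting word on its own line.
    """
    script = "printf '%s\\n' " + command_line
    result = subprocess.run(
        ["sh", "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    # Every word is terminated by a newline; drop the empty tail item.
    return result.stdout.split("\n")[:-1]
```

The returned list can then be compared against the result of the
Python-side word splitting for the same pattern.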


This module parses a string into words according to the parsing rules
of a POSIX shell. These parsing rules are (quoted after 'man bash'):

1) Words are split at whitespace characters; these are Space, Tab,
   Newline, Carriage-Return, Vertical-Tab (0B) and Form-Feed (0C).

   NB: Quotes do _not_ separate words! Thus
   will be parsed into a single word:

2) A non-quoted backslash (\) is the escape character. It preserves
   the literal value of the next character that follows.

3) Enclosing characters in single quotes preserves the literal value
   of each character within the quotes. A single quote may not occur
   between single quotes, even when preceded by a backslash.

   This means: backslash (\) has no special meaning within single
   quotes. All characters within single quotes are taken as-is.

4) Enclosing characters in double quotes preserves the literal value
   of all characters within the quotes, with the exception of \. The
   backslash retains its special meaning only when followed by " or \. A
   double quote may be quoted within double quotes by preceding it
   with a backslash.
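
The four rules above can be sketched as a small state machine. This
is only an illustration of the rules as stated in this announcement,
not the module's actual code or API: the function name
split_posix_words is hypothetical, and raising ValueError for an
unterminated quote or a trailing backslash is an assumption made for
this sketch.

```python
# Rule 1: Space, Tab, Newline, Carriage-Return, Vertical-Tab, Form-Feed.
WHITESPACE = " \t\n\r\x0b\x0c"


def split_posix_words(line):
    """Split a string into words following rules 1) to 4) above."""
    words = []
    current = []     # pieces of the word being built
    in_word = False  # True once a word has started, so '' gives ""
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch in WHITESPACE:
            # Rule 1: unquoted whitespace ends the current word.
            if in_word:
                words.append("".join(current))
                current, in_word = [], False
            i += 1
        elif ch == "\\":
            # Rule 2: an unquoted backslash keeps the next character.
            if i + 1 >= n:
                raise ValueError("backslash at end of input")
            current.append(line[i + 1])
            in_word = True
            i += 2
        elif ch == "'":
            # Rule 3: everything up to the next single quote is literal.
            end = line.find("'", i + 1)
            if end == -1:
                raise ValueError("unterminated single quote")
            current.append(line[i + 1:end])
            in_word = True
            i = end + 1
        elif ch == '"':
            # Rule 4: backslash is special only before " or \.
            in_word = True
            i += 1
            while True:
                if i >= n:
                    raise ValueError("unterminated double quote")
                ch = line[i]
                if ch == '"':
                    i += 1
                    break
                if ch == "\\" and i + 1 < n and line[i + 1] in '"\\':
                    current.append(line[i + 1])
                    i += 2
                else:
                    current.append(ch)
                    i += 1
        else:
            # Ordinary character. Quotes never end a word (see the NB).
            current.append(ch)
            in_word = True
            i += 1
    if in_word:
        words.append("".join(current))
    return words


print(split_posix_words('a"b c"d e'))  # ['ab cd', 'e']
```

Note how the quotes in the example do not separate words: the quoted
part is simply joined with the characters around it.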


'shellwords' is available for download at


Requires Python >= 2.0

Frequently Asked Questions

Q: Hey, there is 'shlex' coming with Python. Why is there a need for
   this module?

A: I know 'shlex' and I gave it a try. But 'shlex' takes quotes as
   word delimiters, which differs from the shell semantics (see above).
   And even if 'shlex' parsed strings as needed, I would have
   written a (very, very) thin layer above it, since 'shlex' is simple
   but seldom used for this kind of job.


(C) Copyright 2002 by Hartmut Goebel <h.goebel at>

License: Python Software Foundation License
