escaping characters in filenames
J Kenneth King
james at agentultra.com
Wed Jul 29 22:33:11 CEST 2009
Nobody <nobody at nowhere.com> writes:
> On Wed, 29 Jul 2009 09:29:55 -0400, J Kenneth King wrote:
>> I wrote a script to process some files using another program. One thing
>> I noticed was that both os.listdir() and os.path.walk() will return
>> unescaped file names (i.e. "My File With Spaces & Stuff" instead of "My\
>> File\ With\ Spaces\ \&\ Stuff"). I haven't had much success finding a
>> module or recipe that escapes file names and was wondering if anyone
>> could point me in the right direction.
>> As an aside, the script is using subprocess.call() with the "shell=True"
>> parameter. There isn't really a reason for doing it this way (was just
>> the fastest way to write it and get a prototype working). I was
>> wondering if Popen objects were sensitive to unescaped names like the
>> shell. I intend to refactor the function to use Popen objects at some
>> point and thought perhaps escaping file names may not be entirely
> Note that subprocess.call() is nothing more than:
>     def call(*popenargs, **kwargs):
>         return Popen(*popenargs, **kwargs).wait()
> plus a docstring. It accepts exactly the same arguments as Popen(), with
> the same semantics.
> If you want to run a command given a program and arguments, you
> should pass the command and arguments as a list, rather than trying to
> construct a string.
> On Windows the value of shell= is unrelated to whether the command is
> a list or a string; a list is always converted to string using the
> list2cmdline() function. Using shell=True simply prepends "cmd.exe /c " to
> the string (this allows you to omit the .exe/.bat/etc extension for
> extensions which are in %PATHEXT%).
> On Unix, a string is first converted to a single-element list, so if you
> use a string with shell=False, it will be treated as the name of an
> executable to be run without arguments, even if it contains spaces, shell
> metacharacters etc.
> The most portable approach seems to be to always pass the command as a
> list, and to set shell=True on Windows and shell=False on Unix.
> The only reason to pass a command as a string is if you're getting a
> string from the user and you want it to be interpreted using the
> platform's standard shell (i.e. cmd.exe or /bin/sh). If you want it to be
> interpreted the same way regardless of platform, parse it into a
> list using shlex.split().
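For reference, a minimal sketch of the shlex.split() suggestion above. It parses a command string with POSIX shell-like rules, so a quoted or backslash-escaped file name ends up as a single list element. The command string and the ".pdf" file name are made up for illustration.

```python
import shlex

# Hypothetical command string, as a user might type it at a shell prompt.
command = 'pdftotext "My File With Spaces & Stuff.pdf" out.txt'

# shlex.split() applies POSIX-style quoting rules, so the quoted
# file name becomes one argument instead of six.
args = shlex.split(command)
print(args)

# Backslash-escaped names are parsed the same way.
escaped = shlex.split(r'pdftotext My\ File\ With\ Spaces\ \&\ Stuff.pdf out.txt')
print(escaped)
```

The resulting list can be handed to Popen() or call() as-is, with no further escaping.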
I understand; I think I was headed towards subprocess.Popen() either
way. It seems to handle the problem I posted about. And I got to learn
a little something on the way. Thanks!
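A minimal sketch of why the list form sidesteps the escaping problem, assuming a reasonably recent Python 3 (the text= argument needs 3.7 or later). The child process here is just the Python interpreter echoing its first argument, standing in for the real program; the file name is the one from the original post.

```python
import subprocess
import sys

# File name with spaces and a shell metacharacter, left unescaped on purpose.
filename = "My File With Spaces & Stuff"

# Stand-in child program: prints its first command-line argument.
child_code = "import sys; print(sys.argv[1])"

# Passing a list with the default shell=False means no shell parses the
# arguments, so the file name reaches the child as one argument.
output = subprocess.check_output(
    [sys.executable, "-c", child_code, filename],
    text=True,
)
received = output.strip()
print(received)
```

If a shell really is needed, shlex.quote() (Python 3.3 and later) can escape a single argument for a POSIX shell, but the list form avoids the question entirely.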
Only now there's a new problem in that the output of the program is
different if I run it from Popen than if I run it from the command line.
The program in question is 'pdftotext'. More investigation to ensue.
Thanks again for the helpful post.