Security implications of using open() on untrusted strings.

Jorgen Grahn grahn+nntp at
Mon Nov 24 21:00:38 CET 2008

On Mon, 24 Nov 2008 00:44:45 -0500, r0g < at> wrote:
> Hi there,
> I'm trying to validate some user input which is for the most part simple
> regexery however I would like to check filenames and I would like this
> code to be multiplatform.
> I had hoped the os module would have a function that would tell me if a
> proposed filename would be valid on the host system but it seems not. I
> have considered whitelisting but it seems a bit unfair to make the rest
> of the world suffer the naming restrictions of windows. Moreover it
> seems both inelegant and hard work to research the valid file/directory
> naming conventions of every platform that this app could conceivably run
> on and write regex's for all of them so...
> I'm tempted to go the witch dunking route, stick it in an open() between
> a Try: & Except: and see if it floats. However...
> Although it's a desktop (not internet facing) app I'm a little squeamish
> piping raw user input into a filesystem function like that and this app
> will be dealing with some particularly sensitive data so I want to be
> careful and minimize exposure where practical.

Take the Unix 'ls' command (or MS-DOS 'dir').  That's two programs
which let users pipe raw input into the filesystem functions, and they
certainly have handled some very sensitive data over the years.

> Has programming PHP and Web stuff for years made me overly paranoid
> about this [...]

Yes. ;-)

Please explain one thing: what are you looking for?  It's not
"accesses a file outside the user's home directory", "accesses an
infinite file like /dev/zero" or something like that, or you would
have said so.  Nor does the "user" input seem to come from some user
other than the one your program is running as, or from some input
source which the user cannot be held responsible for.

Seems to me you simply want to know beforehand that the reading will
work.  But you can never check that!  You can stat(2) the file, or
open-and-close it -- and then a microsecond later, someone deletes the
file, or replaces it with another one, or write-protects it, or mounts
a file system on top of its directory, or drops a nuke over the city,
or ...
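
A minimal sketch of the "just try it" approach, assuming Python 3.
The helper name read_user_file is made up for illustration.  It does
no check beforehand: it attempts the open and reports whatever failure
the OS or Python hands back.

```python
import os
import tempfile


def read_user_file(path):
    """Try to read `path`.

    Returns (data, None) on success, or (None, reason) on failure.
    Hypothetical helper, not part of any library.
    """
    try:
        with open(path, "rb") as f:
            return f.read(), None
    except (OSError, ValueError) as exc:
        # OSError covers missing files, permission problems and names
        # the platform rejects.  ValueError covers names Python itself
        # refuses, such as one containing a NUL character.
        return None, str(exc)


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "notes.txt")
    with open(good, "wb") as f:
        f.write(b"hello")
    print(read_user_file(good))
    print(read_user_file(os.path.join(tmp, "no-such-file")))
```

The read itself can still fail halfway through, so the same error
handling would have to wrap any later I/O on the file as well.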

Two more notes:

- open() is not like os.system. If open() ends up doing
  anything other than trying to open the file corresponding to the
  string you feed it, it's Python's fault, not yours.

  Compare with a language (does Perl allow this?) where if the string
  is "rm -rf /|", open will run "rm -rf /" and start reading its output.
  *That* interface would have been 

- if the OS ends up doing something different when calling open(2) or
  creat(2) or whatever using that string, it's the OS's fault, not yours.
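
To illustrate the first note, here is a small sketch, assuming
Python 3, that feeds open() a string shaped like a shell pipeline.
The helper name treated_as_missing_file and the harmless string
"echo pwned|" are invented for the example.  open() should simply
look for a file with that literal name and fail to find it.

```python
import os
import tempfile


def treated_as_missing_file(name):
    """Return True if open() fails with OSError for `name`.

    Hypothetical helper.  The name is looked up inside a fresh, empty
    temporary directory, so no file by that name can exist there.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, name)
        try:
            open(path).close()
        except OSError:
            # No command was run; the string was only ever a filename
            # (or, on some platforms, an invalid one).
            return True
        return False


print(treated_as_missing_file("echo pwned|"))
```

Nothing gets executed here: the trailing "|" is just another character
in the name as far as open() is concerned.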

Or am I missing something?


  // Jorgen Grahn <grahn@        Ph'nglui mglw'nafh Cthulhu
\X/>          R'lyeh wgah'nagl fhtagn!
