[Python-Dev] Improved tmpfile module

Greg Ward gward@python.net
Tue, 25 Jun 2002 11:08:16 -0400


On 24 June 2002, Zack Weinberg said:
> Attached please find a rewritten and improved tmpfile.py.  The major
> change is to make the temporary file names significantly harder to
> predict.  This foils denial-of-service attacks, where a hostile
> program floods /tmp with files named @12345.NNNN to prevent process
> 12345 from creating any temp files.  It also makes the race condition
> inherent in tmpfile.mktemp() somewhat harder to exploit.

Oh, good!  I've long wished that there was a tmpfile module written by
someone who understands the security issues involved in generating
temporary filenames and files.  I hope you do... ;-)

> (fd, name) = mkstemp(suffix="", binary=1): Creates a temporary file,
> returning both an OS-level file descriptor open on it and its name.
> This is useful in situations where you need to know the name of the
> temporary file, but can't risk the race in mktemp.

+1 except for the name.  What does the "s" stand for?  Unfortunately, I
can't think of a more descriptive name offhand.

> name = mkdtemp(suffix=""): Creates a temporary directory, without
> race.

How about calling this one mktempdir() ?

> file = NamedTemporaryFile(mode='w+b', bufsize=-1, suffix=""): This is
> just the non-POSIX version of tmpfile.TemporaryFile() made available
> on all platforms, and with the .path attribute documented.  It
> provides a convenient way to get a temporary file with a name, that
> will be automatically deleted on close, and with a high-level file
> object associated with it.

I've scanned your code and the existing tempfile.py.  I don't understand
why you rearranged things.  Please explain why your arrangement of
_TemporaryFileWrapper/TemporaryFile/NamedTemporaryFile is better than
what we have.

A few minor comments on the code...

> if os.name == 'nt':
>     _template = '~%s~'
> elif os.name in ('mac', 'riscos'):
>     _template = 'Python-Tmp-%s'
> else:
>     _template = 'pyt%s' # better ideas?

Why reveal the implementation language of the application creating these
temporary names?  More importantly, why do it certain platforms, but not
others?

> ### Recommended, user-visible interfaces.
> 
> _text_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
> if os.name == 'posix':
>     _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL

Why not just "_bin_openflags = _text_openflags" ?  That clarifies their
equality on Unix.

> else:
>     _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL | os.O_BINARY

Why not "_bin_openflags = _text_openflags | os.O_BINARY" ?

> def mkstemp(suffix="", binary=1):
>     """Function to create a named temporary file, with 'suffix' for
>     its suffix.  Returns an OS-level handle to the file and the name,
>     as a tuple.  If 'binary' is 1, the file is opened in binary mode,
>     otherwise text mode (if this is a meaningful concept for the
>     operating system in use).  In any case, the file is readable and
>     writable only by the creating user, and executable by no one."""

"Function to" is redundant.  That docstring should probably look
something like this:

    """Create a named temporary file.

    Create a named temporary file with 'suffix' for its suffix.  Return
    a tuple (fd, name) where 'fd' is an OS-level handle to the file, and
    'name' is the complete path to the file.  If 'binary' is true, the
    file is opened in binary mode, otherwise text mode (if this is a
    meaningful concept for the operating system in use).  In any case,
    the file is readable and writable only by the creating user, and
    executable by no one (on platforms where that makes sense).
    """

Hmmm: if suffix == ".bat", the file is executable on some platforms.
That last sentence still needs work.

>     if binary: flags = _bin_openflags
>     else: flags = _text_openflags

I dunno if the Python coding standards dictate this, but I prefer

    if binary:
        flags = _bin_openflags
    else:
        flags = _text_openflags


> class _TemporaryFileWrapper:
>     """Temporary file wrapper
> 
>     This class provides a wrapper around files opened for temporary use.
>     In particular, it seeks to automatically remove the file when it is
>     no longer needed.
>     """

Here's where I started getting confused.  I don't dispute that the
existing code could stand some rearrangement, but I don't understand why
you did it the way you did.  Please clarify!

> ### Deprecated, user-visible interfaces.
> 
> def mktemp(suffix=""):
>     """User-callable function to return a unique temporary file name."""
>     while 1:
>         name = _candidate_name(suffix)
>         if not os.path.exists(name):
>             return name

The docstring for mktemp() should state *why* it's bad to use this
function -- otherwise people will say, "oh, this looks like it does what
I need" and use it in ignorance.  So should the library reference
manual.

Overall I'm +1 on the idea of improving tempfile with an eye to
security.  +0 on implementation, mainly because I don't understand how
your arrangement of TemporaryFile and friends is better than what we
have.

        Greg
-- 
Greg Ward - geek                                        gward@python.net
http://starship.python.net/~gward/
What the hell, go ahead and put all your eggs in one basket.