Safely modify a file in place -- am I doing it right?

Chris Torek nospam at torek.net
Wed Jun 29 16:31:29 EDT 2011


In article <4e0b6383$0$29996$c3e8da3$5496439d at news.astraweb.com>
 <steve+comp.lang.python at pearwood.info> wrote:
>I have a script running under Python 2.5 that needs to modify files in
>place. I want to do this with some level of assurance that I won't lose
>data. ... I have come up with this approach:

[create temp file in suitable directory, write new data, and
use os.rename() to atomically swap out the old file for the
new]

As Grant Edwards said, this is the right general idea.  There
are lots of variations.  If you want to make the original
be a backup, the sequence:

    os.link(original_name, backup_name)
    os.rename(new_synced_file, original_name)

should generally do the trick (rename will unlink the target
which means that the backup name will refer to the original
inode).
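
A minimal sketch of that two-step sequence, wrapped in a function.
The name replace_keeping_backup is made up for illustration, and the
sketch assumes backup_name does not already exist (os.link() raises
OSError if it does) and that all three names are on one filesystem:

```python
import os

def replace_keeping_backup(new_synced_file, original_name, backup_name):
    # Give the original inode a second name.  Raises OSError if
    # backup_name already exists or is on a different filesystem.
    os.link(original_name, backup_name)
    # Atomically repoint original_name at the new file.  The old inode
    # stays reachable through backup_name.
    os.rename(new_synced_file, original_name)
```

If the rename fails, the original is still in place and backup_name
is just an extra link to it, which can be removed.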

>import os, tempfile
>def safe_modify(filename):
>    fp = open(filename, 'r')
>    data = modify(fp.read())
>    fp.close()
>    # Use a temporary file.
>    loc = os.path.dirname(filename)
>    fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
>    # In my real code, I need a proper Python file object, 
>    # not just a file descriptor.
>    outfile = os.fdopen(fd, 'w')
>    outfile.write(data)
>    outfile.close()

It is a good idea to call outfile.flush() and then
os.fsync(outfile.fileno()) before doing the close, as well.  Among
other things, this *usually* gets you some kind of notice-of-failure
in the case of deferred writes across a network (e.g., NFS).  (While
it would be nice for os.close() to deliver failure notices, in
practice the fsync() is at least sometimes required.  This is the
OS's fault, not Python's. :-) )
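
Roughly, the write step with the flush and fsync added might look
like the following.  The helper name write_synced_temp is
hypothetical, and try/finally is used rather than "with" only so the
sketch does not depend on a newer Python:

```python
import os
import tempfile

def write_synced_temp(data, directory):
    fd, tmpname = tempfile.mkstemp(dir=directory, text=True)
    outfile = os.fdopen(fd, 'w')
    try:
        outfile.write(data)
        # Push Python's own buffer down to the OS ...
        outfile.flush()
        # ... then ask the OS to commit it to storage.  A deferred
        # write error normally shows up here as an OSError.
        os.fsync(outfile.fileno())
    finally:
        outfile.close()
    return tmpname
```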

>    # Move the temp file over the original.
>    os.rename(tmpname, filename)
>
>os.rename is an atomic operation, at least under Linux and Mac,
>so if the move fails, the original file should be untouched.
>
>This seems to work for me, but is this the right way to do it?
>Is there a better/safer way?

For additional checking and cleanup purposes, you may want to catch
exceptions and delete the temporary file if the rename has not yet
been done (and therefore the original file is still intact).
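
One way the cleanup could be arranged is sketched below.  This is a
reworking of the quoted safe_modify(), not a definitive version; the
modify function is passed in as a parameter here purely as an
assumption, since the original does not show where it comes from:

```python
import os
import tempfile

def safe_modify(filename, modify):
    fp = open(filename, 'r')
    try:
        data = modify(fp.read())
    finally:
        fp.close()
    loc = os.path.dirname(filename) or '.'
    fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
    try:
        outfile = os.fdopen(fd, 'w')
        try:
            outfile.write(data)
            outfile.flush()
            os.fsync(outfile.fileno())
        finally:
            outfile.close()
        # Move the temp file over the original.
        os.rename(tmpname, filename)
    except:
        # The rename did not happen, so the original is intact;
        # remove the temporary file and re-raise.
        try:
            os.unlink(tmpname)
        except OSError:
            pass
        raise
```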

You will likely also need to fiddle with the permission bits
on the file resulting from the mkstemp() call (mkstemp() creates
it readable and writable only by the owner) to make them match
those on the original file.  Alternatively, you may want to build
your own mkstemp() (this can be a bit of a challenge!).
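
For the simpler of those two options, copying the permission bits
might be sketched like this.  The name mkstemp_like is invented for
the example, and it copies only the mode bits; ownership, ACLs and
the like are not handled:

```python
import os
import stat
import tempfile

def mkstemp_like(original_name):
    loc = os.path.dirname(original_name) or '.'
    fd, tmpname = tempfile.mkstemp(dir=loc, text=True)
    try:
        # Copy just the permission bits from the original file.
        mode = stat.S_IMODE(os.stat(original_name).st_mode)
        os.chmod(tmpname, mode)
    except OSError:
        # Do not leave a stray temp file behind on failure.
        os.close(fd)
        os.unlink(tmpname)
        raise
    return fd, tmpname
```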

Finally, as I implied above in talking about the os.link()-then-
os.rename() sequence, if the original file has multiple links to
it, note that this "breaks the links".  If this is not what you
want, the problem has no fully general solution (but there are
various application-specific solutions).
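
If you at least want to notice the situation before replacing the
file, the link count from os.stat() is one place to look.  A small
sketch (has_other_links is a hypothetical helper, and what to do when
it returns True is up to the application):

```python
import os

def has_other_links(filename):
    # st_nlink counts the directory entries that refer to this inode;
    # more than one means a rename-over will "break the links".
    return os.stat(filename).st_nlink > 1
```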
-- 
In-Real-Life: Chris Torek, Wind River Systems
Intel require I note that my opinions are not those of WRS or Intel
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html


