[Python-ideas] [Python-Dev] Ext4 data loss

Steven D'Aprano steve at pearwood.info
Fri Mar 13 02:18:55 CET 2009

On Thu, 12 Mar 2009 12:26:40 pm zooko wrote:
> > Would there be interest in a filetools module? Replies and
> > discussion to python-ideas please.
> I've been using and maintaining a few filesystem hacks for, let's
> see, almost nine years now:
> http://allmydata.org/trac/pyutil/browser/pyutil/pyutil/fileutil.py
> (The first version of that was probably written by Greg Smith in
> about 1999.)
> I'm sure there are many other such packages.  A couple of quick
> searches of pypi turned up these two:
> http://pypi.python.org/pypi/Pythonutils
> http://pypi.python.org/pypi/fs
> I wonder if any of them have the sort of functionality you're
> thinking of.

Close, but not quite.

I'm suggesting a module with a collection of subclasses of file that 
exhibit modified behaviour. For example:

class FlushOnWrite(file):
    def write(self, data):
        super(FlushOnWrite, self).write(data)
        self.flush()  # push the buffered data out after every write
    # similarly for writelines

class SyncOnWrite(FlushOnWrite):
    # ...

class SyncOnClose(file):
    # ...

plus functions which implement common idioms for safely writing data, 
making backups on a save, etc. A common idiom for safely over-writing a 
file while minimising the window of opportunity for file loss is:

write to a temporary file and close it
move the original to a backup location
move the temporary file to where the original was
if no errors, delete the backup

although when I say "common" what I really mean is that it should be 
common, but probably isn't :-/ It's the sort of file handling that is 
complicated and tedious to get right, so most developers don't bother, 
and those that do are re-inventing the wheel.
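
The four-step idiom above can be sketched roughly as follows. This is only 
an illustration, not a finished implementation: the function name 
safe_overwrite and the backup_suffix parameter are made up for the example. 
It assumes the temporary file can live in the same directory as the 
original and that no stale backup is already sitting at the backup 
location.

```python
import os
import tempfile


def safe_overwrite(path, data, backup_suffix=".bak"):
    """Replace the contents of `path` with the string `data`.

    Hypothetical helper: the name and the backup_suffix parameter are
    assumptions made for this sketch.
    """
    directory = os.path.dirname(os.path.abspath(path))
    backup_path = path + backup_suffix

    # Step 1: write to a temporary file and close it. Creating it in the
    # same directory keeps the later rename on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to commit the data to disk
    except Exception:
        os.remove(tmp_path)
        raise

    # Step 2: move the original (if there is one) to a backup location.
    had_original = os.path.exists(path)
    if had_original:
        os.rename(path, backup_path)

    # Step 3: move the temporary file to where the original was.
    try:
        os.rename(tmp_path, path)
    except Exception:
        # Something went wrong, so put the original back before re-raising.
        if had_original:
            os.rename(backup_path, path)
        raise

    # Step 4: no errors, so delete the backup.
    if had_original:
        os.remove(backup_path)
```

If step 3 fails, the original is restored from the backup, so at every 
point at least one complete copy of the data exists under a known name.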

There are a couple of recipes in the Python Cookbook which might be 
candidates. E.g. the first edition has recipes "Versioning Filenames" 
by Robin Parmar and "Module: Versioned Backups" by Mitch Chapman.

What I DON'T mean is pathname utilities. Nor do I mean mini-applications 
that operate on files, like renaming file extensions, deleting files 
that meet some criterion, etc. I don't think they belong in the 
standard library, and even if they do, they don't belong in this 
proposed module.

My intention is to offer a standard set of tools so people can choose 
the behaviour that suits their application best, rather than trying to 
make file() a one-size-fits-all solution.
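
To make the choice of behaviours concrete, here is a minimal sketch of the 
three classes named earlier. It is written as wrappers around open() 
rather than as subclasses of file, purely so that it also runs on Python 
versions that have no file builtin. The helper names _FileWrapper and 
_after_write are invented for the example, and the real module might look 
quite different.

```python
import os


class _FileWrapper(object):
    """Hypothetical base class: holds an ordinary file opened with open()."""

    def __init__(self, path, mode="w"):
        self._file = open(path, mode)

    def _after_write(self):
        # Hook called after every write; the subclasses decide what it does.
        pass

    def write(self, data):
        result = self._file.write(data)
        self._after_write()
        return result

    def writelines(self, lines):
        self._file.writelines(lines)
        self._after_write()

    def close(self):
        self._file.close()


class FlushOnWrite(_FileWrapper):
    def _after_write(self):
        # Push Python's buffer out to the operating system.
        self._file.flush()


class SyncOnWrite(FlushOnWrite):
    def _after_write(self):
        # Flush first, then ask the OS to commit the data to disk.
        super(SyncOnWrite, self)._after_write()
        os.fsync(self._file.fileno())


class SyncOnClose(_FileWrapper):
    def close(self):
        # Writes are buffered as usual; sync once, just before closing.
        if not self._file.closed:
            self._file.flush()
            os.fsync(self._file.fileno())
        self._file.close()
```

An application that cares more about speed than durability would pick 
SyncOnClose, while one that cannot afford to lose a single record would 
pay the cost of SyncOnWrite.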

Steven D'Aprano
