[Python-Dev] Ext4 data loss

A.M. Kuchling amk at amk.ca
Wed Mar 11 03:14:51 CET 2009


On Wed, Mar 11, 2009 at 11:31:52AM +1100, Cameron Simpson wrote:
> On 10Mar2009 18:09, A.M. Kuchling <amk at amk.ca> wrote:
> | The mailbox module tries to be careful and always fsync() before
> | closing files, because mail messages are pretty important.
> 
> Can it be turned off? I hadn't realised this.

No, there's no way to turn it off (well, you could delete 'fsync' from
the os module).

> | The tarfile, zipfile, and gzip/bzip2 classes don't seem to use fsync()
> | at all, either implicitly or by having methods for calling them.
> | Should they?  What about cookielib.CookieJar?
> 
> I think they should not do this implicitly. By all means let a user
> issue policy.

The problem is that in some cases the user can't issue policy.  For
example, look at dumbdbm._commit().  It renames a file to a backup,
opens a new file object, writes to it, and closes it.  A caller can't
fsync() because the file object is created, used, and closed
internally.  With zipfile, you could at least access the .fp attribute
to sync it (though is .fp documented as part of the interface?).
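
For the zipfile case, one possible sketch that avoids relying on .fp is
to hand ZipFile a file object the caller already owns, then flush and
fsync that object once the archive has been closed.  This assumes a
Python version where ZipFile accepts an open file object and leaves it
open on close(); the helper name write_zip_synced is made up for
illustration and is not part of the standard library.

```python
import os
import zipfile

def write_zip_synced(path, members):
    """Write members (a dict of name -> bytes) to a zip file and fsync it.

    Hypothetical helper, not part of the standard library.
    """
    with open(path, "wb") as f:
        # ZipFile does not close a file object that was passed in,
        # so f is still open after the inner with-block finishes.
        with zipfile.ZipFile(f, "w") as zf:
            for name, data in members.items():
                zf.writestr(name, data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push its cache to the device
```

The sync happens after the ZipFile is closed because the central
directory is only written at close time; syncing earlier would miss it.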

In other words, do we need to ensure that all the relevant library
modules expose an interface to allow requesting a sync, or getting the
file descriptor in order to sync it?
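
As a rough sketch of what "an interface to allow requesting a sync"
might look like for code that creates and closes its file internally,
here is a function with an opt-in flag.  The function name commit, the
sync parameter, and the ".bak" suffix are all assumptions for
illustration; this is loosely modelled on the rename/write/close
sequence described above and is not dumbdbm's actual code.

```python
import os

def commit(path, lines, sync=False):
    """Rewrite path from lines, keeping the old copy as path + '.bak'.

    Hypothetical function; the sync flag is the illustrative part.
    """
    backup = path + ".bak"
    if os.path.exists(path):
        os.replace(path, backup)   # keep the previous version as a backup
    with open(path, "w") as f:
        f.writelines(lines)
        if sync:
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # caller asked for the data to reach the disk
```

With the flag defaulting to False, nothing is synced implicitly, and a
caller who cares can write commit(path, lines, sync=True).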

--amk

