[Python-Dev] Re: RELEASED Python 2.3.1

"Martin v. Löwis" martin at v.loewis.de
Sat Sep 27 13:48:48 EDT 2003


Jeremy Hylton wrote:

> I suppose ZODB is such an expert application.  It has to cope with
> systems that do not provide fsync(), but it provides degraded service on
> such platforms.  It is very important for the database to call fsync()
> when it commits a transaction.

I mostly agree: ZODB is indeed an advanced application, and it is indeed
a good idea to check for the presence of os.fsync before using it.
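
A minimal sketch of such a check, assuming a hypothetical helper named
sync_to_disk (this is not ZODB's actual code): it flushes Python's own
buffers and calls os.fsync only where the platform provides it.

```python
import os

def sync_to_disk(f):
    """Flush f and fsync it if os.fsync exists on this platform.

    Returns True if fsync was called, False if it is unavailable.
    (Hypothetical helper for illustration only.)
    """
    f.flush()                          # push Python-level buffers to the OS
    fsync = getattr(os, "fsync", None)
    if fsync is None:
        return False                   # degraded service: no durability request
    fsync(f.fileno())                  # ask the OS to write the data to disk
    return True
```

The return value lets the caller know whether it is running in the
degraded mode Jeremy describes.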

While this is OT, I'd still like to question the usefulness of fsync(2)
in the first place, for applications like ZODB. I assume fsync is used
as a write barrier, to make sure old modifications are on disk, before
making new changes. There are several cases in which this might be relevant:
1. The application will crash soon after performing fsync, leaving data
    potentially in an inconsistent state. Here, using fsync is not
    necessary, as the system will still perform all modifications on
    disk, even though the process has long terminated.
2. The application will be kill(2)ed soon after fsync completes (or
    even while fsync completes). As in case 1, fsync is not needed.
3. There is an operating system crash (kernel panic or similar).
    fsync does not help, as, for a buggy kernel, anything might have
    happened to the data before.
4. There is a disk failure. fsync does not help, as the data on disk
    might not be recoverable.
5. There is a power outage. This is the case where fsync should help:
    everything up to the write barrier is on disk. Of course, if the
    disk drive itself has write caching, fsync may have completed
    without the data being on the disk (this would be an fsync bug, but
    I believe Linux suffers from this particular bug).

So in short, fsync(2) helps only in case of a power outage; for normal
operation, it is not needed. In the case of a power outage, it is
doubtful whether it has the desired effect.
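
For reference, the write-barrier usage I am assuming looks roughly like
the sketch below. The function name commit, the marker bytes, and the
append-only layout are all assumptions made for illustration; I am not
claiming this is what ZODB does.

```python
import os

def commit(path, payload, marker=b"COMMIT\n"):
    """Append payload, then a commit marker, with fsync as the barrier.

    Hypothetical sketch: names and file layout are illustrative only.
    """
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()
        if hasattr(os, "fsync"):
            os.fsync(f.fileno())   # barrier: payload should reach disk first
        f.write(marker)            # only now record the transaction as committed
        f.flush()
        if hasattr(os, "fsync"):
            os.fsync(f.fileno())   # request that the marker reach disk as well
```

The intent is that a reader who finds the marker may assume the payload
before it is complete; whether that holds after a power outage depends on
the caveats in case 5.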

Slightly more on-topic: os.fsync is even worse, as it cannot be
used to implement a write barrier in case of multiple threads. It
runs with the GIL released, so while fsync is running, other
threads might change the file. The semantics of fsync in this case
are pretty unclear, but it is likely not a write barrier.
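
One way an application might work around this, sketched under the
assumption that every writer in the process goes through the same object,
is to hold its own lock around both the writes and the fsync call. The
class LockedLog and its method names are hypothetical.

```python
import os
import threading

class LockedLog:
    """Serialize writes and fsync with one lock (illustrative sketch)."""

    def __init__(self, path):
        self._f = open(path, "ab")
        self._lock = threading.Lock()

    def append(self, data):
        with self._lock:               # blocked while a barrier is in progress
            self._f.write(data)

    def barrier(self):
        with self._lock:               # no append can run during the fsync
            self._f.flush()
            if hasattr(os, "fsync"):
                os.fsync(self._f.fileno())

    def close(self):
        with self._lock:
            self._f.close()
```

This only constrains threads that use the same LockedLog instance; it
does nothing about other processes or other file objects opened on the
same file.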

Regards,
Martin




