[Python-Dev] Re: RELEASED Python 2.3.1
tim.one at comcast.net
Sat Sep 27 22:08:31 EDT 2003
[Martin v. Löwis]
> I mostly agree: ZODB is indeed advanced, and it is indeed a good idea
> to check for presence of os.fsync before using it.
> While this is OT, I'd still like to question the usefulness of
> fsync(2) in the first place, for applications like ZODB. I assume
> fsync is used as a write barrier, to make sure old modifications are
> on disk, before making new changes.
That's important, but not primarily for the catastrophic error-recovery
scenarios you go on to sketch.
The most important error-recovery procedure is preventative, running backups
against a ZODB database while the database is active (Tools/repozo.py in a
recent ZODB distribution is the right tool for this). Since a ZODB process
may run for months, it's not practical to say that backups require shutting
the database down first.
Without both flushing and fsync'ing, the backup process can't get at all the
data that's "really" in the file. Here's a little Python driver:
import os
fp = file('test.dat', 'wb')
guts = 'x' * 1000000
n = 0
while True:
    fp.write(guts)
    fp.flush()
    os.fsync(fp.fileno())
    n += len(guts)
    proceed = raw_input("wrote %d so far; hit key" % n)
At least on Windows, both the flush and the fsync are necessary to see one
million bytes (via a different process) at the first prompt, two million at
the second, and so on. With neither, another process typically sees 0 bytes
before the file gets huge. With just one of them, it seems hard to predict,
ranging from 0 to "almost" a million additional bytes per prompt.
ZODB does a flush and an fsync after each transaction, so that the backup
script (or any other distinct process) sees the most recent data available.
Besides missing large gobs of newer data, without the fsync the backup
script may see incomplete data for the most recent transaction that managed
to wind up on disk, and erroneously conclude that the database is corrupted.
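The commit discipline described above can be sketched in a few lines. This is
illustrative only, not ZODB's actual API: the `commit` helper and the file
layout are made up for the example, and the `hasattr` guard reflects the
earlier point that not every platform exposes os.fsync.

```python
import os
import tempfile

def commit(fp, record):
    # Append the transaction record, then force it to stable storage,
    # so a distinct process (e.g. a backup script) sees a complete record.
    fp.write(record)
    fp.flush()                   # push Python's buffer out to the OS
    if hasattr(os, 'fsync'):     # guard: fsync isn't available everywhere
        os.fsync(fp.fileno())    # push the OS buffers out to the disk

# Usage: append two "transactions", fsync'ing after each.
path = os.path.join(tempfile.mkdtemp(), 'test.fs')
fp = open(path, 'ab')
commit(fp, b'record one\n')
commit(fp, b'record two\n')
fp.close()
```

The flush alone only empties Python's userspace buffer; the fsync is what
acts as the write barrier, ensuring the bytes are on disk before the next
transaction starts.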
In short, fsync is necessary to support ZODB best practice.