Also sorry for the duplicate, Steven.
On Thu, Dec 24, 2020 at 12:15:08PM -0500, Michael A. Smith wrote:
> With all the buffering that modern disks and filesystems do, a
> specific question has come up a few times with respect to whether or
> not data was actually written after flush. I think it would be pretty
> useful for the standard library to have a variant in the io module
> that would explicitly fsync on close.
One argument against this idea is that "disks and file systems buffer
for a reason, you should trust them, explicitly calling sync after every
written file is just going to slow I/O down".
Personally I don't believe this argument; I've been bitten many, many
times until I learned to explicitly sync files, but it's an argument you
playing devil's advocate. I can see this argument came up in some other
branches on this thread. I'll address it there.
Another argument is that even syncing your data doesn't mean that the
data is actually written to disk, since the hardware can lie. On the
other hand, I don't know what anyone can do, not even the kernel, in the
face of deceitful hardware.
Death and taxes. The best we could do here is address it in the documentation as something else to be aware of.
That would be fine with me.
> 3. There are many ways to do this, and I think several of them could
> be subtly incorrect.
Can you elaborate?
I mean, the obvious way is:
with open(..., 'w') as f:
I think this
illustrates my point. Aside from os.sync syncing more than the user
might expect, I believe the sync here happens after `f` is closed. IIUC,
it's safest to fsync between flush and close.
We'd also probably want to fdatasync() instead of fsync or os.sync if we can.
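To make the ordering concrete, here is a minimal sketch of what I mean by syncing between flush and close. The helper name write_durably is hypothetical, and falling back to os.fsync where os.fdatasync is missing is just my assumption about how one might handle platforms that lack it:

```python
import os


def write_durably(path, data):
    """Hypothetical helper: write text, then flush and sync before close."""
    with open(path, "w") as f:
        f.write(data)
        # Push Python's userspace buffer down to the OS first.
        f.flush()
        # Prefer fdatasync where the platform provides it (assumption:
        # fall back to fsync elsewhere), and sync only this one file
        # rather than everything, as os.sync() would.
        sync = getattr(os, "fdatasync", os.fsync)
        sync(f.fileno())
    # The file is closed only here, after the sync has returned.
```

Even this only asks the kernel to push the data out; it can't do anything about hardware that lies.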
I think that since open is already a context manager, putting another
context manager in play is not as desirable as another keyword argument