transaction-like file operations
holger krekel
pyth at devel.trillke.net
Thu Aug 8 12:05:44 EDT 2002
Gerson Kurz wrote:
> I am working on a python program that will run on an embedded
> ppc-linux system. If, during a file write operation (to flash memory),
> the machine is powered off, the file gets corrupted, and is lost on
> the next reboot. (I cannot prevent the user from powering off the
> machine - there is no display, nor any indication if the system is
> right now writing to disk or not. Its an embedded system, after all).
>
> So, what I need is a transaction-like file operation, that allows me
> to either write the file completely, or keep the old file (so that at
> every point there is at least one set of data available).
>
> I have made a "homegrown" solution that includes writing to a backup
> file first, then doing two rename operations. I wonder if there exists
> a standard class for transaction-like file operations in python?
> [Note: a database is not an option]
You might like to check out reiserfs, as it has database-like functionality
and some extra guarantees over POSIX.
I think you can get the behaviour you want with *standard python*,
without requiring any special modules. Python maps some important POSIX
system calls into the 'os' module. Read the man pages of rename and
fsync very carefully. I think you could *roughly* do:

    import os

    def update_file(updater, path):
        tmppath = path + '.vfs_transaction'
        # the builtin open() refuses directories, so get a raw descriptor
        newdir = os.open(os.path.dirname(path) or '.', os.O_RDONLY)
        newfile = open(tmppath, 'w')
        updater(newfile)
        newfile.flush()             # python's buffer -> OS
        os.fsync(newfile.fileno())  # OS cache -> device
        newfile.close()
        os.rename(tmppath, path)  # posix guarantees atomicity!
        os.fsync(newdir)          # persists updates of meta-data?!
        os.close(newdir)

    class Writer:
        count = 0
        def __call__(self, file):
            file.write(str(self.count) * 10000)
            self.count += 1

    writer = Writer()
    update_file(writer, '/tmp/txtest.test')

The critical part is between the rename and the fsync that follows it
in 'update_file'. The rename is atomic but not guaranteed to be
'persistent' at once. The fsync on (the meta-data of) newdir should
help, but I don't know this for sure. This probably depends on the
filesystem implementation and hard-disk caching. You could ask people
on the reiserfs-list (and report back, please :-).
Anyway, I recommend dedicating a machine to some testing.
Run 10 processes looping over the above 'transactions'
and turn the power off after some minutes. See if everything is as
consistent as you expect. Best to do it on the target system :-)
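To make "as consistent as you expect" checkable after the reboot, each
file could carry a checksum of its own contents. What follows is only a
rough sketch in present-day Python on a POSIX system. The record format
(hex SHA-256, newline, payload) and the names atomic_write, make_record,
is_consistent and stress are made up for illustration, and atomic_write
merely restates the temp-file/fsync/rename idea from above.

```python
import hashlib
import os


def atomic_write(path, data):
    """Write bytes to path via temp file, fsync, rename, directory fsync."""
    tmppath = path + ".vfs_transaction"
    with open(tmppath, "wb") as f:
        f.write(data)
        f.flush()               # Python's buffer -> OS
        os.fsync(f.fileno())    # OS cache -> device
    os.rename(tmppath, path)    # atomic replacement on POSIX
    dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dirfd)         # ask for the directory entry to be persisted
    finally:
        os.close(dirfd)


def make_record(counter):
    """Hypothetical record format: hex SHA-256 of payload, newline, payload."""
    payload = (str(counter) * 10000).encode("ascii")
    digest = hashlib.sha256(payload).hexdigest().encode("ascii")
    return digest + b"\n" + payload


def is_consistent(path):
    """True if path holds one complete record, False if torn or missing."""
    try:
        with open(path, "rb") as f:
            digest, sep, payload = f.read().partition(b"\n")
    except FileNotFoundError:
        return False
    expected = hashlib.sha256(payload).hexdigest().encode("ascii")
    return sep == b"\n" and digest == expected


def stress(path, iterations):
    """Rewrite path over and over; the real test would loop until power-off."""
    for counter in range(iterations):
        atomic_write(path, make_record(counter))
```

For the actual experiment you would start stress() in each of the 10
processes with its own path and a very large iteration count, cut the
power, and after the reboot call is_consistent() on every path. A file
that is missing or fails the check is the case to investigate.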
Of course, there are some issues which need further thought
and discussion...
have fun,
holger