
Hello, I'm new here, so forgive me if this has been discussed before or is off-topic. I came up with a mechanism that I thought might be useful in the Python standard library -- a scope-bound self-restoring backup file. I came to this naïve implementation; -- class BackupError(Exception): pass class Backup: def __init__(self, path): if not os.path.exists(path) or os.path.isdir(path): raise BackupError("%s must be a valid file path" % path) self.path = path self.backup_path = None def __enter__(self): self.backup() def __exit__(self, type, value, traceback): self.restore() def _generate_backup_path(self): tempdir = tempfile.mkdtemp() basename = os.path.basename(self.path) return os.path.join(tempdir, basename) def backup(self): backup_path = self._generate_backup_path() shutil.copy(self.path, backup_path) self.backup_path = backup_path def restore(self): if self.backup_path: # Write backup back onto original shutil.copy(self.backup_path, self.path) shutil.rmtree(os.path.dirname(self.backup_path)) self.backup_path = None -- Backups are intended to be scope-bound like so: with Backup(settings_file): rewrite_settings(settings_file) do_something_else() I even managed to use it with the @contextmanager attribute, to allow this: with rewrite_settings(settings_file): do_something_else() So, open questions; - Would something like this be useful outside of my office? - Any suggestions for better names? - This feels like it belongs in the tempfile module, would you agree? - What's lacking in the implementation? Have I done something decidedly non-Pythonic? Thanks, - Kim

On 2012-06-25, at 14:17 , Kim Gräsman wrote:
- Would something like this be useful outside of my office?
I'm not sure I correctly understand the purpose of this, and if I do it seems to be kind-of a hack for "fixing" kind-of crummy code: is it correct that the goal is to temporarily edit a file (and restore it later) to change the behavior of *other* pieces of code reading the same file? So essentially dynamically scoping the content of a file? I find the idea rather troublesome/problematic, as it's completely blind to (and unsafe under) concurrent access, and will be tricky to handle cleanly wrt filesystem caches and commits. The initial mail hinted at atomic file replacement *or* backuping a file and restoring the backup on error, something along the lines of: with Backup(settings_file): alter_file() alter_file_2() alter_file_3() # altered file with Backup(settings_file): alter_file() alter_file_2() raise Exception("boom") alter_file_3() # old file is back in the same way e.g. Emacs will keep "~" files around during edition. That could have been a ~+1 for me, but the behavior as I understood it (understanding which may be incorrect, again) I'd be −1 on, it seems too dangerous and too tied to other issues in the code.

On Mon, Jun 25, 2012 at 8:17 AM, Kim Gräsman <kim@mvps.org> wrote:
I like the basic idea, but if we do something like this, it would be useful to have read access to the old version of the file while you are writing out the new version that might become permanent. If I was to implement something like this, I'd use a "right a temporary file then copy it overwriting the old one when I'm done" approach rather than a "back up the file" approach so that if the process dies for a reason Python can't clean up after (like due to SIGKILL), the half-written file doesn't remain. I don't really like the name Backup but I can't think of a better name at the moment. Mike

Am 25.06.2012 14:17, schrieb Kim Gräsman:
Are you aiming for atomic file rollover backed by a temporary file? That's the common way to safely overwrite an existing file. It works differently than your code. * Create a temporary file with O_CREAT | O_EXCL in the same directory as the file you like to replace * Write data to new file * Call sync() on the file as well as fdatasync() and fsync() on the file descriptor * close the file * use atomic rename to replace the old file with the new file (IIRC won't work atomically on Windows) I've some code laying around somewhere that implements a RolloverFile similar to tempfile.NamedTemporaryFile. Christian

On 6/25/2012 8:17 AM, Kim Gräsman wrote:
It seems to me that what you actually *want* to do, given your other responses, is to make a temporary altered copy of the settings file and get the programs to use the *copy*. That way, other users would see the original undistrubed and a crash would at worst leave the copy undeleted. (Whether you want to copy alterations back is a different matter.) I presume the problem is that the program has the name of the settings file hard-coded. One possibility might be to run the program in a virtual environment with its temporary copy. (But I have 0 experience with that. I only know that venv has been added to 3.3.) -- Terry Jan Reedy

Hi Terry, and all, On Mon, Jun 25, 2012 at 10:33 PM, Terry Reedy <tjreedy@udel.edu> wrote:
Thanks for all your alternative strategies! In this case, the third party is a combination of Python, shell script, and executable binaries in at least three different processes, and I'm pretty happy with the modify-do work-restore model for this batch script. I appreciate the input on the suggested idea, it gave me some new error modes to worry about, even if most of them don't apply for this specific case. - Kim

On 2012-06-25, at 14:17 , Kim Gräsman wrote:
- Would something like this be useful outside of my office?
I'm not sure I correctly understand the purpose of this, and if I do it seems to be kind-of a hack for "fixing" kind-of crummy code: is it correct that the goal is to temporarily edit a file (and restore it later) to change the behavior of *other* pieces of code reading the same file? So essentially dynamically scoping the content of a file? I find the idea rather troublesome/problematic, as it's completely blind to (and unsafe under) concurrent access, and will be tricky to handle cleanly wrt filesystem caches and commits. The initial mail hinted at atomic file replacement *or* backuping a file and restoring the backup on error, something along the lines of: with Backup(settings_file): alter_file() alter_file_2() alter_file_3() # altered file with Backup(settings_file): alter_file() alter_file_2() raise Exception("boom") alter_file_3() # old file is back in the same way e.g. Emacs will keep "~" files around during edition. That could have been a ~+1 for me, but the behavior as I understood it (understanding which may be incorrect, again) I'd be −1 on, it seems too dangerous and too tied to other issues in the code.

On Mon, Jun 25, 2012 at 8:17 AM, Kim Gräsman <kim@mvps.org> wrote:
I like the basic idea, but if we do something like this, it would be useful to have read access to the old version of the file while you are writing out the new version that might become permanent. If I was to implement something like this, I'd use a "right a temporary file then copy it overwriting the old one when I'm done" approach rather than a "back up the file" approach so that if the process dies for a reason Python can't clean up after (like due to SIGKILL), the half-written file doesn't remain. I don't really like the name Backup but I can't think of a better name at the moment. Mike

Am 25.06.2012 14:17, schrieb Kim Gräsman:
Are you aiming for atomic file rollover backed by a temporary file? That's the common way to safely overwrite an existing file. It works differently than your code. * Create a temporary file with O_CREAT | O_EXCL in the same directory as the file you like to replace * Write data to new file * Call sync() on the file as well as fdatasync() and fsync() on the file descriptor * close the file * use atomic rename to replace the old file with the new file (IIRC won't work atomically on Windows) I've some code laying around somewhere that implements a RolloverFile similar to tempfile.NamedTemporaryFile. Christian

On 6/25/2012 8:17 AM, Kim Gräsman wrote:
It seems to me that what you actually *want* to do, given your other responses, is to make a temporary altered copy of the settings file and get the programs to use the *copy*. That way, other users would see the original undistrubed and a crash would at worst leave the copy undeleted. (Whether you want to copy alterations back is a different matter.) I presume the problem is that the program has the name of the settings file hard-coded. One possibility might be to run the program in a virtual environment with its temporary copy. (But I have 0 experience with that. I only know that venv has been added to 3.3.) -- Terry Jan Reedy

Hi Terry, and all, On Mon, Jun 25, 2012 at 10:33 PM, Terry Reedy <tjreedy@udel.edu> wrote:
Thanks for all your alternative strategies! In this case, the third party is a combination of Python, shell script, and executable binaries in at least three different processes, and I'm pretty happy with the modify-do work-restore model for this batch script. I appreciate the input on the suggested idea, it gave me some new error modes to worry about, even if most of them don't apply for this specific case. - Kim
participants (7)
-
Christian Heimes
-
Christopher Reay
-
Kim Gräsman
-
Masklinn
-
Mike Graham
-
Sturla Molden
-
Terry Reedy