Saving a file "in the background" -- How?
4kir4.1i at gmail.com
Fri Oct 31 13:07:42 CET 2014
Virgil Stokes <vs at it.uu.se> writes:
> While running a python program I need to save some of the data that is
> being created. I would like to save the data to a file on a disk
> according to a periodical schedule (e.g. every 10
> minutes). Initially, the amount of data is small (< 1 MB) but after
> sometime the amount of data can be >10MB. If a problem occurs during
> data creation, then the user should be able to start over from the
> last successfully saved data.
> For my particular application, no other file is being saved and the
> data should always replace (not be appended to) the previous data
> saved. It is important that the data be saved without any obvious
> distraction to the user who is busy creating more data. That is, I
> would like to save the data "in the background".
> What is a good method to perform this task using Python 2.7.8 on a
> Win32 platform?
There are several requirements:
- save data asynchronously -- "without any obvious distraction to the
  user"
- save data durably -- avoid corrupting previously saved data or
  writing only partial new data, e.g., in case of a power failure
- do it periodically -- handle drift/overlap gracefully in a documented
  way
A simple way to do asynchronous I/O on Python 2.7.8 on a Win32 platform
is to use threads:
  import threading
  t = threading.Thread(target=backup_periodically, kwargs=dict(period=600))
  t.daemon = True # stop if the program exits
  t.start()
where backup_periodically() backs up the data every *period* seconds:
  import logging, time
  def backup_periodically(period, timer=time.time, sleep=time.sleep):
      start = timer()
      while True:
          try:
              backup()
          except Exception: # log exceptions and continue
              logging.exception("backup failed")
          # lock with the timer
          sleep(period - (timer() - start) % period)
To keep the backup times from drifting, the sleep is locked to the
timer using the modulo operation. If backup() takes longer than *period*
seconds (unlikely for 10MB per 10 minutes), then the missed step is
skipped and the next backup runs at the following multiple of *period*.
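To see how the modulo keeps the wake-ups on a fixed grid, here is a small
sketch that isolates the sleep computation. The helper name next_delay()
and the sample times are made up for illustration; only the formula comes
from the loop itself.

```python
def next_delay(period, start, now):
    """Seconds to sleep so that the next wake-up falls on start + k*period."""
    return period - (now - start) % period


period = 600    # 10 minutes
start = 1000.0  # arbitrary timer() value taken when the loop started

# backup() took 2 seconds: sleep for the remaining 598 seconds,
# so the next backup starts at start + 600
fast = next_delay(period, start, now=start + 2)

# backup() took 650 seconds (longer than one period): the wake-up at
# start + 600 has already passed, so sleep 550 seconds until start + 1200
slow = next_delay(period, start, now=start + 650)

print(fast, slow)
```

Because the delay is always computed from the original start time, a slow
backup() shortens the following sleep instead of pushing every later
backup back.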
backup() makes sure that the data is saved and can be restored at any
time:
  with atomic_open('backup', 'w') as file:
      file.write(data) # 'data' is a placeholder for whatever must be saved
where atomic_open() tries to overcome multiple issues with saving
a file reliably:
- write to a temporary file so that the old data is always available
- rename the file when all new data is written, handle cases such as:
  * "antivirus opens old file thus preventing me from replacing it"
Either the operation succeeds and 'backup' contains the new data, or it
fails and 'backup' contains the untouched, ready-to-restore old data --
nothing in between.
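The write-then-rename idea can be sketched as follows. This is not the
atomicfile.py implementation; the name atomic_open_sketch() is made up,
and the Python 2 branch on Windows is knowingly *not* atomic (os.rename()
refuses to overwrite an existing file there, so a real solution would
probably need a Win32 call such as MoveFileEx). It also does not retry
when another process, e.g. an antivirus, holds the old file open.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_open_sketch(path, mode="w"):
    """Write to a temporary file, then rename it over *path* on success."""
    dirname = os.path.dirname(os.path.abspath(path))
    # create the temporary file next to the target so the rename
    # never has to cross volumes
    fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, mode) as f:
            yield f
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        if hasattr(os, "replace"):  # Python 3.3+
            os.replace(tmp_path, path)
        else:  # Python 2
            if os.name == "nt" and os.path.exists(path):
                os.remove(path)  # NOT atomic: briefly no file exists
            os.rename(tmp_path, path)
    except BaseException:
        # failure: drop the partial temporary file, keep the old data
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# demo in a throwaway directory
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "backup")

with atomic_open_sketch(demo_path) as f:
    f.write("old data")

try:
    with atomic_open_sketch(demo_path) as f:
        f.write("partial new da")
        raise RuntimeError("simulated crash while writing")
except RuntimeError:
    pass

with open(demo_path) as f:
    survived = f.read()  # the first write is still intact
```

The failed second write leaves 'backup' with the content of the first
one, and the temporary file is removed.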
I don't know how ready atomicfile.py is, but you should be aware of the
issues it is trying to solve if you want a reliable backup solution.