Saving a file "in the background" -- How?

Akira Li 4kir4.1i at
Fri Oct 31 13:07:42 CET 2014

Virgil Stokes <vs at> writes:

> While running a python program I need to save some of the data that is
> being created. I would like to save the data to a file on a disk
> according to a periodical schedule  (e.g. every 10
> minutes). Initially, the amount of data is small (< 1 MB) but after
> sometime the amount of data can be >10MB. If a problem occurs during
> data creation, then the user should be able to start over from the
> last successfully saved data.
> For my particular application, no other file is being saved and the
> data should always replace (not be appended to) the previous data
> saved. It is important that  the data be saved without any obvious
> distraction to the user who is busy creating more data. That is, I
> would like to save the data "in the background".
> What is a good method to perform this task using Python 2.7.8 on a
> Win32 platform?

There are several requirements:

- save data asynchroniously -- "without any obvious distraction to the
- save data durably -- avoid corrupting previously saved data or
  writing only partial new data e.g., in case of a power failure
- do it periodically -- handle drift/overlap gracefully in a documented

A simple way to do asynchronios I/O on Python 2.7.8 on a Win32 platform
is to use threads:

  t = threading.Thread(target=backup_periodically, kwargs=dict(period=600))
  t.daemon = True # stop if the program exits

where backup_periodically() backups data every period seconds: 

  import time
  def backup_periodically(period, timer=time.time, sleep=time.sleep):
      start = timer()
      while True:
          except Exception: # log exceptions and continue
          # lock with the timer 
          sleep(period - (timer() - start) % period) 

To avoid drift over time of backup times, the sleep is locked with the
timer using the modulo operation. If backup() takes longer than *period*
seconds (unlikely for 10MB per 10 minutes) then the step may be

backup() makes sure that the data is saved and can be restore at any

  def backup():
      with atomic_open('backup', 'w') as file: 

where atomic_open() [1] tries to overcome multiple issues with saving
data reliably:

- write to a temporary file so that the old data is always available
- rename the file when all new data is written, handle cases such as:
  * "antivirus opens old file thus preventing me from replacing it"

either the operation succeeds and 'backup' contains new data or it fails
and 'backup' contains untouched ready-to-restore old data -- nothing in


I don't know how ready but you should be aware of the
issues it is trying to solve if you want a reliable backup solution.


More information about the Python-list mailing list