[Tutor] Running Python Scripts at same time
Cameron Simpson
cs at cskk.id.au
Sat Jun 27 19:05:39 EDT 2020
On 26Jun2020 11:21, John Weller <john at johnweller.co.uk> wrote:
>I have a Python program which will be running 24/7 (I hope 😊). It is
>generating data in a file which I want to clean up overnight. The way
>I am looking at doing it is to run a separate program as a Cron job at
>midnight – will that work? The alternative is to add it to the loop
>and check for the time. I have tried researching this but only got even
>more confused.
Running a separate program is perfectly reasonable.
And crontab is a perfect place for a regular task like this.
The primary issue usually is that you do not want both programmes to be
using the file at the same time.
Suppose the file is, say, a CSV file to which your long-running
programme (A) appends data, and that the clean up programme (B) reads the
CSV file, tidies some stuff, and rewrites the CSV file. You can imagine
this sequence:
- programme B opens the file and reads the data
- programme B thinks about the data to clean it
- programme A appends more data to the file
- programme B rewrites the clean data into the file,
_overwriting_ the new data programme A just appended
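The sequence above can be replayed deterministically in a single script. This is only a sketch: the file name "foo.csv", its contents, and the "cleaning" step (dropping the first row) are made up for illustration, and the two programmes are simulated by interleaved steps rather than by real separate processes.

```python
import os
import tempfile

# Hypothetical data file in a temporary directory, for illustration only.
tmpdir = tempfile.mkdtemp()
datafilepath = os.path.join(tmpdir, "foo.csv")

with open(datafilepath, "w") as f:
    f.write("1,old\n2,old\n")

# Programme B opens the file and reads the data.
with open(datafilepath) as f:
    rows = f.readlines()

# Programme B thinks about the data to clean it (here: drop the first row).
cleaned = rows[1:]

# Meanwhile programme A appends more data to the file.
with open(datafilepath, "a") as f:
    f.write("3,new\n")

# Programme B rewrites the file from its now stale copy.
with open(datafilepath, "w") as f:
    f.writelines(cleaned)

with open(datafilepath) as f:
    final_rows = f.readlines()

# The "3,new" row appended by programme A is no longer in the file.
print(final_rows)
```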
The usual process with a shared external file is to use a lock facility.
These come in a few forms, and it is essential that both programme A and
programme B use the same locking system.
One of the easiest and most portable is to make a lock file while you
work with the file. If your data file is called "foo" you might use a
lock file called "foo.lock".
On a UNIX type system (includes Linux) you can atomically make such a file
like this:
import os
.......
lockpath = datafilepath + '.lock'
lockfd = os.open(lockpath, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0)
That is a special mode of the OS "open" call (_not_ Python's default
"open" builtin) whose parameters have the following meanings:
- os.O_CREAT: create the file if missing
- os.O_EXCL: ensure that the file is created - if it already exists
the open fails (Python raises FileExistsError, a subclass of OSError)
- os.O_RDWR: open the file for read and write
- 0: the initial permissions, ensuring that the file is _not_
readable or writable
See "man 2 open" on a UNIX system for the spec.
The combination of O_RDWR and 0 permissions means that if the file
already exists (made by the "other" programme) then it won't have any
permissions, which means we won't get read or write access and the open
will fail. The nice thing about this is that the initial permissions are
_immediate_ when the file is created by the OS - there's no tiny window
where the file has read/write perms which then get removed - the OS
ensures it. This is nice on networked file shares (if they are
reliable).
Anyway, the upshot of the os.open() call above is that if the lockfile
already exists, the open will fail, and otherwise it will succeed,
preventing another programme doing the same thing.
When finished, close the lockfd and remove the lock file:
os.close(lockfd)
os.remove(lockpath)
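To see that behaviour for yourself, here is a small sketch that takes the lock, tries to take it a second time, then releases it and takes it again. The temporary directory and the data file name "foo" are placeholders for illustration; the os.open() flags are the ones shown above.

```python
import os
import tempfile

# Hypothetical location for the data file; only the lock file gets created.
tmpdir = tempfile.mkdtemp()
datafilepath = os.path.join(tmpdir, "foo")
lockpath = datafilepath + ".lock"
flags = os.O_CREAT | os.O_EXCL | os.O_RDWR

# First attempt: the lock file does not exist yet, so this succeeds.
lockfd = os.open(lockpath, flags, 0)

# Second attempt while the lock file exists: O_EXCL makes the open fail.
try:
    otherfd = os.open(lockpath, flags, 0)
except FileExistsError:
    second_attempt_failed = True
else:
    second_attempt_failed = False
    os.close(otherfd)

# Release the lock: close the descriptor and remove the lock file.
os.close(lockfd)
os.remove(lockpath)

# With the lock file gone, a fresh attempt succeeds again.
lockfd = os.open(lockpath, flags, 0)
reacquired = True
os.close(lockfd)
os.remove(lockpath)

print(second_attempt_failed, reacquired)
```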
Now, because the whole scenario is that occasionally both programmes want
the file at the same time, the os.open _will_ fail in that case. So the
idea is that you repeat it until it succeeds, then do your work:
import time
.......
while True:
    try:
        lockfd = os.open(lockpath, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0)
    except OSError:
        # The other programme holds the lock: wait, then try again.
        print("lock not obtained, sleeping")
        time.sleep(1)
    else:
        break
.... work with the data file ...
os.close(lockfd)
os.remove(lockpath)
Put that logic in both programmes and you should be ok.
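One way to share that logic between both programmes is to wrap it in a context manager, so that the lock file is removed even if the work raises an exception. This is a sketch under assumptions, not the "makelockfile" function mentioned below: the name lockfile() and the poll_interval parameter are invented here for illustration.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def lockfile(datafilepath, poll_interval=1.0):
    """Hold datafilepath + '.lock' for the duration of a with-block.

    Hypothetical helper: the name and poll_interval are assumptions.
    """
    lockpath = datafilepath + ".lock"
    while True:
        try:
            lockfd = os.open(lockpath, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0)
        except FileExistsError:
            # The other programme holds the lock: wait, then try again.
            time.sleep(poll_interval)
        else:
            break
    try:
        yield lockpath
    finally:
        # Runs on normal exit and when the body raises an exception.
        os.close(lockfd)
        os.remove(lockpath)


# Usage with a hypothetical data file in a temporary directory.
datafilepath = os.path.join(tempfile.mkdtemp(), "foo")
with lockfile(datafilepath) as lockpath:
    held_during_work = os.path.exists(lockpath)
    with open(datafilepath, "a") as f:
        f.write("new,row\n")
released_after_work = not os.path.exists(lockpath)
```

The try/finally covers ordinary exceptions, but not a process that is killed outright; in that case the lock file stays behind and has to be removed by hand.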
You can see a more elaborate version of this logic in my "makelockfile"
function here:
https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/fileutils.py#lines-527
(Atlassian are going to nuke that repo soon, alas, because they find
mercurial too hard. But until then the link should be good.)
Cheers,
Cameron Simpson <cs at cskk.id.au>