Parallel(?) programming with python
Cameron Simpson
cs at cskk.id.au
Mon Aug 8 22:30:53 EDT 2022
On 09Aug2022 00:22, Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
>On Mon, 8 Aug 2022 at 19:01, Andreas Croci <andrea.croci at gmx.de> wrote:
>> Basically the question boils down to whether it is possible to have parts
>> of a program (could be functions) that keep doing their job while other
>> parts do something else on the same data, and what is the best way to do
>> this.
Which is of course feasible, as others have outlined.
>Why do these "parts of a program" need to be part of the *same*
>program? I would write this as just two separate programs. One
>collects the data and writes it to a file. The other periodically
>reads the file and computes the DFT.
I would also write these as separate programmes, or at least as distinct
modes of the same programme (eg "myprog poll" and "myprog archive" etc).
That is largely because you might run the "poll" mode regularly and
briefly, and the processing phase separately and less frequently. You
don't need to keep a
single programme lurking around forever - fire it up as required.
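A rough sketch of the "distinct modes" idea, assuming argparse subcommands. The mode names "poll" and "archive" are the ones from my example; the function bodies and their return values are hypothetical placeholders, not anything from the original poster's code.

```python
import argparse


def poll(args):
    # Placeholder (assumption): collect new samples and write them out.
    return "polled"


def archive(args):
    # Placeholder (assumption): process or archive the completed data files.
    return "archived"


def build_parser():
    parser = argparse.ArgumentParser(prog="myprog")
    # One subcommand per mode; a mode must be given on the command line.
    sub = parser.add_subparsers(dest="mode", required=True)
    sub.add_parser("poll", help="collect data briefly").set_defaults(func=poll)
    sub.add_parser("archive", help="process collected data").set_defaults(func=archive)
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:].
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Calling main() from the usual `if __name__ == "__main__":` guard would then let cron (or whatever you use) run "myprog poll" often and "myprog archive" rarely.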
However, I want to point out that this _in no way_ removes the need for
access control and mutexes. It will change the mechanism (because your
two programmes are now operating separately) and make it more concrete
in your mind what _actually and precisely_ needs protection.
For example, you probably want to avoid _processing_ a data file at the
same time as _updating_ that file. Depending on what you're doing this
can be as simple as keeping "to be updated" files with distinct names
from "available to be processed/archived" files. This is a standard
difficulty with "hot folder" upload areas.
A common approach might be to write a file with a "temp" style name (eg
".tmp*") until completed, then rename it to its official name (eg
"datafile*"). And then your processing/archiving side can simply ignore
the "in progress" files because they do not match the names it cares
about.
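A minimal sketch of that write-then-rename pattern, assuming both names live in the same directory (so the rename stays on one filesystem). The ".tmp-" prefix, the "datafile" prefix and the function names are illustrative choices only.

```python
import os


def write_datafile(directory, name, data):
    """Write data under a temporary name, then rename it into place."""
    final_path = os.path.join(directory, name)
    tmp_path = os.path.join(directory, ".tmp-" + name)
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    # os.replace is atomic on POSIX when both paths are on one filesystem,
    # so a reader sees either no "datafile*" entry or a complete one.
    os.replace(tmp_path, final_path)
    return final_path


def ready_files(directory):
    """List only completed files; in-progress ".tmp-*" names never match."""
    return sorted(
        name for name in os.listdir(directory) if name.startswith("datafile")
    )
```

The processing side only ever calls something like ready_files(), so it never opens a half-written file.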
Anyway, those are specifics, which will be driven by what you're
actually doing. The point is that you still need to coordinate use of
the files suitably for your needs. Doing this in one long running
programme with Threads/mutexes or separate programmes sharing a data
directory just changes the mechanisms.
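For comparison, a sketch of the single-programme variant, assuming a plain threading.Lock around a shared list. SampleBuffer and collect are hypothetical names for illustration; the real shared data will look different.

```python
import threading


class SampleBuffer:
    """Shared data guarded by a mutex (hypothetical example class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._samples = []

    def append(self, value):
        # The collecting side holds the lock only while it mutates the list.
        with self._lock:
            self._samples.append(value)

    def snapshot(self):
        # The processing side copies under the lock, then works on the copy
        # without blocking the collector.
        with self._lock:
            return list(self._samples)


def collect(buffer, count):
    # Stand-in for a polling thread: append `count` samples.
    for i in range(count):
        buffer.append(i)
```

The lock here plays the same role as the temp-name-then-rename convention: the processing side only ever sees a consistent state.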
Cheers,
Cameron Simpson <cs at cskk.id.au>