Threads vs. processes, what to consider in choosing?
philip at semanchuk.com
Tue Feb 17 18:08:29 CET 2009
On Feb 17, 2009, at 10:18 AM, Barak, Ron wrote:
> I have a wxPython application that builds an internal database from
> a list of files and then displays various aspects of that data,
> in response to user requests.
> I want to add a module that finds events in a set of log files.
> These log files are potentially huge, and the initial processing is
> lengthy (several minutes).
> Thus, when the user chooses LogManager, it would be unacceptable
> to block the other parts of the program, so the initial LogManager
> processing would need to be done separately from the normal run of
> the program.
> Once the initial processing is done, the main program would be
> notified and could display the results of LogManager processing.
> I was thinking of using either threads or separate processes
> for the main program and LogManager.
> What would you suggest I consider in choosing between the two
> options?
> Are there other options besides threads and multi-processing?
The general rule is that it is a lot easier to share data between
threads than between processes. The multiprocessing library makes the
latter easier but is only part of the standard library in Python >=
2.6. The design of your application matters a lot. For instance, will
the processing code write its results to a database, ping the GUI code
and then exit, allowing the GUI to read the database? That sounds like
an excellent setup for processes.
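
A minimal sketch of that setup, under stated assumptions: the worker is launched with the standard subprocess module (so it does not depend on multiprocessing), the database is SQLite, an "event" is any line containing "ERROR", and the worker's exit stands in for the ping to the GUI. The names WORKER_SOURCE, start_worker, read_events and demo are made up for illustration.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

# Hypothetical worker code. In a real application this would live in its
# own script file; it is inlined here only to keep the sketch self-contained.
WORKER_SOURCE = r'''
import sqlite3, sys

db_path, log_paths = sys.argv[1], sys.argv[2:]
conn = sqlite3.connect(db_path)
conn.execute(
    "CREATE TABLE IF NOT EXISTS events (path TEXT, line_no INTEGER, text TEXT)")
for path in log_paths:
    with open(path) as log:
        for line_no, line in enumerate(log, 1):
            if "ERROR" in line:  # stand-in for the real event detection
                conn.execute("INSERT INTO events VALUES (?, ?, ?)",
                             (path, line_no, line.rstrip()))
conn.commit()
conn.close()
'''


def start_worker(db_path, log_paths):
    """Start the scan in a separate process and return without waiting."""
    cmd = [sys.executable, "-c", WORKER_SOURCE, db_path] + list(log_paths)
    return subprocess.Popen(cmd)


def read_events(db_path):
    """Read back what the worker stored (the GUI side of the hand-off)."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT path, line_no, text FROM events ORDER BY path, line_no"
        ).fetchall()
    finally:
        conn.close()


def demo():
    workdir = tempfile.mkdtemp()
    log_path = os.path.join(workdir, "app.log")
    with open(log_path, "w") as log:
        log.write("INFO start\nERROR disk full\nINFO done\n")
    db_path = os.path.join(workdir, "events.db")

    worker = start_worker(db_path, [log_path])
    # A GUI would call worker.poll() from a timer instead of blocking here.
    exit_code = worker.wait()
    return exit_code, read_events(db_path)
```

The only things the two processes share are the database file and the worker's exit status, which is why this kind of design suits processes well.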
In addition, there's the GIL to consider. Multi-process applications
aren't affected by it, while multi-threaded applications may be. Now
that multi-processor/multi-core machines are more common, this
difference matters more than ever. Torrents of words have been written
about the GIL on this list and elsewhere and I have nothing useful to
add to the torrents. I encourage you to read some of those.
FWIW, when I was faced with a similar setup, I went with multiple
processes rather than threads.
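
For comparison, here is a rough sketch of the thread-based alternative, which shows why sharing data between threads is easier: the worker hands a plain Python list back to the main thread through a queue, with no database or serialization in between. The names scan_logs and demo, and the "ERROR" test, are assumptions made for illustration. In a wxPython program the main thread would not block on the queue; it would more likely poll it from a timer, or the worker would pass its results to wx.CallAfter.

```python
import queue
import threading


def scan_logs(lines, results):
    """Worker thread: collect (line number, text) for each matching line."""
    events = [(line_no, text)
              for line_no, text in enumerate(lines, 1)
              if "ERROR" in text]  # stand-in for the real event detection
    results.put(events)            # queue.Queue is safe to use across threads


def demo():
    results = queue.Queue()
    lines = ["INFO start", "ERROR disk full", "INFO done"]

    worker = threading.Thread(target=scan_logs, args=(lines, results))
    worker.daemon = True  # do not keep the program alive for the worker
    worker.start()

    # Blocking here keeps the sketch short; a GUI would use get_nowait()
    # from a timer so the event loop keeps running.
    return results.get(timeout=10)
```

Because of the GIL, a CPU-bound scan in the worker thread may still compete with the GUI thread for the interpreter, which is one reason to prefer the process-based design.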
Last but not least, since you asked about alternatives to threads and
multiprocessing, I'll point you to some low level libraries I wrote
for doing interprocess communication: