Recommendations in terms of threading, multi-threading and/or asynchronous processes/programming? - Sent Mail - Mozilla Thunderbird
jacob kruger
jacob.kruger.work at gmail.com
Sun Jan 8 06:49:38 EST 2023
Ok, the specific use case right now is that I need to set up a process
that pulls the contents of e-mail messages from an IMAP mail server and
populates them into a postgresql database. Since this is the inbox of a
relatively large-scale CRM/support system, there are currently over 2.5
million e-mails in the inbox, and it can grow by over 50000 per day.
I already have the basic process operating, using imap_tools, but
wanted to be able to query the process at run-time, without needing to
either check logs or query the database itself while it is running -
even if this is just for the initial population period, since later on
I will just set up the code to run as a form of cron job, or have it
handle time-based repeats itself on a separate machine.
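A minimal sketch of the run-time querying idea, assuming a single worker
thread and a small lock-protected counter object that the main thread
reads. The names Progress and import_messages are hypothetical, and the
sleep call only stands in for the real imap_tools fetch and postgresql
insert:

```python
import threading
import time


class Progress:
    """Counters a worker updates and the main thread can read at any time."""

    def __init__(self, total):
        self._lock = threading.Lock()
        self.total = total
        self.done = 0
        self.failed = 0

    def record(self, ok=True):
        # Called by the worker after each message.
        with self._lock:
            if ok:
                self.done += 1
            else:
                self.failed += 1

    def snapshot(self):
        # Called by the main thread; returns a consistent copy.
        with self._lock:
            return {"total": self.total, "done": self.done, "failed": self.failed}


def import_messages(uids, progress):
    # Hypothetical stand-in for fetching each message and inserting it.
    for uid in uids:
        time.sleep(0.001)  # simulates network and database I/O
        progress.record(ok=True)


if __name__ == "__main__":
    uids = list(range(200))
    progress = Progress(total=len(uids))
    worker = threading.Thread(target=import_messages, args=(uids, progress))
    worker.start()
    while worker.is_alive():
        print(progress.snapshot())   # query the running process
        worker.join(timeout=0.05)    # wait briefly, then poll again
    print("final:", progress.snapshot())
```

The main thread stays free to handle other user interaction between
polls, and the lock keeps each snapshot consistent even while the worker
is updating the counters.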
I also wanted to offer the ability to either pause or terminate the
process while it's busy batch processing large chunks of e-mail
messages - either send a message to the thread, or set a global variable
telling it to end the run after the current item has finished, just in
case.
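One way the pause/terminate idea could look, as a sketch only: it swaps
the plain global variable for threading.Event objects, which fill the
same role but can also block a paused worker without busy-waiting. The
names Controls and process_batch are hypothetical, and the sleep call
stands in for handling one e-mail:

```python
import threading
import time


class Controls:
    """Flags the main thread sets and the worker checks between items."""

    def __init__(self):
        self.stop = threading.Event()
        self.resume = threading.Event()
        self.resume.set()  # start in the running (not paused) state

    def pause(self):
        self.resume.clear()

    def unpause(self):
        self.resume.set()

    def request_stop(self):
        self.stop.set()
        self.resume.set()  # wake a paused worker so it can exit


def process_batch(items, controls, handled):
    for item in items:
        controls.resume.wait()      # blocks here while paused
        if controls.stop.is_set():  # checked only between items
            break
        time.sleep(0.001)           # simulates processing one message
        handled.append(item)


if __name__ == "__main__":
    controls = Controls()
    handled = []
    worker = threading.Thread(
        target=process_batch, args=(range(1000), controls, handled)
    )
    worker.start()
    time.sleep(0.05)
    controls.request_stop()  # the current item still finishes
    worker.join()
    print("handled before stop:", len(handled))
```

Because the flags are only checked between items, a stop request never
interrupts a message half-way through.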
So, I think that for now, threading is probably the simplest option to
look into.
Later on, I was also considering forms of low-level monitoring for UI
elements. This is not really related to the initial task, but could
relate to forms of non-visual gaming interfaces for blind/VI
individuals - I am myself 100% blind, but that's not really relevant in
this context.
Stay well
Jacob Kruger
+2782 413 4791
"Resistance is futile...but, acceptance is versatile..."
On 2023/01/06 21:19, Chris Angelico wrote:
> On Sat, 7 Jan 2023 at 04:54, jacob kruger <jacob.kruger.work at gmail.com> wrote:
>> I am just trying to make up my mind with regards to what I should look
>> into working with/making use of in terms of what have put in subject line?
>>
>>
>> As in, if want to be able to trigger multiple/various threads/processes
>> to run in the background, possibly monitoring their states, either via
>> interface, or via global variables, but, possibly while processing other
>> forms of user interaction via the normal/main process, what would be
>> recommended?
>>
> Any. All. Whatever suits your purpose.
>
> They all have different goals, different tradeoffs. Threads are great
> for I/O bound operations; they're easy to work with (especially in
> Python), behave pretty much like just having multiple things running
> concurrently, and generally are the easiest to use. But you'll run
> into limits as your thread count climbs (with a simple test, I started
> seeing delays at about 10,000 threads, with more serious problems at
> 100,000), so it's not well-suited for huge scaling. Also, only one
> thread at a time can run Python code, which limits them to I/O-bound
> tasks like networking.
>
> Multiple processes take a lot more management. You have to carefully
> define your communication channels (for instance, a
> multiprocessing.Queue() to collect results), but they can do CPU-bound
> tasks in parallel. So multiprocessing is a good way to saturate all of
> your CPU cores. Big downsides include it being much harder to share
> information between the processes, and much MUCH higher resource usage
> than threads (with the same test as the above, I ran into limitations
> at just over 500 processes - way fewer than the 10,000 threads!).
>
> Asynchronous I/O runs a single thread in a single process. So like
> multithreading, it's only good for I/O bound tasks like networking.
> It's harder to work with, though, since you have to be very careful to
> include proper await points, and you can stall out the entire event
> loop with one mistake (common culprits being synchronous disk I/O, and
> gethostbyname). But the upside is that you get near-infinite tasks,
> basically just limited by available memory (or other resources).
>
> Use whichever one is right for your needs.
>
> ChrisA
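
For comparison, a small sketch of the asynchronous I/O points quoted
above: many lightweight tasks on one thread, each with an await point,
and a blocking call handed to a worker thread so it does not stall the
event loop. The functions fetch and blocking_lookup are hypothetical
placeholders, and asyncio.to_thread assumes Python 3.9 or later:

```python
import asyncio
import time


def blocking_lookup(n):
    # Stands in for a synchronous call such as disk I/O or gethostbyname.
    time.sleep(0.05)
    return n * 2


async def fetch(n):
    # The await point lets the event loop run other tasks in the meantime.
    await asyncio.sleep(0.01)
    return n


async def main():
    start = time.perf_counter()
    # A thousand concurrent tasks, all on a single thread.
    results = await asyncio.gather(*(fetch(n) for n in range(1000)))
    # Calling blocking_lookup(21) directly here would freeze every task;
    # running it in a thread keeps the loop responsive.
    doubled = await asyncio.to_thread(blocking_lookup, 21)
    elapsed = time.perf_counter() - start
    return len(results), doubled, elapsed


if __name__ == "__main__":
    count, doubled, elapsed = asyncio.run(main())
    print(count, doubled, round(elapsed, 2))
```

The tasks overlap their waits rather than running one after another,
which is why the total time stays far below a thousand times the
per-task delay.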