Johann Borck wrote:
[...] I think the main misunderstanding is "[..]to use the reactor's scheduling mechanism instead of running it in a thread." Twisted's reactor is not a superior multi-purpose scheduler (as JP mentioned), but a domain-specific event handler for networking. While your use-case might (that's my guess) profit from choosing 'chunking' over Python's threading, it still wouldn't profit from choosing it over the scheduling of your OS.
hm, did I get you right there?
Oh yes, you're right. The whole time I thought of the reactor as a multi-purpose scheduler, and so I didn't understand JP's answer correctly. The misunderstanding partly stems from the core documentation. Chapter 1.3 says: "This document will give you a high level overview of concurrent programming (interleaving several tasks) and of Twisted's concurrency model: non-blocking code or asynchronous code." Two examples then follow: (1) CPU-bound tasks and (2) tasks that wait for data. Because I thought the introduction applied to both types of examples, I also assumed that Twisted's concurrency model would apply to both types of tasks. Of course, I did wonder why I couldn't find any examples for (1), while the whole rest of the docs deals with (2). ;-)

Moreover, I once read Douglas Schmidt's book "Pattern-Oriented Software Architecture (2)", which describes several patterns for middleware-oriented applications, including the reactor pattern. The book left me somewhat confused about when to use which pattern and which concurrency mechanism suits a particular situation best. The (maybe wrong) conclusion I drew from it is that context-switching overhead (whether between threads or processes) is bad not only for I/O-bound tasks, but for most other concurrent tasks as well.

As you said, function calls in Python are expensive; nevertheless, I thought they were less expensive than the overhead of context switching between threads or processes, at least on a single-processor system. Or have I made a mistake here? And couldn't the creation of whole new processes be even more expensive? By "long"-running algorithms I really meant tasks that could take several minutes; processes would be perfectly fine there. But what about, for example, CPU-bound tasks that take only a few hundred milliseconds, yet would still block the reactor? Would you use processes in that case, too? Maybe prespawned processes?
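Just to check that I've understood the 'chunking' idea correctly: something like the following toy loop, which runs a couple of steps of each task per tick, so that CPU-bound work interleaves without any threads? (This is a stand-alone sketch with made-up names; in real Twisted code I suppose one would use twisted.internet.task.cooperate instead.)

```python
log = []  # records which job ran at each step, to show the interleaving

def job(name, items, out):
    """A CPU-bound task expressed as a generator: one unit of work per step."""
    for n in items:
        out.append(n * n)   # the actual computation
        log.append(name)    # bookkeeping so we can see the schedule
        yield               # hand control back to the loop

def run_interleaved(tasks, chunk=2):
    """Toy 'reactor': run `chunk` steps of each task per tick, round-robin."""
    tasks = list(tasks)
    while tasks:
        for t in tasks[:]:
            for _ in range(chunk):
                try:
                    next(t)
                except StopIteration:
                    tasks.remove(t)  # this job is finished
                    break

out_a, out_b = [], []
run_interleaved([job("a", range(4), out_a), job("b", range(4), out_b)])
# log is now ['a', 'a', 'b', 'b', 'a', 'a', 'b', 'b']: the two jobs
# interleave, and neither ever blocks the loop for long.
```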
Or should I rather use threads in such a case?

Many thanks for your enlightening reply,
Jürgen
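P.S. To make the "prespawned processes" question concrete, here is roughly what I have in mind, sketched with the standard library's concurrent.futures.ProcessPoolExecutor rather than Twisted's own process support (cpu_job is just an invented stand-in for a few hundred milliseconds of computation). In Twisted itself I suppose one would instead reach for twisted.internet.threads.deferToThread or reactor.spawnProcess.

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_job(n):
    """Stand-in for a short CPU-bound task: a bit of number crunching."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # The workers are spawned once, up front; submitting jobs afterwards
    # is cheap, so the per-task process-creation cost is amortised and
    # the event loop is never blocked by the computation itself.
    with ProcessPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(cpu_job, n) for n in (10, 100)]
        print([f.result() for f in futures])
```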