Hi. I am back. First of all, thanks for your eager participation. I would like to pick up on Steve's and Mark's examples, as they seem to be very good illustrations of the issue I still have.
Steve explained why asyncio is great and Mark explained why threading+multiprocessing is great, each from his own perspective and each focusing on the internal implementation details. To me, all approaches can now be fit into this sort of table. Please correct me if it's wrong (that is very important):
# | code lives in | managed by
--+---------------+--------------
1 | processes     | os scheduler
2 | threads       | os scheduler
3 | tasks         | event loop
But the original question still stands:
Which one to use?
Ignoring little details like 'shared state', 'custom prioritization', etc., they all look the same to me, and what it all comes down to are these nasty little details people try to explain so eagerly. I am not saying that is a bad thing, but it has some implications for production code that I do not like, and in the following I am going to explain them.
Say we have decided on approach N because of some requirements (examples from here and there, guidelines given by smart people, customer needs, etc.) and have written a hundred thousand lines of code. What if these requirements change 6 years in the future? What if the maintainer of approach N decides to change it in such a way that it is no longer compatible with our requirements? From what I can see, there is no easy way 'back' to use another approach. They all have different APIs for basically the same thing: 'executing a function and returning its precious result (the cake)'.
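To make the 'different APIs for the same thing' point concrete, here is a minimal sketch of how I understand it. The names bake, run_in_pool and run_as_task are made up for illustration, and asyncio.run assumes a recent Python 3 (3.7 or later):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def bake(n):
    """Hypothetical stand-in for 'the function whose result we want'."""
    return sum(i * i for i in range(n))


def run_in_pool(executor_cls, fn, *args):
    # Threads and processes can share one API through concurrent.futures:
    # submit the function, then ask the returned future for its result.
    with executor_cls(max_workers=1) as pool:
        return pool.submit(fn, *args).result()


async def _bake_task(n):
    # The task variant has to be written as a coroutine; the sleep(0)
    # only marks a point where the event loop could switch tasks.
    await asyncio.sleep(0)
    return bake(n)


def run_as_task(n):
    # asyncio.run creates an event loop, runs the coroutine, and closes it.
    return asyncio.run(_bake_task(n))


thread_result = run_in_pool(ThreadPoolExecutor, bake, 4)
task_result = run_as_task(4)
```

As far as I can tell, passing concurrent.futures.ProcessPoolExecutor to run_in_pool would cover row 1 of the table with the same call (on platforms that spawn workers it needs the usual `if __name__ == "__main__":` guard). The task variant is the one that forces a different shape on the calling code, which is exactly the 'no easy way back' problem.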
asyncio gives us the flexibility to choose a prioritization mechanism. Nice to have, because we are now independent of the os scheduler. But do we ever really need that? What is wrong with the os scheduler? Would that not mean that Mark had better switch to asyncio? We don't know whether we would ever need that in project A and project B. What now? Use asyncio just in case? Preemptively?
@Steve Thanks for that great explanation of how asyncio works and its relationship to threads/processes.
But I still have a question: why can't we use threads for the cakes? (1 cake = 1 thread). Not saying that asyncio would be a bad idea to use here, but couldn't we accomplish the same functionality by using threads?
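Here is a rough sketch of what I have in mind with '1 cake = 1 thread', next to a task-based version for comparison. All function names are hypothetical, and the short sleeps only stand in for waiting on the oven:

```python
import asyncio
import threading
import time


def bake_cake(name, results):
    time.sleep(0.01)  # blocking wait; the os scheduler runs other threads
    results[name] = f"{name} cake"


def bake_with_threads(names):
    # One thread per cake; each thread stores its result under its own key.
    results = {}
    threads = [
        threading.Thread(target=bake_cake, args=(name, results))
        for name in names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return [results[name] for name in names]


async def bake_cake_async(name):
    await asyncio.sleep(0.01)  # non-blocking wait; the event loop switches tasks
    return f"{name} cake"


async def _bake_all(names):
    # gather returns the results in the order the coroutines were passed in.
    return await asyncio.gather(*(bake_cake_async(name) for name in names))


def bake_with_tasks(names):
    # One task per cake, all managed by a single event loop.
    return list(asyncio.run(_bake_all(names)))
```

For a handful of cakes both versions hand back the same list, which is why they look interchangeable to me from the caller's side.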
I think, after we've settled the above questions, we should change the focus from
How do they work internally and what are the tiny differences?
(answered greatly by Mark)
to
When do I use which one?
The latter question is what actually counts for production code. It is quite interesting to know and to ponder all the differences, dependencies, corner cases, etc. However, when it comes down to 'executing a piece of code and returning its result', you end up having to decide which approach to choose. You won't implement all 3 different ways just because it is great to see all the nasty little details click into place.
On Thursday, July 9, 2015 at 11:54:11 PM UTC+1, Sven R. Kunze wrote: >
In order to make a sound decision for the question: "Which one(s) do I use?", at least the following items should be somehow defined clearly for these modules:
1) relationship between the modules
2) NON-overlapping usage scenarios
3) future development intentions
4) ease of usage of the modules => future syntax
5) examples