[concurrency] Common Concurrent Problems

Jeremy McMillan jeremy.mcmillan at gmail.com
Wed Jun 10 17:29:03 CEST 2009


I think we should sample a range of parallel problems so we can  
benchmark implementations based on some objective criteria.

Concurrency problems exist on a scale that ranges from  
"embarrassingly parallel" to "inherently sequential," and Amdahl's  
Law gives that scale a yardstick: some work cannot be computed in  
parallel, so the minimum time to complete a parallel task is  
bounded below by its largest serially computed component. That  
ideal bound lets us objectively score the "concurrency potential"  
of a given problem with Big-O notation.

http://en.wikipedia.org/wiki/Embarrassingly_parallel
http://en.wikipedia.org/wiki/Amdahl%27s%5Flaw
[notice the urlencoded title in the URL looks like Amdahl's Flaw]
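
To make the bound concrete, here's a toy calculator (my own sketch,  
names mine, not from any library):

    def amdahl_speedup(parallel_fraction, workers):
        """Upper bound on speedup per Amdahl's Law.

        parallel_fraction: share of the runtime that can be done in
        parallel (0.0 = inherently sequential, 1.0 = embarrassingly
        parallel).
        """
        serial_fraction = 1.0 - parallel_fraction
        return 1.0 / (serial_fraction + parallel_fraction / workers)

    # A task that is 95% parallelizable tops out near 20x, no matter
    # how many workers you throw at it:
    print(amdahl_speedup(0.95, 16))    # ~9.1
    print(amdahl_speedup(0.95, 1024))  # ~19.6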

In practice, computing in parallel comes with two kinds of  
overhead. First, you have to distribute the workload: set up each  
parallel execution context and deal out the work. Second, you have  
to coalesce (some of) the computed products of those parallel  
operations. Beyond raw performance, any given approach/tool/API may  
require a different level of effort to implement, read, and debug  
those two chunks - see the sketch below.
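
A minimal scatter/gather sketch of those two chunks, using the  
stdlib multiprocessing module (the squaring workload is just a  
placeholder):

    from multiprocessing import Pool

    def work(chunk):
        # placeholder per-chunk computation
        return sum(x * x for x in chunk)

    if __name__ == '__main__':
        data = range(1000000)
        chunks = [data[i::8] for i in range(8)]  # deal out the workload
        pool = Pool(processes=8)           # set up execution contexts
        partials = pool.map(work, chunks)  # distribute: scatter
        total = sum(partials)              # coalesce the products: gather
        pool.close()
        pool.join()
        print(total)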

So I guess what I'm suggesting is a three-dimensional matrix of  
benchmarks: x, the frameworks/APIs/tools; y, representative sample  
problems along the scale of concurrency potential; and z, the  
performance measurements.
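
In code, that matrix might be nothing fancier than a dict keyed by  
the first two axes (a sketch; the axis values are placeholders):

    results = {}
    for framework in ('threading', 'multiprocessing', 'twisted'):
        for problem in ('embarrassing', 'pipeline', 'sequential'):
            results[(framework, problem)] = {
                'wall_time': None,  # z: performance measurements
                'effort': None,     # implement/read/debug cost
            }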

On Jun 9, 2009, at 9:39 AM, Pete wrote:

> On Jun 8, 2009, at 5:10 PM, Paul Boddie wrote:
>
> On http://wiki.python.org/moin/Concurrency/99Bottles
>
>> even made a remark on the "99 Concurrent Bottles of Beer" page
>> which appears to have been wide of the mark - that one would
>> surely try and use operating system features in the given example
>> in order to provide a more optimal
>
> To clarify:  my objective with this page is to give potential users
> a general sense of what code using a variety of
> toolkits/libraries/techniques looks like.  The technical term for
> such a collection is a chrestomathy: "a collection of similar
> programs written in various programming languages, for the purpose
> of demonstrating differences in syntax, semantics and idioms for
> each language"
> http://en.wikipedia.org/wiki/Program_Chrestomathy
>
> AFAIC, such a collection is *not* the place for optimal solutions -
> that would be more appropriate for a benchmark (something that's
> probably worth doing as well).  Accordingly, I'd encourage submitters to
> minimize dependencies on external libraries (other than the toolkit  
> being demonstrated, obviously) and focus on clarity &  
> comprehensibility for new users.[0]
>
>> implementation - and I note that Glyph appears to regard the
>> stated problem as not really being concurrent.
>>
>> How should we extend the problem on the Wiki to be something that
>> doesn't have a workable serial solution?
>
> The particular problem (tail | grep) came out of Beaz's class and  
> was incredibly helpful for comparing generators vs. coroutines.  We  
> *should* find a problem that is actually concurrent - how about  
> tail|grep'ing multiple input files?
>
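
Just to sketch what I picture for that - one thread per file, my own  
toy code with made-up file names (and yes, it leans on the  
time.sleep() and full-line assumptions Pete footnotes below):

    import re
    import threading
    import time

    def follow(path):
        """Generator yielding lines appended to a file, tail -f style."""
        f = open(path)
        f.seek(0, 2)  # start at end of file
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.1)  # poll for new data
                continue
            yield line

    def grep_file(path, pattern):
        regex = re.compile(pattern)
        for line in follow(path):
            if regex.search(line):
                print('%s: %s' % (path, line.rstrip()))

    # one thread per input file; 'a.log' and 'b.log' are placeholders
    threads = [threading.Thread(target=grep_file, args=(path, 'ERROR'))
               for path in ('a.log', 'b.log')]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # runs until interrupted
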
>> Does anyone have any suggestions of more realistic problems, or  
>> are we back at the level of Wide Finder?
>
>
> I don't see realism as the primary goal here - we could just use  
> tail & grep after all. ;-)  That said, ideas for reasonable  
> benchmarks would be helpful - thoughts?
>
> --Pete
>
> [0] I'm -0 on the use of time.sleep() & assuming input comes in  
> full lines.


