[concurrency] Common Concurrent Problems
Jeremy McMillan
jeremy.mcmillan at gmail.com
Wed Jun 10 17:29:03 CEST 2009
I think we should sample a range of parallel problems so we can
benchmark implementations against some objective criteria.
Concurrency problems fall on a scale that ranges from
"embarrassingly parallel" to "inherently sequential" and can be
characterized by Amdahl's Law: some tasks cannot be computed in
parallel, and the minimum time to compute a parallel task is bounded
by its largest serially computed component. This ideal would let us
objectively score the "concurrency potential" of a given problem in
Big-O notation.
http://en.wikipedia.org/wiki/Embarrassingly_parallel
http://en.wikipedia.org/wiki/Amdahl%27s%5Flaw
[notice the urlencoded title in the URL looks like Amdahl's Flaw]
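A quick sketch of the bound Amdahl's Law describes: the serial fraction of a task caps the speedup no matter how many parallel workers you add. (The function name and fractions below are illustrative, not from any particular benchmark.)

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup under Amdahl's Law: the serial fraction
    (1 - parallel_fraction) bounds achievable speedup."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / workers)

# With 95% of the work parallelizable, even unlimited workers
# cannot beat a 20x speedup (1 / 0.05).
print(amdahl_speedup(0.95, 8))      # 8 workers: roughly 5.9x
print(amdahl_speedup(0.95, 10**9))  # effectively unlimited workers: ~20x
```

An "embarrassingly parallel" problem is the `parallel_fraction == 1.0` limit, where speedup scales linearly with workers; "inherently sequential" is the `0.0` limit, where it stays at 1.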
In reality, computing in parallel comes with two kinds of overhead.
First, you have to distribute the workload: set up each parallel
execution context and deal out the work. Second, you have to
coalesce (some of) the computed products of those parallel
operations. Besides raw performance, any given approach/tool/API may
require a different level of effort to implement, read, and debug
those two chunks.
So I guess what I'm suggesting is a three-dimensional matrix of
benchmarks: x, frameworks/APIs/tools; y, representative sample problems
along the scale of concurrency potential; and z, performance measurements.
On Jun 9, 2009, at 9:39 AM, Pete wrote:
> On Jun 8, 2009, at 5:10 PM, Paul Boddie wrote:
>
> On http://wiki.python.org/moin/Concurrency/99Bottles
>
>> even made a remark on the "99 Concurrent Bottles of Beer" page
>> which appears
>> to have been wide of the mark - that one would surely try and use
>> operating
>> system features in the given example in order to provide a more
>> optimal
>
> To clarify: my objective with this page is to give potential users
> a general sense of what code using a variety of toolkits/libraries/
> techniques looks like. The technical term for such a collection is
> a chrestomathy: "a collection of similar programs written in
> various programming languages, for the purpose of demonstrating
> differences in syntax, semantics and idioms for each language"
> http://en.wikipedia.org/wiki/Program_Chrestomathy
>
> AFAIC, such a collection is *not* the place for optimal solutions -
> that would be appropriate for a benchmark (something that's probably
> worth doing as well). Accordingly, I'd encourage submitters to
> minimize dependencies on external libraries (other than the toolkit
> being demonstrated, obviously) and focus on clarity &
> comprehensibility for new users.[0]
>
>> implementation - and I note that Glyph appears to regard the
>> stated problem
>> as not really being concurrent.
>>
>> How should we extend the problem on the Wiki to be something that
>> doesn't have
>> a workable serial solution?
>
> The particular problem (tail | grep) came out of Beaz's class and
> was incredibly helpful for comparing generators vs. coroutines. We
> *should* find a problem that is actually concurrent - how about
> tail|grep'ing multiple input files?
>
>> Does anyone have any suggestions of more realistic problems, or
>> are we back at the level of Wide Finder?
>
>
> I don't see realism as the primary goal here - we could just use
> tail & grep after all. ;-) That said, ideas for reasonable
> benchmarks would be helpful - thoughts?
>
> --Pete
>
> [0] I'm -0 on the use of time.sleep() & assuming input comes in
> full lines.
> _______________________________________________
> concurrency-sig mailing list
> concurrency-sig at python.org
> http://mail.python.org/mailman/listinfo/concurrency-sig