[Baypiggies] Summary of June Exploiting Parallelism talk, pre-July part-2

jim jim at well.com
Fri Jul 16 19:55:07 CEST 2010




   For the June BayPIGgies meeting, Minesh B. Amin presented the 
first of a two-part talk on Exploiting Parallelism. He will 
present the second part of his talk at the July 22 BayPIGgies 
meeting. 
   As a reminder and to whet appetites, here is Minesh's summary 
of the June BayPIGgies talk. 


-------------------------------------------


MBA Sciences, Inc (www.mbasciences.com)
Tel #: 650-938-4306
Email: mamin at mbasciences.com

Topic:   Exploiting Parallelism : A Concise and Practical Introduction
Speaker: Minesh B. Amin, Founder and CEO, MBA Sciences, Inc.

In his book "In Search of Clusters", Gregory Pfister said it best. 
To paraphrase, there are three ways to do anything faster: work 
harder, work smarter, or get help. In computer-speak, this roughly 
translates to: increase processor speed, improve algorithms, or 
exploit parallelism.
        
With processor speeds no longer doubling every eighteen months and
little or no room left for improvements in serial algorithms, 
exploiting parallelism is the one frontier with the potential for
delivering huge improvements in performance.

In the June 2010 BayPIGgies technical program, we presented our 
take on the why, what, and how of exploiting parallelism.


Preamble
--------

Bracketed by the Groucho Marx quote, "Before I speak, I have
something important to say", we started off with a rather
astonishing observation about the approximately 60-year-old
field of parallel software engineering: the lack of 
consensus on the answers to the most fundamental questions in
the field ... including ones implied by the title of the talk:
   * What do we mean by exploiting parallelism?
   * How does "exploiting parallelism" differ from 
     "parallel programming"?


"Exploiting parallelism": Why?
------------------------------

When it comes to leveraging modern computer systems, the challenge 
of our time is not making existing serial engineering talent 
proficient in parallel programming. Rather, the challenge of our 
time is to exploit parallelism in a way that leverages the existing
predominantly serial talent and serial development processes.

Stated another way ... while the very nature of hardware systems
has changed from mainly serial to mainly parallel, our supposition
is that serial software engineers will continue to play a dominant
role going forward. 

Hence, we emphasize a way of thinking about modern hardware 
systems that allows serial engineers to focus on the serial 
component of a parallel application. That is the why/what/how 
of exploiting parallelism.

In a way, exploiting parallelism is analogous to being able to 
drive a car without getting bogged down in the details of how the 
engine (the parallel component of a parallel application) works. 
The driver only needs to relate to the API of the car, which is 
close to intent, easy to relate to, and hides the details of how 
the engine works.


"Exploiting parallelism": What?
-------------------------------

Exploiting parallelism involves nothing more than the management of 
a collection of serial tasks. 

This simple fourteen-word definition can be used to analyze any 
parallel solution.
        
"Management" includes policies by which tasks are scheduled, 
premature terminations are handled, preemptive support is provided, 
communication primitives are enabled, and cores are obtained and 
released.
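
As a rough illustration of "management" (not code from the talk), 
the sketch below wraps a collection of serial tasks in a small 
manager built on Python's standard concurrent.futures module. The 
names run_managed and reciprocal are hypothetical, and a thread 
pool stands in for "cores" so that the sketch runs anywhere. It 
covers only three of the policies listed above: scheduling, 
handling premature termination, and obtaining and releasing 
workers.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def reciprocal(x):
    """A serial task; it knows nothing about how it is scheduled."""
    return 1 / x


def run_managed(task, inputs, workers=2):
    """Hypothetical manager: run task on every input, collect failures."""
    results, failures = {}, {}
    # Obtain the workers; they are released when the with-block exits.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # Scheduling policy: submit everything, let the pool assign work.
        futures = {executor.submit(task, item): item for item in inputs}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:
                # Premature termination of one task does not stop the rest.
                failures[item] = exc
    return results, failures


if __name__ == "__main__":
    results, failures = run_managed(reciprocal, [1, 2, 0, 4])
    print("finished:", sorted(results))
    print("failed:", sorted(failures))
```

The serial function stays free of any parallel code; every policy 
decision lives in the manager.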

Serial tasks come in two flavors:
   * Coarse-grained tasks ... which may not communicate prior to
     conclusion
   * Fine-grained tasks ... which may communicate prior to conclusion

Consider an example where a group of painters is assigned sections 
of a wall to paint blue. They don't need to communicate with the 
project manager before they finish ... an example of coarse-grained
parallelism.
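
The wall-painting case might be sketched as follows. This is an 
illustration under stated assumptions, not the speaker's code: 
paint_section and paint_wall are hypothetical names, and the pool 
is multiprocessing.pool.ThreadPool, chosen so the sketch runs 
without process start-up concerns. multiprocessing.Pool offers the 
same map call for process-based workers.

```python
from multiprocessing.pool import ThreadPool


def paint_section(section):
    """Coarse-grained serial task: paint one section, talk to nobody."""
    return (section, "blue")


def paint_wall(sections, painters=3):
    """Hand each section to a painter; results come back in input order."""
    with ThreadPool(processes=painters) as pool:
        # map() blocks until every section has been painted.
        return pool.map(paint_section, sections)


if __name__ == "__main__":
    for section, color in paint_wall(["north", "east", "south", "west"]):
        print(section, color)
```

No painter sends or receives anything until its section is done, 
which is what makes the tasks coarse grained.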

In contrast, when painters are assigned sections of a complex image 
to paint as a mural, they must communicate before they finish to 
ensure that the sections of the image are aligned properly ... an 
example of fine-grained parallelism.
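
The mural case could be sketched like this, again as an assumption 
rather than anything shown in the talk. Two hypothetical painters 
(threads) each announce the color at their shared edge and wait 
for the neighbor's color before finishing; the standard 
queue.Queue plays the role of the communication primitive.

```python
import queue
import threading


def paint_mural_section(name, edge_color, outbox, inbox, finished):
    """Fine-grained task: exchange edge colors with the neighbor first."""
    outbox.put(edge_color)                # announce our edge color
    neighbor_edge = inbox.get(timeout=5)  # wait for the neighbor's color
    finished[name] = {"own_edge": edge_color, "matched_to": neighbor_edge}


def paint_mural():
    """Run two painters that share one edge and return what each recorded."""
    left_to_right = queue.Queue()
    right_to_left = queue.Queue()
    finished = {}
    painters = [
        threading.Thread(
            target=paint_mural_section,
            args=("left", "teal", left_to_right, right_to_left, finished),
        ),
        threading.Thread(
            target=paint_mural_section,
            args=("right", "navy", right_to_left, left_to_right, finished),
        ),
    ]
    for painter in painters:
        painter.start()
    for painter in painters:
        painter.join()
    return finished


if __name__ == "__main__":
    print(paint_mural())
```

Neither task can conclude until it has heard from the other, which 
is the extra coordination that fine-grained tasks require.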
        
(The talk also discussed the trade-offs of authoring and 
maintaining serial components of a parallel application on 
distributed and shared memory hardware systems.)


"Exploiting parallelism": How?
------------------------------

Armed with our one definition, we discussed and reviewed several 
parallel enabling technologies, including OpenMPI, Python's 
multiprocessing module, MapReduce, OpenMP, and the parallel 
component of SPM.Python. Exchanges between the speaker and 
audience touched on basic descriptions of the solutions and 
their characteristics. 

For example, the fact that MapReduce is easy to relate to can be 
attributed to the fact that its API is close to the intent of 
the serial software engineer. On the other hand, while OpenMPI 
and OpenMP provide a rich set of communication primitives, they 
lack any notion of management ... a gap that must be filled, 
as per our definition, by the serial software engineers. But 
filling that gap requires proficiency in parallel programming. 
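
To make "close to intent" concrete, here is a minimal in-process 
sketch of the MapReduce pattern. It does not use any MapReduce 
framework's API; map_count, reduce_counts and word_count are 
hypothetical names, and a ThreadPool from the standard library 
stands in for the framework's management layer. The engineer 
writes two serial functions and nothing else.

```python
from collections import Counter
from functools import reduce
from multiprocessing.pool import ThreadPool


def map_count(line):
    """Serial map step: count the words in one line."""
    return Counter(line.lower().split())


def reduce_counts(left, right):
    """Serial reduce step: merge two partial counts."""
    return left + right


def word_count(lines, workers=2):
    """Stand-in for the framework: scatter the map step, then reduce."""
    with ThreadPool(processes=workers) as pool:
        partial_counts = pool.map(map_count, lines)
    return reduce(reduce_counts, partial_counts, Counter())


if __name__ == "__main__":
    text = ["work harder", "work smarter", "get help"]
    print(word_count(text).most_common(1))
```

Scheduling, worker acquisition and result collection are all 
hidden inside word_count, which is the part a framework would 
supply.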


"Exploiting parallelism": Q&A 
-----------------------------

The technical program closed with a spirited and engaging Q&A 
session, touching on many examples, descriptions and analogies 
covering the parallel solutions reviewed and their 
characteristics, using the definition of exploiting parallelism 
presented.


-------------------------------------------


The July 22 talk is titled "A Technical Anatomy of SPM.Python 
(A Scalable, Parallel Version of the Python Language)", by 
Minesh B. Amin.

SPM.Python, a commercial product, extends Python with a programming 
paradigm for solving parallel problems and strives to do so in a 
pythonic (natural) way by augmenting the serial Python language with 
parallel concepts like parallel task managers and communication 
primitives. 




