[Baypiggies] Summary of June Exploiting Parallelism talk, pre-July part-2
jim
jim at well.com
Fri Jul 16 19:55:07 CEST 2010
For the June BayPIGgies meeting, Minesh B. Amin presented the
first of a two-part talk on Exploiting Parallelism. He will
present the second part of his talk at the July 22 BayPIGgies
meeting.
As a reminder and to whet appetites, here is Minesh's summary
of the June BayPIGgies talk.
-------------------------------------------
MBA Sciences, Inc (www.mbasciences.com)
Tel #: 650-938-4306
Email: mamin at mbasciences.com
Topic: Exploiting Parallelism : A Concise and Practical Introduction
Speaker: Minesh B. Amin, Founder and CEO, MBA Sciences, Inc.
In his book "In Search of Clusters", Gregory Pfister said it best.
To paraphrase, there are three ways to do anything faster: work
harder, work smarter, or get help. In computer-speak, this roughly
translates to: increase processor speed, improve algorithms, or
exploit parallelism.
With processor speeds no longer doubling every eighteen months and
little or no room left for improvements in serial algorithms,
exploiting parallelism is the one frontier with the potential for
delivering huge improvements in performance.
In the June 2010 BayPIGgies technical program, we presented our
take on the why, what, and how of exploiting parallelism.
Preamble
--------
Bracketed by the Groucho Marx quote, "Before I speak, I have
something important to say", we started off by making a rather
astonishing observation about this approximately 60-year-old
field of parallel software engineering: the lack of consensus
on the answers to the most fundamental questions in the field,
including ones implied by the title of the talk:
* What do we mean by exploiting parallelism?
* How does "exploiting parallelism" differ from
"parallel programming"?
"Exploiting parallelism": Why?
------------------------------
When it comes to leveraging modern computer systems, the challenge
of our time is not making existing serial engineering talent
proficient in parallel programming. Rather, the challenge of our
time is to exploit parallelism in a way that leverages the existing
predominantly serial talent and serial development processes.
Stated another way ... while the very nature of hardware systems
has changed from mainly serial to mainly parallel, our supposition
is that serial software engineers will continue to play a dominant
role going forward.
Hence, we emphasize leveraging modern hardware systems in a way
that allows serial engineers to focus on the serial component of
a parallel application. That is the why/what/how of exploiting
parallelism.
In a way, exploiting parallelism is analogous to being able to
drive a car without getting bogged down in the details of how the
engine (the parallel component of a parallel application) works. The
driver only needs to be able to relate to the API of a car, which
is close to intent, easy to relate to, and hides the details of
how the engine works.
"Exploiting parallelism": What?
-------------------------------
Exploiting parallelism involves nothing more than the management of
a collection of serial tasks.
This simple, fourteen-word definition can be applied and used to
analyze any parallel solution.
"Management" includes policies by which tasks are scheduled,
premature terminations are handled, preemptive support is provided,
communication primitives are enabled, and cores are obtained and
released.
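
As a rough illustration of what "management" can mean in practice,
here is a minimal sketch using Python's standard concurrent.futures
module. It is not taken from the talk: the task function
simulate_task and its failure rule are hypothetical. The executor
schedules the serial tasks, the "with" block obtains and releases
the workers, and the loop handles a task that terminates
prematurely.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def simulate_task(task_id):
    """Hypothetical serial task; task 3 fails to mimic a premature termination."""
    if task_id == 3:
        raise RuntimeError("task %d terminated prematurely" % task_id)
    return task_id * task_id


def manage(task_ids, max_workers=4):
    """Schedule serial tasks, collect results, and record failures."""
    results, failures = {}, {}
    # Entering the block obtains the workers; leaving it releases them.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Scheduling: hand every serial task to the pool.
        futures = {pool.submit(simulate_task, t): t for t in task_ids}
        for future in as_completed(futures):
            task_id = futures[future]
            try:
                results[task_id] = future.result()
            except RuntimeError as exc:
                # Premature termination: record it and keep going.
                failures[task_id] = str(exc)
    return results, failures
```

Calling manage(range(5)) returns the squares of the tasks that
finished plus a record of the one that failed. Threads are used
here only to keep the sketch portable; ProcessPoolExecutor offers
the same submit interface.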
Serial tasks come in two flavors:
* Coarse-grained tasks ... which may not communicate prior to
conclusion
* Fine-grained tasks ... which may communicate prior to conclusion
Consider an example where a group of painters are assigned sections
of a wall to paint blue. They don't need to communicate with the
project manager before they finish ... an example of coarse-grained
parallelism.
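
The blue-wall example might be sketched as follows. This is an
illustration under assumptions, not code from the talk: the wall
is modeled as a plain list, and the names paint_section and
paint_wall are hypothetical. Each painter's task runs to
conclusion without talking to anyone.

```python
from concurrent.futures import ThreadPoolExecutor


def paint_section(section):
    """One painter's serial task: paint every cell of a section blue."""
    return ["blue" for _ in section]


def paint_wall(wall, painters=3):
    """Split the wall into sections and paint them independently."""
    # Ceiling division, with a floor of 1 so an empty wall is handled.
    size = max(1, -(-len(wall) // painters))
    sections = [wall[i:i + size] for i in range(0, len(wall), size)]
    with ThreadPoolExecutor(max_workers=painters) as pool:
        # map() returns the painted sections in their original order.
        painted = list(pool.map(paint_section, sections))
    return [cell for section in painted for cell in section]
```

Because the tasks never communicate, the only decisions left are
how to split the wall and how many painters to use.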
In contrast, when the painters are assigned sections of a complex
image to paint as a mural, they must communicate before they finish
to ensure that the sections align properly ... an example of
fine-grained parallelism.
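
The mural case might be sketched like this, again as an
illustration rather than anything shown in the talk. Assumptions:
the mural is a list of non-empty sections of numeric shades, and
"aligning" means blending each cell with its two neighbors, so a
painter needs the edge shades of the adjacent sections. The names
painter and blend_mural are hypothetical.

```python
import queue
import threading


def painter(index, section, inboxes, results):
    """Blend one section, exchanging edge shades with neighbors first."""
    last = len(inboxes) - 1
    # Communicate before concluding: send edge shades to each neighbor.
    if index > 0:
        inboxes[index - 1].put(("from_right", section[0]))
    if index < last:
        inboxes[index + 1].put(("from_left", section[-1]))
    # Outer edges of the mural reuse the painter's own edge shade.
    edges = {"from_left": section[0], "from_right": section[-1]}
    for _ in range((index > 0) + (index < last)):
        side, shade = inboxes[index].get()
        edges[side] = shade
    padded = [edges["from_left"]] + section + [edges["from_right"]]
    results[index] = [
        (padded[i - 1] + padded[i] + padded[i + 1]) / 3
        for i in range(1, len(padded) - 1)
    ]


def blend_mural(sections):
    """Run one painter thread per section and return the blended sections."""
    inboxes = [queue.Queue() for _ in sections]
    results = [None] * len(sections)
    threads = [
        threading.Thread(target=painter, args=(i, s, inboxes, results))
        for i, s in enumerate(sections)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Every painter sends before it receives and the queues are
unbounded, so no painter can block a neighbor. Splitting the mural
differently should not change the blended shades.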
(The talk also discussed the trade-offs of authoring and
maintaining serial components of a parallel application on
distributed and shared memory hardware systems.)
"Exploiting parallelism": How?
------------------------------
Armed with our one definition, we discussed and reviewed several
parallel-enabling technologies, including OpenMPI, Python's
multiprocessing module, MapReduce, OpenMP, and the parallel
component of SPM.Python. Exchanges between the speaker and
audience touched on basic descriptions of the solutions and
their characteristics.
For example, the fact that MapReduce is easy to relate to can be
attributed to the fact that its API is close to the intent of
the serial software engineer. On the other hand, while OpenMPI
and OpenMP provide a rich set of communication primitives, they
lack any notion of management ... a gap that must be filled,
as per our definition, by the serial software engineers. But
filling that gap requires proficiency in parallel programming.
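
To make "close to the intent" concrete, here is a small sketch in
the map/reduce style. It is not the MapReduce framework itself and
was not shown in the talk: it runs in one process using Python's
standard library, and the names count_words, merge_counts, and
word_count are hypothetical. The engineer writes two serial
functions; the scheduling stays out of sight.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(line):
    """Serial 'map' step: count the words in one line."""
    return Counter(line.split())


def merge_counts(left, right):
    """Serial 'reduce' step: combine two partial counts."""
    return left + right


def word_count(lines):
    """Apply the map step to every line, then fold the partial counts."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(count_words, lines))
    return reduce(merge_counts, partials, Counter())
```

For example, word_count(["a b a", "b c"]) counts "a" and "b"
twice and "c" once.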
"Exploiting parallelism": Q&A
-----------------------------
The technical program closed with a spirited and engaging Q&A
session, touching on many examples, descriptions and analogies
covering the parallel solutions reviewed and their
characteristics, using the definition of exploiting parallelism
presented.
-------------------------------------------
The July 22 talk is titled "A Technical Anatomy of SPM.Python
(A Scalable, Parallel Version of the Python Language)", by
Minesh B. Amin.
SPM.Python, a commercial product, extends Python with a programming
paradigm for solving parallel problems and strives to do so in a
pythonic (natural) way by augmenting the serial Python language with
parallel concepts like parallel task managers and communication
primitives.