multi-core software

Thu Jun 4 12:46:44 EDT 2009

Of interest:

• Why Must Software Be Rewritten For Multi-Core Processors?
  http://xahlee.org/UnixResource_dir/writ/multi-core_software.html

plain text version follows.

--------------------------------------------------
Why Must Software Be Rewritten For Multi-Core Processors?

Xah Lee, 2009-06-04

I had a revelation today, namely, that it is necessary to rewrite
software to use multi-processor in order to benefit from it.

This may sound stupid, but is a revelation to me. For the past decade,
the question has been on my mind, about why should software needs to
be rewritten to take advantage of multi-processors. Because, in my
mind, i thought that software are at some fundamental level just
algorithms, and algorithms, have nothing to do with hardware
implementation aspects such as number of processors. I always felt,
that those talks about the need or difficulty of rewriting software
for multi-processor (or multi-core these days) must be the product of
idiocy of industrial imperative coding monkies. In particular, some
languages such as java, the way they deal with it, seems to me
extremely stupid. e.g. the concept of threads. In my mind, there
should be a layer between the software and the hardware, such as the
operating system, or the compiler, that should be able to
automatically handle or compile the software so that it FULLY use the
multi-processors when present. In short, i thought that a algorithm
really has nothing to do with hardware details.

I never really thought hard about this issue, but today, since i got a
quad-core PC, so i looked into the issue, and thought about it, and i
realized the answer. The gist is that, algorithm, fundamentally means
manipulating some hardware, in fact, algorithm is a step by step
instruction about some form of hardware, even the hardware may be
abstract or virtual. For example, let's say the simplest case of 1+1.
It is a algorithm, but where is the hardware? You may say it's
abstract concept, or it being a mathematical model. What you call 1+1
depends on the context, but in our context, those numbers are the
hardware. To see this, lets say more complex example of listing primes
by sieve. Again, it comes down to “what is a number”? Here, numbers
can be stones, or arrangement of beads on abacus, it's hardware! As
another example, say sorting. To begin with, you have to have some
something to sort, that's hardware.

Another way to see this is that, for a given computing problem, there
are infinite number of algorithms to achieve the same thing. Some,
will be better ones, requiring less steps, or requiring less storage.
All these are concrete manipulation issues, and the thing being
manipulated, ultimately we have to call it hardware. So, when hardware
changes, say from one cpu to multi-cpu, there's no way for the
algorithm to magically change and adopt the changed hardware. If you
need a algorithm that is catered to the new hardware, you need a new
algorithm.

One might think that there might be algorithm Omega, such that it
takes input of old hardware O, new hardware N, and a algorithm A, and
output a algorithm B, such that B has the same behavior as A, but N+B
performs better than O+A. This is like asking for Strong AI.

One thing we often forgot is that algorithms ultimately translates to
manipulating hardware. In a modern digital computer, that means
software algorithms ultimately becomes machine instructions in CPU,
which manipulate the 1s and 0s in register, or electricity voltage in
transisters.

In a more mundane point of view, a automatic system for software to
work on multi-processors is a problem of breaking a problem into
discrete units (so that they can be computed in parallel). The problem
of finding a algorithm is entirely different from the problem of
breaking a problem into distinct units. The problem of breaking a
problem into distinct units is a entire new branch of mathematics. For
example, let's say factoring. Factoring is a well defined mathematical
problem. There are millions algorithms to do it, each class has
different properties such as number of steps or storage units.
However, if we look at these algorithms from the point of view of
distinct units, it's a new perspective on classification of
algorithms. Software are in general too ill-defined and fuzzy and
complex, the software we use on personal computers such as email,
browsers, games, don't even have mathematical models. They don't even
have mathematical models of their inputs and outputs. To talk about
automatic system of unitizing software, would be more like a AI
fantasy. Roughly, a term that describes this aspect of research is
Parallel computing.

In the Wikipedia article, it talks about types of parallelism: Bit-
level parallelism, Instruction-level parallelism, Data parallelism,
Task parallelism. Then it also discusses hardware aspects classes:
multicore, symmetric multiprocessing, distributed computing, cluster,
grid. The subjects mentioned there more close to this essay are “data
parallelism” and “task parallelism”. However, neither are high level
enough as i discussed here. The issue discussed here would be like
“algorithmic parallelism”. Indeed, Wikipedia mentioned “Automatic
parallelization”, which is exactly what i'm talking about here. Quote:

    Automatic parallelization of a sequential program by a compiler is
the holy grail of parallel computing. Despite decades of work by
compiler researchers, automatic parallelization has had only limited
success.[40]

    Mainstream parallel programming languages remain either explicitly
parallel or (at best) partially implicit, in which a programmer gives
the compiler directives for parallelization. A few fully implicit
parallel programming languages exist — SISAL, Parallel Haskell, and
(for FPGAs) Mitrion-C.

It says “A few fully implicit parallel programming languages exist”.
If you are a comp lang nutcase, don't get carried away by what those
words might seem to mean.

(Note: Wikipedia has a dedicated article on the subject: Automatic
parallelization)

  Xah
∑ http://xahlee.org/

☄