Help me use my Dual Core CPU!

Sun Oct 1 08:24:37 EDT 2006

Paul Rubin wrote:

> Michael <ms at cerenity.org> writes:
>> > But ordinary programmers write real-world applications with shared data
>> > all the time, namely database apps.
>> 
>> I don't call that shared data because access to the shared data is
>> arbitrated by a third party - namely the database. I mean where 2 or
>> more people[*] hold a lock on an object and share it - specifically
>> the kind of thing you reference above as turning into a mess.
> 
> Ehhh, I don't see a big difference between having the shared data
> arbitrated by an external process with cumbersome message passing,
> or having it arbitrated by an in-process subroutine or even by support
> built into the language.  If you can go for that, I think we agree on
> most other points.

The difference from my perspective is that there are two (not mutually
exclusive) options:
   A Have something arbitrate access and provide useful abstractions
     designed to simplify things for the user.
   B Use a lower level abstraction (eg built into the language, direct
     calling etc)

I don't see these as mutually exclusive, but the former is aimed at
helping the programmer, whereas the latter can be aimed at better
performance. In which case you're back to the same sort of argument as
whether assembler, compiled or dynamic languages are a good idea, and I'd
always respond with "depends on the problem in hand".

I can see why you don't see much difference, but I personally believe
that with A) you can share best practice [1], whereas B) means you need
to be able to implement best practice yourself.

   [1] Which is always an opinion :) (after all, once upon a time people
       thought goto was a good idea :)

>> > This is just silly, and wasteful of the
>> > efforts of the hardworking chip designers
> 
>> Aside from the fact it's enabled millions of programmers to deal with
>> shared data by communicating with a database?
> 
> Well, sure, but like spreadsheets, its usefulness is that it lets
> people get non-computationally-demanding tasks (of which there are a
> lot) done with relatively little effort.  More demanding tasks aren't
> so well served by spreadsheets, and lots of them are using databases
> running on massively powerful and expensive computers when they could
> get by with lighter weight communications mechanisms and thereby get
> the needed performance from much cheaper hardware.  That in turn would
> let normal folks run applications that are right now only feasible for
> relatively complex businesses.  If you want, I can go into why this is
> important far beyond the nerdy realm of software geekery.

I'd personally be interested to hear why you think that. I can think of
reasons myself, but would be curious to hear yours.

>> For generator based components we collapse inboxes into outboxes
>> which means all that's happening when someone puts a piece of data
>> into an outbox, they're simply saying "I'm no longer going to use
>> this", and the recipient can use it straight away.
> 
> But either you're copying stuff between processes, or you're running
> in-process without multiprocessor concurrency, right?

For generator components, that's in-process and not multiprocessor
concurrency, yes.
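
To make the "collapsed" inbox/outbox idea concrete, here's a rough sketch
in plain Python. It is not Kamaelia's actual API: the names producer,
consumer and link are made up for illustration, and a deque stands in for
the shared box. The point is only that one object serves as both the
sender's outbox and the receiver's inbox, so nothing is copied and
everything runs in a single thread.

```python
from collections import deque

def producer(outbox, items):
    # Hand over one item per step, then yield control to the scheduler.
    for item in items:
        outbox.append(item)
        yield

def consumer(inbox, seen):
    # Drain whatever has arrived, then yield control again.
    while True:
        while inbox:
            seen.append(inbox.popleft())
        yield

# Hypothetical wiring: the same deque is the producer's outbox
# and the consumer's inbox.
link = deque()
data = [object(), object()]
seen = []

p = producer(link, data)
c = consumer(link, seen)

# A trivial round-robin "scheduler": one step of each generator per item.
for _ in data:
    next(p)
    next(c)

# The consumer received the very same objects, not copies.
handed_over_by_reference = all(a is b for a, b in zip(seen, data))
```

Once the producer has appended an item it simply doesn't touch it again,
which is all "I'm no longer going to use this" amounts to here.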

For threaded components we use Queue.Queues, which means essentially the
reference is copied for most real world data, not the data itself.
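
A small sketch of what "the reference is copied" means in practice. This
uses the Python 3 spelling (queue.Queue; the module was called Queue in
Python 2), and the payload and function names are invented for the
example rather than taken from Kamaelia.

```python
import queue
import threading

def consumer(inbox, results):
    # Block until an item arrives, then keep hold of it.
    item = inbox.get()
    results.append(item)

# Hypothetical "real world" data: a dict holding a 1 KiB buffer.
payload = {"frame": bytearray(1024)}
inbox = queue.Queue()
results = []

worker = threading.Thread(target=consumer, args=(inbox, results))
worker.start()
inbox.put(payload)   # only a reference to payload goes onto the queue
worker.join()

# The consumer thread ended up with the same object, not a copy.
same_object = results[0] is payload
```

The flip side is that both threads can now reach the same mutable object,
so the sender still has to honour the "I won't touch this again"
convention.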

One step at a time I suppose really :-) One option for interprocess sharing
we're considering (since POSH looks unsupported, alpha, and untested on
recent Pythons) is to use memory mapped files. Thing is, that means
serialising everything, which could be icky, so it'll have to be something
we come back to later.
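
For a feel of what the memory mapped route would involve, here's a minimal
sketch. Everything in it is an assumption for illustration: pickle as the
serialiser, a 4-byte length prefix as the framing, and an anonymous map
standing in for a real file shared between two processes. It only shows
the serialise/deserialise round trip, not any locking or signalling.

```python
import mmap
import pickle
import struct

# Assumed framing: a 4-byte little-endian length, then the pickled bytes.
HEADER = struct.Struct("<I")

def write_message(buf, obj):
    # Serialise the object and write it at the start of the mapping.
    data = pickle.dumps(obj)
    buf.seek(0)
    buf.write(HEADER.pack(len(data)))
    buf.write(data)

def read_message(buf):
    # Read the length prefix, then rebuild the object from the bytes.
    buf.seek(0)
    (length,) = HEADER.unpack(buf.read(HEADER.size))
    return pickle.loads(buf.read(length))

# Anonymous 4 KiB mapping; a real setup would map a file both sides open.
shared = mmap.mmap(-1, 4096)

message = {"id": 1, "body": "hello"}
write_message(shared, message)
received = read_message(shared)
shared.close()

# Unlike the in-process case, the receiver gets an equal copy,
# not the same object.
is_copy = received == message and received is not message
```

That copy on every hop is the "icky" bit: the data has to be turned into
bytes and back, which is exactly what passing references avoids.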

(Much of our day to day work on Kamaelia is focussed on solving specific
problems for work which rolls back into fleshing out the toolkit. It would
be extremely nice to spend time on solving a particular issue that would
benefit from optimising interprocess comms).

If we can make Kamaelia benefit from the work the hardware people have done
for shared memory, that's great. However it's interesting to see that things
like the CELL don't tend to use shared memory, and instead use this style of
communications approach. What approach will be most useful going forward?
Dunno :-) I'm only claiming we find it useful :)

>> This is traditional-lock free,
> 
>> > Lately I've been reading about "software transactional memory" (STM),
> 
>> I've been hearing about it as well, but not digged into it....
>> If you do dig out those STM references, I'd be interested :-)
> 
> They're in the post you responded to:

Sorry, brain fart on my part. Thanks :-)

Michael.