[Python-Dev] Slides from today's parallel/async Python talk
trent at snakebite.org
Thu Apr 4 22:04:41 CEST 2013
On Thu, Apr 04, 2013 at 01:18:58AM -0700, Charles-François Natali wrote:
> Just a quick implementation question (didn't have time to read through
> all your emails :-)
> async.submit_work(func, args, kwds, callback=None, errback=None)
> How do you implement arguments passing and return value?
> e.g. let's say I pass a list as argument: how do you iterate on the
> list from the worker thread without modifying the backing objects for
> refcounts (IIUC you use a per-thread heap and don't do any
Correct, nothing special is done for the arguments (apart from
incref'ing them in the main thread before kicking off the parallel
thread (then decref'ing them in the main thread once we're sure the
parallel thread has finished)).
> Same thing for return value, how do you pass it to the
For submit_work(), you can't :-) In fact, an exception is raised if
the func() or callback() or errback() attempts to return a non-None
value.
It's worth noting that I eventually plan to have the map/reduce-type
functionality (similar to what multiprocessing offers) available via
a separate 'parallel' façade. This will be geared towards programs
that are predominantly single-threaded, but have lots of data that
can be processed in parallel at various points.
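
For readers unfamiliar with the map/reduce style mentioned above, here is a minimal sketch of the pattern using only the standard library. It is not the planned 'parallel' façade, whose API is not described in this message. The names word_count and total_words are made up for illustration, and multiprocessing.pool.ThreadPool is used as a stand-in because it exposes the same map() interface as multiprocessing's process pool.

```python
# Illustrative only: standard library, not the planned 'parallel' facade.
from functools import reduce
from multiprocessing.pool import ThreadPool


def word_count(line):
    """Map step (hypothetical example): count the words in one line."""
    return len(line.split())


def total_words(lines, workers=4):
    """Fan the map step out to a pool, then reduce the partial counts."""
    with ThreadPool(workers) as pool:
        counts = pool.map(word_count, lines)
    # Reduce step: combine the per-line counts into a single total.
    return reduce(lambda a, b: a + b, counts, 0)


if __name__ == "__main__":
    print(total_words(["a b c", "d e", "f"]))
```

The program stays single-threaded apart from the one call that processes the data in parallel, which is the usage profile the paragraph above describes.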
Now, with that being said, there are a few options available at the
moment if you want to communicate stuff from parallel threads back
to the main thread. Originally, you could do something like this:
d = async.dict()
d['foo'] = async.rdtsc()
d['bar'] = async.rdtsc()
But I recently identified a few memory-management flaws with that
approach (I'm still on the fence with this issue... initially I was
going to drop all support, but I've since had ideas to address the
memory issues, so, we'll see).
There's also this option:
d = dict()
def store(k, v):
    d[str(k)] = str(v)
(Not a particularly performant option though; the main-thread
instantly becomes the bottleneck.)
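
To illustrate why funnelling every store through the main thread becomes a bottleneck, here is a rough standard-library analogue. It assumes ordinary threading and queue.Queue rather than the async module, and the worker function and its payload are hypothetical.

```python
import queue
import threading

results = {}             # only ever touched by the main thread
pending = queue.Queue()  # thread-safe hand-off channel


def worker(key):
    # Hypothetical work item: compute a value, then hand it to the main thread.
    pending.put((key, key.upper()))


threads = [threading.Thread(target=worker, args=(k,)) for k in ("foo", "bar")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every store happens here, serially, in the main thread. With many
# workers producing results, this loop is where everything queues up.
while not pending.empty():
    k, v = pending.get()
    results[str(k)] = str(v)
```

The workers run concurrently, but the dictionary is populated one item at a time by a single thread, so throughput is capped by how fast the main thread can drain the queue.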
Post-PyCon, I've been working on providing new interlocked data
types that are specifically designed to bridge the parallel/main-thread
divide:
xl = async.xlist()
x = xl.pop()
if not x:
What's interesting about xlist() is that it takes ownership of the
parallel objects being pushed onto it. That is, it basically clones
them, using memory allocated from its own internal heap (allowing
the parallel-thread's context heap to be freed, which is desirable).
The push/pop operations are interlocked at the C level, which
obviates the need for any explicit locking.
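
The ownership idea can be approximated in pure Python. The sketch below is a hypothetical stand-in, not the real xlist(): it clones with copy.deepcopy instead of allocating from an internal heap, it uses a threading.Lock where the real type relies on interlocked C-level operations, and its FIFO ordering and None-on-empty behaviour are assumptions.

```python
import copy
import threading
from collections import deque


class XListSketch:
    """Hypothetical stand-in for async.xlist(); not the real implementation."""

    def __init__(self):
        self._items = deque()
        # The real type is interlocked at the C level; a lock is used here
        # only because pure Python has no equivalent primitive.
        self._lock = threading.Lock()

    def push(self, obj):
        # "Take ownership" by storing a private clone, so the producer's
        # original object (and its memory) can be released afterwards.
        clone = copy.deepcopy(obj)
        with self._lock:
            self._items.append(clone)

    def pop(self):
        # Assumption: return None when empty so callers can test `if not x`.
        with self._lock:
            return self._items.popleft() if self._items else None


xl = XListSketch()
data = {"n": [1, 2]}
xl.push(data)
data["n"].append(3)  # later changes by the producer do not reach the clone
```

Callers of push() and pop() never take a lock themselves, which mirrors the point made above about not needing explicit locking.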
I've put that work on hold for now though; I want to finish the
async client/server stuff (it's about 60-70% done) first. Once
that's done, I'll tackle the parallel.*-type façade.