New subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified thread usage, and a path to getting rid of the GIL

June 28, 2010

      I'm moving this thread to python-ideas, where it belongs.

I've looked at the implementation code (even stepped through it with
pdb!), read the sample/test code, and read the two papers on
animats.com fairly closely (they have a lot of overlap, and the memory
model described below seems copied verbatim from
http://www.animats.com/papers/languages/pythonconcurrency.html version
0.8).

Some reactions (trying to hide my responses to the details of the code):

- First of all, I'm very happy to see radical ideas proposed, even if
they are at present unrealistic. We need a big brainstorm to come up
with ideas from which an eventual solution to the multicore problem
might be chosen. (Jesse Noller's multiprocessing is another; Adam
Olsen's work yet another, at a different end of the spectrum.)

- The proposed new semantics (frozen objects, memory model,
auto-freezing of globals, enforcement of naming conventions) are
radically different from Python's current semantics. They will break
every 3rd party library in many more ways than Python 3. This is not
surprising given the goals of the proposal (and its roots in Adam
Olsen's work) but places a huge roadblock for acceptance. I see no
choice but to keep trying to come up with a compromise that is more
palatable and compatible without throwing away all the advantages. As
it now stands, the proposal might as well be a new and different
language.

- SynchronizedObject looks like a mixture of a Java synchronized class
(a non-standard concept in Java but easily understood as a class all
whose public methods are synchronized) and a condition variable (which
has the same semantics of releasing the lock while waiting but without
crawling the stack for other locks to release). It looks like the
examples showing off SynchronizedObject could be implemented just as
elegantly using a condition variable (and voluntary abstention from
using shared mutable objects).

- If the goal is to experiment with new control structures, I
recommend decoupling them from the memory model and frozen objects,
instead relying (as is traditional in Python) on programmer caution to
avoid races. This would make it much easier to see how programmers
respond to the new control structures.

- You could add the freeze() function for voluntary use, and you could
even add automatic wrapping of arguments and return values for certain
classes using a class decorator or a metaclass, but the performance
overhead makes this unlikely to win over many converts. I don't see
much use for the "whole program freezing" done by the current
prototype -- there are way too many backdoors in Python for the
prototype approach to be anywhere near foolproof, and if we want a
non-foolproof approach, voluntary constraint (and, in some cases,
voluntary, i.e. explicit, wrapping of modules or classes) would work
just as well.

- For a larger-scale experiment with the new memory model and semantic
restrictions (or would it be better to call them syntactic
restrictions? -- after all they are about statically detectable
properties like naming conventions) I recommend looking at PyPy, which
has as one of its explicitly stated project goals easy experimentation
with different object models.

- I'm sure I've forgotten something, but I wanted to keep my impressions fresh.

- Again, John, thanks for taking the time to come up with an
implementation of your idea!

--Guido

On Sat, Jun 26, 2010 at 9:39 AM, John Nagle <nagle@animats.com> wrote:
...
On 6/26/2010 7:44 AM, Jesse Noller wrote:
...
On Sat, Jun 26, 2010 at 9:29 AM, Michael Foord
<fuzzyman@voidspace.org.uk>  wrote:
...
On 26/06/2010 07:11, John Nagle wrote:
...
We have just released a proof-of-concept implementation of a new
approach to thread management - "newthreading".
....
...
...
The import * form is considered bad practise in *general* and
should not be recommended unless there is a good reason.
  I agree.  I just did that to make the examples cleaner.
...
...
however the introduction of free-threading in Python has not been
hampered by lack of synchronization primitives but by the
difficulty of changing the interpreter without unduly impacting
single threaded code.
   That's what I'm trying to address here.
...
...
Providing an alternative garbage collection mechanism other than
reference counting would be a more interesting first-step as far as
I can see, as that removes the locking required around every access
to an object (which currently touches the reference count).
Introducing free-threading by *changing* the threading semantics
(so you can't share non-frozen objects between threads) would not
be acceptable. That comment is likely to be based on a
misunderstanding of your future intentions though. :-)
   This work comes out of a discussion a few of us had at a restaurant
in Palo Alto after a Stanford talk by the group at Facebook which
is building a JIT compiler for PHP.  We were discussing how to
make threading both safe for the average programmer and efficient.
Javascript and PHP don't have threads at all; Python has safe
threading, but it's slow.  C/C++/Java all have race condition
problems, of course.  The Facebook guy pointed out that you
can't redefine a function dynamically in PHP, and they get
a performance win in their JIT by exploiting this.
   I haven't gone into the memory model in enough detail in the
technical paper.  The memory model I envision for this has three
memory zones:
   1.  Shared fully-immutable objects: primarily strings, numbers,
and tuples, all of whose elements are fully immutable.  These can
be shared without locking, and reclaimed by a concurrent garbage
collector like Boehm's.  They have no destructors, so finalization
is not an issue.
   2.  Local objects.  These are managed as at present, and
require no locking.  These can either be thread-local, or local
to a synchronized object.  There are no links between local
objects under different "ownership".  Whether each thread and
object has its own private heap, or whether there's a common heap with
locks at the allocator is an implementation decision.
   3.  Shared mutable objects: mostly synchronized objects, but
also immutable objects like tuples which contain references
to objects that aren't fully immutable.  These are the high-overhead
objects, and require locking during reference count updates, or
atomic reference count operations if supported by the hardware.
The general idea is to minimize the number of objects in this
zone.
   The zone of an object is determined when the object is created,
and never changes.   This is relatively simple to implement.
Tuples (and frozensets, frozendicts, etc.) are normally zone 2
objects.  Only "freeze" creates collections in zones 1 and 3.
Synchronized objects are always created in zone 3.
There are no difficult handoffs, where an object that was previously
thread-local now has to be shared and has to acquire locks during
the transition.
   Existing interlinked data structures, like parse trees and GUIs,
are by default zone 2 objects, with the same semantics as at
present.  They can be placed inside a SynchronizedObject if
desired, which makes them usable from multiple threads.
That's optional; they're thread-local otherwise.
   The rationale behind "freezing" some of the language semantics
when the program goes multi-thread comes from two sources -
Adam Olsen's Safethread work, and the acceptance of the
multiprocessing module.  Olsen tried to retain all the dynamism of
the language in a multithreaded environment, but locking all the
underlying dictionaries was a boat-anchor on the whole system,
and slowed things down so much that he abandoned the project.
The Unladen Swallow documentation indicates that early thinking
on the project was that Olsen's approach would allow getting
rid of the GIL, but later notes indicate that no path to a
GIL-free JIT system is currently in development.
   The multiprocessing module provides semantics similar to
threading with "freezing".  Data passed between processes is "frozen"
by pickling.  Processes can't modify each other's code.  Restrictive
though the multiprocessing module is, it appears to be useful.
It is sometimes recommended as the Pythonic approach to multi-core CPUs.
This is an indication that "freezing" is not unacceptable to the
user community.
   Most of the real-world use cases for extreme dynamism
involve events that happen during startup.  Configuration files are
read, modules are selectively included, functions are overridden, tables
of references to functions are set up, regular expressions are compiled,
and the code is brought into the appropriately configured state.  Then
the worker threads are started and the real work starts. The
"newthreading" approach allows all that.
   After two decades of failed attempts remove the Global
Interpreter Lock without making performance worse, it is perhaps
time to take a harder look at scaleable threading semantics.
                                       John Nagle
                                       Animats
-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-ideas] [Python-Dev] [ANN]: "newthreading" - an approach to simplified thread usage, and a path to getting rid of the GIL

Guido van Rossum

Antoine Pitrou

Michael Foord

Nick Coghlan

Michael Foord

Adam Olsen

tags

participants (5)