The future of "frozen" types as the number of CPU cores increases

John Nagle nagle at animats.com
Tue Feb 16 15:15:05 EST 2010


    In the beginning, Python had some types which were "frozen" (immutable),
and some which weren't.  Now, there's a trend towards having both "frozen" and
"unfrozen" versions of built-in types.  We now have "frozen" and "unfrozen"
sets (frozenset and set) and byte sequences (bytes and bytearray), and tuples
already serve as frozen lists.  It's becoming clear that the concept of
"frozen" is separate from the type involved.

    Maybe "frozen" should be generalized, so that all "unfrozen" objects also
have "frozen" versions.

    I'd suggest
	
	p = freeze(q)

which would generate a "frozen" copy of q.  (For lists, this would
return a tuple.  For unfrozen sets, a frozen set.  Etc.)
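
A minimal sketch of how a shallow freeze() might behave, written in
present-day Python.  The function is hypothetical and not part of any Python
release.  The list-to-tuple and set-to-frozenset mappings come from the
proposal above; the handling of bytearray and dict (a read-only
types.MappingProxyType view over a private copy) is an assumption of this
sketch, as is returning any other object unchanged.

```python
from types import MappingProxyType


def freeze(q):
    """Return a shallow "frozen" copy of q (hypothetical helper)."""
    if isinstance(q, list):
        return tuple(q)
    if isinstance(q, set):
        return frozenset(q)
    if isinstance(q, bytearray):
        return bytes(q)
    if isinstance(q, dict):
        # Assumption: no built-in frozen dict exists, so expose a
        # read-only view of a private copy instead.
        return MappingProxyType(dict(q))
    # Assumption: anything else is returned as-is.
    return q
```

Only the outer container is frozen here; a list of lists would come back as
a tuple of (still mutable) lists.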

	p = deepfreeze(q)

would generate a "frozen" deep copy of q, in which each object it references
is also a "frozen" copy.
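
The deep variant could be sketched as a recursive walk.  As before,
deepfreeze() is hypothetical, and the treatment of dicts (a read-only
types.MappingProxyType view) and of unknown objects (returned unchanged) are
assumptions of this sketch.  It does not attempt to handle cyclic references.

```python
from types import MappingProxyType


def deepfreeze(q):
    """Return a recursively "frozen" copy of q (hypothetical helper)."""
    if isinstance(q, (list, tuple)):
        return tuple(deepfreeze(item) for item in q)
    if isinstance(q, (set, frozenset)):
        return frozenset(deepfreeze(item) for item in q)
    if isinstance(q, bytearray):
        return bytes(q)
    if isinstance(q, dict):
        # Keys are already hashable; only the values need freezing.
        return MappingProxyType({k: deepfreeze(v) for k, v in q.items()})
    # Assumption: other objects are treated as already frozen.
    return q
```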

For objects,

	class c(frozenobject) :
		def __init__(self, someparam) :
			self.p = someparam
		...

would result in a class which returns frozen objects from the constructor.
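
One way to approximate this today, as a rough sketch only: a metaclass lets
__init__ run normally and then flips a flag that makes further attribute
assignment fail.  The names _FreezeAfterInit and _frozen are invented for
this illustration; frozenobject itself is the hypothetical base class from
the proposal, not an existing Python type.

```python
class _FreezeAfterInit(type):
    """Metaclass (illustrative) that freezes an instance once __init__ returns."""

    def __call__(cls, *args, **kwargs):
        obj = super().__call__(*args, **kwargs)  # runs __new__ and __init__
        object.__setattr__(obj, "_frozen", True)
        return obj


class frozenobject(metaclass=_FreezeAfterInit):
    _frozen = False  # class-level default, in effect while __init__ runs

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError("cannot modify frozen object")
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._frozen:
            raise AttributeError("cannot modify frozen object")
        object.__delattr__(self, name)


class c(frozenobject):
    def __init__(self, someparam):
        self.p = someparam  # allowed: the object is not frozen yet
```

This only stops rebinding of attributes.  If someparam is itself mutable,
the object it refers to can still change, which is where something like
deepfreeze would come in.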

    In concurrent programs, frozen objects can be shared without locking.
This simplifies locking considerably.  This may provide a way to get out
from under the Global Interpreter Lock (GIL).  One possible implementation would be
to have unfrozen objects managed by reference counting and locking as
in CPython.  Frozen objects would live in a different memory space and be
garbage collected by a concurrent garbage collector.

    If we add "synchronized" objects with built-in locking, like Java,
threading becomes much cleaner.

	class d(synchronized) :
		...

A typical "synchronized" object would be a queue.  Anything with
state shared across threads has to be synchronized.  Only one
thread at a time can be active within a synchronized object,
due to a built-in implicit lock.
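
A rough approximation of the implicit lock in present-day Python: a base
class that gives each instance a reentrant lock and wraps every ordinary
method defined in a subclass so that it runs with that lock held.  The
helper _locked and the attribute _lock are names made up for this sketch,
and the increment method on d is only an example.

```python
import functools
import threading
import types


def _locked(method):
    """Wrap a method so it runs while holding the instance's lock."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            return method(self, *args, **kwargs)
    return wrapper


class synchronized:
    """Hypothetical base class: one implicit lock per instance."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap plain functions defined directly on the subclass,
        # skipping special methods such as __init__.
        for name, attr in list(vars(cls).items()):
            if isinstance(attr, types.FunctionType) and not name.startswith("__"):
                setattr(cls, name, _locked(attr))

    def __new__(cls, *args, **kwargs):
        obj = super().__new__(cls)
        obj._lock = threading.RLock()  # reentrant, so methods may call each other
        return obj


class d(synchronized):
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
```

Direct attribute access from outside (reading d().count, say) is not
protected in this sketch; only method calls take the lock.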

Semantics of synchronized objects:

	"Synchronized" objects cannot be frozen.  Trying to freeze
	them just returns the object.

	If a thread blocks on an explicit "wait", the synchronized
	object is temporarily unlocked during the "wait".  This
	allows threads to wait for, say, a queue to get another
	entry from another thread.
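
The "unlock during wait" rule matches how threading.Condition already
behaves in Python: wait() releases the underlying lock while the thread is
blocked and reacquires it before returning.  A minimal sketch of a
synchronized queue built that way (the class name SynchronizedQueue is
illustrative, not an existing library class):

```python
import threading
from collections import deque


class SynchronizedQueue:
    def __init__(self):
        self._cond = threading.Condition()  # lock plus wait/notify
        self._items = deque()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()  # wake one waiting consumer, if any

    def get(self):
        with self._cond:
            while not self._items:
                # The lock is released while blocked here, so a
                # producer can enter put() in the meantime.
                self._cond.wait()
            return self._items.popleft()
```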

If only synchronized objects and frozen objects can be shared across
threads, the GIL becomes unnecessary.  Big performance win on multicore
CPUs.

    "Synchronized" was a pain in Java because Java doesn't have "frozen"
objects.  Too much effort went into synchronizing little stuff.  But
in a language with frozen objects, the little stuff would be frozen,
and wouldn't need its own locks.

    Threaded programs that already use queue objects to communicate with
each other are almost ready for this kind of architecture now.  If
the queue class made a frozen copy of any unfrozen, unsynchronized
object placed on a queue, conversion to this approach could be almost
automatic.
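
As an illustration of that idea, here is a small sketch that subclasses the
standard queue.Queue and freezes items on the way in.  FreezingQueue and the
cut-down freeze helper are hypothetical; the helper only covers lists, sets,
and bytearrays and passes everything else through unchanged.

```python
import queue


def freeze(q):
    """Cut-down shallow freeze (hypothetical helper for this sketch)."""
    if isinstance(q, list):
        return tuple(q)
    if isinstance(q, set):
        return frozenset(q)
    if isinstance(q, bytearray):
        return bytes(q)
    return q  # assumption: treated as already frozen


class FreezingQueue(queue.Queue):
    def put(self, item, block=True, timeout=None):
        # The consumer receives a frozen copy, so later changes made by
        # the producer to its own object are not visible to it.
        super().put(freeze(item), block, timeout)
```

For example, a producer that puts a list and then keeps appending to it no
longer affects what the consumer sees:

```python
fq = FreezingQueue()
data = [1, 2]
fq.put(data)
data.append(3)
item = fq.get()   # the tuple captured at put() time
```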

    What would probably happen in practice is that potential race conditions in
existing programs would raise "unshareable object" exceptions,
indicating that something needed to be synchronized.  That's a good thing.
This approach makes threading much easier for the typical programmer.
Instead of race conditions and random errors, you get error messages.
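
What such a check might look like, very roughly.  Everything here is
hypothetical: the exception name UnshareableObjectError, the function
check_shareable, and the _is_synchronized marker attribute are invented for
this sketch, and the list of "frozen" types is just the obvious immutable
built-ins.

```python
class UnshareableObjectError(TypeError):
    """Raised when an object is neither frozen nor synchronized."""


_FROZEN_SCALARS = (int, float, complex, str, bytes, type(None))


def check_shareable(obj):
    """Return obj if it could be handed to another thread, else raise."""
    if isinstance(obj, _FROZEN_SCALARS):
        return obj
    if isinstance(obj, (tuple, frozenset)):
        for item in obj:
            check_shareable(item)  # frozen containers must hold frozen items
        return obj
    if getattr(obj, "_is_synchronized", False):
        return obj  # assumption: synchronized objects carry this marker
    raise UnshareableObjectError(
        "unshareable object of type %s" % type(obj).__name__
    )
```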

    And we get rid of the GIL.

    I look at this as Python's answer to multicore CPUs and "Go".

					John Nagle


