Greetings,

After doing a whole heck of a lot of Java and Jython programming over the last year, I decided to work an idea of mine into a PEP after being impressed with Java thread synchronization and frustrated with Python (it's almost always the other way around...):

http://www.python.org/peps/pep-0319.html

Comments, please send to me. I think python-dev is the right forum for discussion; otherwise someone will surely let me know and I'll go to python-list.

Thanks!

-Michel
Oops, I found an error. In the 'asynchronize' keyword section, the last block of code should be:

    synchronize:
        while in_loop():
            change_shared_data()
            asynchronize:
                do_blocking_io()
            change_shared_data2()

not:

    while in_loop():
        synchronize:
            change_shared_data()
            asynchronize:
                do_blocking_io()
            change_shared_data2()

I've sent in a new revision.

-Michel
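For readers following along, here is roughly what the corrected block might expand to with today's explicit locks. This is only a sketch: it assumes `asynchronize` simply releases the enclosing lock and re-acquires it afterwards, the `change_shared_data*` and `do_blocking_io` calls are stand-ins, and the `while in_loop()` loop is collapsed to a single pass.

```python
import threading

shared = {"count": 0}
lock = threading.Lock()   # the lock the proposed keywords would manage implicitly

def worker():
    # "synchronize:" -- acquire the lock around the whole block
    lock.acquire()
    try:
        shared["count"] += 1          # change_shared_data()
        # "asynchronize:" -- release, do the blocking work, re-acquire
        lock.release()
        try:
            pass                      # do_blocking_io() would go here
        finally:
            lock.acquire()
        shared["count"] += 1          # change_shared_data2()
    finally:
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared["count"])  # 8: two locked increments per thread
```

Note the inner try/finally: without it, an exception in the blocking section would leave the outer finally trying to release an unheld lock, the very bug pointed out later in this thread.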
michel@dialnetwork.com writes:
Comments, please send to me. I think python-dev is the right forum for discussion, otherwise someone will surely let me know and I'll go to python-list.
I find this underspecified. The section that says "Implementation" really tries to explain what the *semantics* of the proposed keywords is, yet it fails to spell out many interesting details. For example, the PEP nowhere says what the semantics of the "synchronize" keyword is. Apparently, execution may stop when entering the synchronize block under certain circumstances. But under what circumstances?

One may interpret the example without synchronize as meant to do the same thing as the code with synchronize, but this does not help much, since I don't know what the "acquire_lock" and "release_lock" global functions are.

Also, when talking about targets, I notice that these are expressions. I assume that synchronization only happens when the "same" object is used twice. What kind of "sameness" does that assume? Are there really no restrictions? E.g. would

    def foo():
        synchronize "a"+"b":
            print 1

    def bar():
        synchronize "a"+"b":
            print 2

be valid? What would be the meaning of this code?

Regards,
Martin
there is also discussion at the moment of thread-synchronisation on the stackless python list. people were considering ideas related to futures and csp (influenced largely by the oz language, i think).

maybe stackless, with its stronger emphasis on threads, is the place to iron out a really good solution to multi-threading before making changes to standard python?

personal opinion: while java may be better than python in this respect i think there are much better solutions out there. i'm a java programmer and in my last project, which was multi-threaded, most bugs came from threading issues. there seems to be a lot of research and new ideas in this area and it would be a pity if python only matched java, when there may be the possibility to surpass it...

andrew

michel@dialnetwork.com writes:
Greetings,
After doing a whole heck of a lot of Java and Jython programming over the last year I decided to work an idea of mine into a PEP after being impressed with Java thread syncronization and frustrated with Python (it's almost always the other way around...)
http://www.python.org/peps/pep-0319.html
Comments, please send to me. I think python-dev is the right forum for discussion, otherwise someone will surely let me know and I'll go to python-list.
Thanks!
-Michel
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev
there is also discussion at the moment of thread-synchronisation on the stackless python list. people were considering ideas related to futures and csp (influenced largely by the oz language, i think).
If someone could summarize those ideas here, that would be great (I have no time to read the oz reference manual, alas).
maybe stackless, with its stronger emphasis on threads, is the place to iron out a really good solution to multi-threading before making changes to standard python?
Um, Stackless has a very different notion of threads than core Python. Stackless threads are non-pre-emptive and cannot be used for overlapping I/O, I believe (at least not easily).
personal opinion: while java may be better than python in this respect i think there are much better solutions out there. i'm a java programmer and in my last project, which was multi-threaded, most bugs came from threading issues.
In any project that is multi-threaded, most bugs will come from threading issues. This is regardless of programming language; it's a deep, as yet not understood property of threads.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum
In any project that is multi-threaded, most bugs will come from threading issues. This is regardless of programming language; it's a deep, as yet not understood property of threads.
I'll second that. They require a different and paranoid viewpoint :-\ They're hard to debug, and it's difficult to assure good coverage when testing. I've also seen new problems arise when switching to a faster processor.

-- KBK
there is also discussion at the moment of thread-synchronisation on the stackless python list. people were considering ideas related to futures and csp (influenced largely by the oz language, i think).
Thanks for the tip, I'll check those out.
personal opinion: while java may be better than python in this respect i think there are much better solutions out there.
A better solution to synchronize threads, or a better solution to concurrency in general? Would this oz language be something like that? -Michel
On Sun, Jun 15, 2003, michel@dialnetwork.com wrote:
After doing a whole heck of a lot of Java and Jython programming over the last year I decided to work an idea of mine into a PEP after being impressed with Java thread syncronization and frustrated with Python (it's almost always the other way around...)
You need to be *much* clearer about the proposed interface between the ``synchronize`` keyword and Python objects. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra
On Sun, Jun 15, 2003, michel@dialnetwork.com wrote:
After doing a whole heck of a lot of Java and Jython programming over the last year I decided to work an idea of mine into a PEP after being impressed with Java thread syncronization and frustrated with Python (it's almost always the other way around...)
[Aahz]
You need to be *much* clearer about the proposed interface between the ``synchronize`` keyword and Python objects.
I agree with Aahz; in particular, the scope of the lock used by an anonymous synchronize block is ambiguous in the current PEP. In one example it appears that there is a lock associated with each unqualified use of the synchronize keyword; in another, it seems that unqualified uses in the same class share a lock.

Please try to explain the semantics of named and unnamed synchronize calls entirely in terms of code that would work in current Python, without using English (other than "this code is equivalent to that code").

I'd also like to see how 'asynchronize' works with condition variables, which seem to be the most common use for temporarily unlocking. (Your example of how code would do this without asynchronize has a bug, by the way; if the I/O operation raises an exception, the finally clause will attempt to release an already released lock.)

I think the PEP would be clearer if it were considerably shorter and to the point, with fewer examples and a more exact specification.

--Guido van Rossum (home page: http://www.python.org/~guido/)
On Sun, Jun 15, 2003, Guido van Rossum wrote:
[Aahz]
On Sun, Jun 15, 2003, michel@dialnetwork.com wrote:
After doing a whole heck of a lot of Java and Jython programming over the last year I decided to work an idea of mine into a PEP after being impressed with Java thread syncronization and frustrated with Python (it's almost always the other way around...)
You need to be *much* clearer about the proposed interface between the ``synchronize`` keyword and Python objects.
I agree with Aahz; especially the scope of the lock used by an anonymous synchronize block is ambiguous in the current PEP. In one example it appears that there is a lock associated with each unqualified use of the synchronize keyword; in another, it seems that unqualified uses in the same class share a lock.
Please try to explain the semantics of named and unnamed synchronize calls entirely in terms of code that would work in current Python, without using English (other than "this code is equivalent to that code").
It occurs to me that my comment was actually insufficiently clear: what I mean by "interface" is, "What methods get called on which Python objects?" In particular, read closely the documentation on such things as iterators (and the way they work with ``for`` loops) and the sequence/mapping protocol. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "If you don't know what your program is supposed to do, you'd better not start writing it." --Dijkstra
You need to be *much* clearer about the proposed interface between the ``synchronize`` keyword and Python objects.
I agree with Aahz; especially the scope of the lock used by an anonymous synchronize block is ambiguous in the current PEP. In one example it appears that there is a lock associated with each unqualified use of the synchronize keyword; in another, it seems that unqualified uses in the same class share a lock.
Please try to explain the semantics of named and unnamed synchronize calls entirely in terms of code that would work in current Python, without using English (other than "this code is equivalent to that code").
Here is a (hopefully) clearer snippet I am working on in the revision:

    class SynchronizedCounter:

        def __init__(self):
            self.counter = 0
            self.counter_lock = thread.allocate_lock()

        def increment(self):
            self.counter_lock.acquire()
            try:
                self.counter += 1
            finally:
                self.counter_lock.release()

In my mind I wanted to replace this with:

    class SynchronizedCounter:

        def __init__(self):
            self.counter = 0

        def increment(self):
            synchronize:
                self.counter += 1

Is your question what the unqualified lock is associated with: the instance, the class, the method, the counter, or something else? If that is your question, then the answer is I'm not sure, now that I've thought about it more deeply. Clearly the concept of the synchronization is around the counter, although now I can see no way to associate that implicitly. Perhaps this is why Java does not have unqualified synchronized blocks, and maybe I should remove them, in which case the previous code would be:

    class SynchronizedCounter:

        def __init__(self):
            self.counter = 0

        def increment(self):
            synchronize self.counter:
                self.counter += 1
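For reference, the explicit-lock version above runs essentially unchanged in current Python if the low-level thread module is swapped for the higher-level threading module. A quick sketch that also exercises it from several threads:

```python
import threading

class SynchronizedCounter:
    """Explicit-lock version of the counter the proposed syntax would replace."""

    def __init__(self):
        self.counter = 0
        self.counter_lock = threading.RLock()

    def increment(self):
        # acquire/try/finally guarantees the lock is released even if
        # the body raises -- exactly what 'synchronize' would automate.
        self.counter_lock.acquire()
        try:
            self.counter += 1
        finally:
            self.counter_lock.release()

c = SynchronizedCounter()
threads = [threading.Thread(target=lambda: [c.increment() for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(c.counter)  # 4000: no increments are lost
```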
I'd also like to see how 'asynchronize' works with condition variables, which seem to be the most common use for temporarily unlocking.
I will look into that.
(Your example of how code would do this without asynchronize has a bug, by the way; if the I/O operation raises an exception, the finally clause will attempt to release an already released lock.)
Yes I need another try/finally in there. Thanks.
I think the PEP would be clearer if it was considerably shorter and to the point, with fewer examples and a more exact specification.
Thanks for the advice, I will move my thinking in this direction. -Michel
On Mon, 16 Jun 2003, Michel Pelletier wrote:
    def __init__(self):
        self.counter = 0

    def increment(self):
        synchronize:
            self.counter += 1
What about just adding a parameter to the try statement?

    def increment(self):
        try self.counter_lock:
            self.counter += 1

This would save a keyword, and could also be used with an except clause, for example to handle synchronization outcomes for some kinds of locks:

    def increment(self):
        try self.counter_lock:
            self.counter += 1
        except SomeLockError:
            bla-bla-bla
Sincerely yours, Roman A.Suzi -- - Petrozavodsk - Karelia - Russia - mailto:rnd@onego.ru -
On Mon, 16 Jun 2003, Michel Pelletier wrote:
    def __init__(self):
        self.counter = 0

    def increment(self):
        synchronize:
            self.counter += 1
What about just adding a parameter to try operator?
    def increment(self):
        try self.counter_lock:
            self.counter += 1
Because I would like to remove the user-visible lock entirely. Your idea is similar, though, to PEP 310, which proposes a new keyword "with". I think using the try keyword for this is inappropriate.

-Michel
On Mon, 16 Jun 2003, Michel Pelletier wrote:
On Mon, 16 Jun 2003, Michel Pelletier wrote:
    def __init__(self):
        self.counter = 0

    def increment(self):
        synchronize:
            self.counter += 1
What about just adding a parameter to try operator?
    def increment(self):
        try self.counter_lock:
            self.counter += 1
Because I would like to remove the user-visible lock entirely. Your idea is similar, though, to PEP 310, which proposes a new keyword "with". I think using the try keyword for this is inappropriate.
But is that such a good idea? What if the critical section is in two or more different places at once? How will you deal with:

    def increment(self):
        try self.counter_lock:
            self.counter += 1

    def decrement(self):
        try self.counter_lock:
            self.counter -= 1

(Suppose it's not simple or elegant to do it in one place:

    def change(self, delta=1):
        try self.counter_lock:
            self.counter += delta

)

As for with, it could be added as follows:

    try with lock:
        lalalala
-Michel
Sincerely yours, Roman A.Suzi -- - Petrozavodsk - Karelia - Russia - mailto:rnd@onego.ru -
But is that such a good idea? What if the critical section is in two or more different places at once? How will you deal with:
    def increment(self):
        try self.counter_lock:
            self.counter += 1
    def increment(self):
        synchronize self.counter:
            self.counter += 1
    def decrement(self):
        try self.counter_lock:
            self.counter -= 1
    def decrement(self):
        synchronize self.counter:
            self.counter -= 1
(Suppose, it's not simple or elegant to do it in one place:
    def change(self, delta=1):
        try self.counter_lock:
            self.counter += delta
    def change(self, delta=1):
        synchronize self.counter:
            self.counter += delta

No explicit lock is necessary. Any object may be synchronized upon (except, perhaps, None). The first time an object is synchronized, a lock is invisibly associated with it behind the scenes; you cannot (and should not) access this lock. The lock exists for the life of the object it synchronizes. When a synchronize block is entered, the lock is acquire()d, and it is release()d when the block is exited. Very similar to the way Java does it:

http://java.sun.com/docs/books/jls/second_edition/html/statements.doc.html#2...

except that in addition I propose an 'asynchronize' keyword that is used inside a synchronized block to temporarily unlock it in order to do, for example, blocking I/O, or any other blocking operation that does not require synchronization.

-Michel
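The "invisible lock" association Michel describes could be prototyped in current Python with a weak-reference registry. The sketch below is illustrative only (the `_lock_for` helper and `Counter` class are not part of the PEP); it also shows one answer to Martin's question about targets like `"a"+"b"`: a weak-keyed registry only works for weak-referenceable objects, which strings and small integers are not.

```python
import threading
import weakref

_registry_guard = threading.Lock()
_locks = weakref.WeakKeyDictionary()   # object -> its hidden lock

def _lock_for(obj):
    # Lazily associate a lock with obj; the WeakKeyDictionary lets the
    # lock die with the object, matching "the lock exists for the life
    # of the object it synchronizes".
    with _registry_guard:
        lock = _locks.get(obj)
        if lock is None:
            lock = _locks[obj] = threading.RLock()
        return lock

class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        # stands in for:  synchronize self: self.count += 1
        with _lock_for(self):
            self.count += 1

c = Counter()
threads = [threading.Thread(target=lambda: [c.increment() for _ in range(500)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(c.count)  # 2000
```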
On Monday, Jun 16, 2003, at 09:19 Europe/Amsterdam, Michel Pelletier wrote:
No explicit lock is necessary. Any object may be synchronized upon (except, perhaps, None). The first time an object is synchronized, a lock is invisibly associated with it behind the scenes, you cannot (and should not) access this lock. The lock exists for the life of the object it synchronizes. When a synchronize block is entered, the lock is acquire()d and and release()d when the block is exited.
I think this is a bad idea, after pondering it for a while[*]. There will always be situations where you want to lock multiple objects, and before you know it you'll end up with extra objects that hold no data but only a lock. And then it would have been better to design the language feature that way in the first place. Explicit is better than implicit :-)

[*] I wondered for a while whether locking only a single object would maybe steer people away from potentially deadlocking code, but I believe it's the other way around: with explicit locks you actually have to think of the locks you need, whereas with implicit locks you don't, so you write deadlocking code more often.
--
Jack Jansen,
On Monday, Jun 16, 2003, at 09:19 Europe/Amsterdam, Michel Pelletier wrote:
No explicit lock is necessary. Any object may be synchronized upon (except, perhaps, None). The first time an object is synchronized, a lock is invisibly associated with it behind the scenes, you cannot (and should not) access this lock. The lock exists for the life of the object it synchronizes. When a synchronize block is entered, the lock is acquire()d and and release()d when the block is exited.
I think this is a bad idea, after pondering it for a while[*]. There will always be situations where you want to lock multiple objects,
Can you be more explicit? I'm not sure I understand. In Python as it is now, you cannot "lock" an object, only a lock (unless the object is like a condition variable, which proxies a lock). Any association between that lock and any number of objects is a concept that must be maintained in your head and visually in your code. PEP 319 proposes automating the simplest and most common cases of these associations.
and before you know it you'll end up with extra objects that hold no data but only a lock
What are the extra objects? If your objects are no longer necessary, they are garbage collected like all others, including their locks. Is your concern memory consumption?
And then it would have been better to design the language feature that way in the first place. Explicit is better than implicit :-)
[*] I wondered for a while whether locking only a single object would maybe steer people away from potentially deadlocking code, but I believe it's the other way around: with explicit locks you actually have to think of the locks you need, whereas with implicit locks you don't, so you write deadlocking code more often.
I believe the reverse: synchronize will reduce user error and deadlocking code. With explicit locks, programmers will forget, or become confused about, when and how to explicitly lock and unlock. 'synchronize' locks at the beginning of the block and unlocks at the end. There is no forgetting.

-Michel
Jack> There will always be situations where you want to lock multiple
Jack> objects,

Michel> Can you be more explicit? I'm not sure I understand.

I have a multi-threaded XML-RPC server which, among lots of other bits of data, maintains some "top 50" data (top 50 cities searched for, top 50 performers searched for, etc). Update time for that data is very fast relative to much of the other data maintained by the server. Rather than create a lock for each of the various "top 50" objects, I simply created a single top50_lock object and acquire and release it around manipulation of any of the various bits related to that stuff. Having a single lock means my code is simpler, at the possible extra cost of only allowing a single thread into a larger chunk of code. OTOH, had I created multiple locks, performance might actually have gotten worse due to lock acquisition/release overhead.

Obviously I could have done things differently. I could have coalesced all the top 50 data into a single object and locked it, or created a separate lock for each item. Still, I agree with Jack that there are plenty of situations where you use one lock to lock multiple objects. (Consider the Python GIL as another example. ;-)

Skip
Hi Skip!
Jack> There will always be situations where you want to lock multiple Jack> objects,
Michel> Can you be more explicit? I'm not sure I understand.
I have a multi-threaded XML-RPC server which, among lots of other bits of data maintains some "top 50" data (top 50 cities searched for, top 50 performers searched for, etc). Update time for that data is very fast relative to much of the other data maintained by the server. Rather than create a lock for each of the various "top 50" objects, I simply created a single top50_lock object and acquire and release it around manipulation of any of the various bits related to that stuff. Having a single lock means my code is simpler at the possible extra cost of only allowing a single thread into a larger chunk of code. OTOH, had I created multiple locks, performance might actually have gotten worse due to lock acquisition/release overhead.
Obviously I could have done things differently. I could have coalesced all the top 50 data into a single object and locked it or created a separate lock for each item.
    synchronize item:

would create a (hidden) lock for each item for you. Wouldn't this solve your problem of no two threads changing one item? Or do changes to any one top 50 item *require* locking all 50? If they are independent, then this is exactly the purpose PEP 319 serves.
Still, I agree with Jack that there are plenty of situations where you use one lock to lock multiple objects. (Consider the Python GIL as another example. ;-)
Isn't the interpreter one object in this case? Does the GIL lock anything else other than the interpreter? -Michel
On Mon, 16 Jun 2003, Michel Pelletier wrote:
Obviously I could have done things differently. I could have coalesced all the top 50 data into a single object and locked it or created a separate lock for each item.
synchronize item:
would create a (hidden) lock for each item for you. Wouldn't this solve your problem of no two threads changing one item? Or do changes to any one top 50 item *require* locking all 50? If they are independent, then this is exactly the purpose PEP 319 serves.
Oh no... I have not been thinking about locking objects, but about locking places in a program against multiple entry. I do not think the whole business of locking _objects_ is appropriate. And thus I think Michel should think about the implementation of his locking infrastructure. Even in high-level terms the implementation seems fuzzy and prone to misunderstandings of the semantics. This became clear to me when all this discussion of top50 was presented.

I think anonymous locking is a bad idea. Locking based on syntactic containment (?) is also problematic. The only obvious way is to have an explicit lock. Thus I understand that

    synchronize:
        lalala

means only that it could not be re-entered from another thread, and not that all objects encountered inside are "locked". This makes

    synchronize self.counter:
        self.counter += 1

look silly.

And I must also say that the examples from the PEP aren't convincing. All that implicit locking on target objects looks like magic. And as we know, explicit is better than implicit (*). This way I will never know why my program consumed so much memory and takes so long to do simple things: and the answer is implicit locks lurking here and there, with every object.

(* sorry for being the first person to use the word implicit in the PEP name ;-)

One more solution is to have the same target-object blocks, but with explicit instantiation:

    def lalala(self, queue):
        synchronize AdHocLock(queue):
            do_something(queue)

The main idea here is AdHocLock being a singleton which creates a lock and shepherds a dict of locks, the keys of which are, for example, weak refs to objects. So

    def lalala(self, number):
        synchronize AdHocLock(number):
            do_something(number)

is no contradiction. We can as well have a constant:

    def lalala3(self):
        synchronize AdHocLock(3):
            do_something()

This is not so good because it creates and maintains a global structure (the locks), and thus two modules could have a conflict over the same object id or number.
A much better solution, IMHO, is to use traditional explicit locks: no need for new keywords, no need to worry that objects will have a shadow of accompanying locks.

Sincerely yours, Roman Suzi -- rnd@onego.ru =\= My AI powered by GNU/Linux RedHat 7.3
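Roman's AdHocLock could be sketched in current Python roughly as follows. The implementation here is only a guess at his intent, and it uses a plain dict rather than weak refs, so keys live forever; that leak, plus the cross-module key-collision problem he notes, is exactly why he distrusts the approach.

```python
import threading

class AdHocLock:
    """Hand out one shared lock per key (a queue, a number, a constant...).

    Calling AdHocLock(key) twice with equal keys returns the *same* lock,
    so separate code blocks synchronizing on the same key exclude each
    other -- but so do unrelated modules that happen to pick the same key.
    """
    _guard = threading.Lock()   # protects the registry itself
    _locks = {}                 # key -> Lock

    def __new__(cls, key):
        with cls._guard:
            try:
                return cls._locks[key]
            except KeyError:
                lock = cls._locks[key] = threading.Lock()
                return lock

a = AdHocLock(3)
b = AdHocLock(3)
c = AdHocLock(4)
print(a is b, a is c)  # True False: same key, same lock; new key, new lock
```

Note that __new__ returns an existing Lock object rather than a fresh AdHocLock instance, which is what makes the class behave as the lock-dispensing singleton Roman describes.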
Roman Suzi
I have not thought about locking objects, but locking places of a program from multiple entry.
If we're to have implicit lock objects, this is perhaps the right way to think about it. But the locking needs to apply to more than just one code block. Consider

    def increment(self):
        synchronize:
            self.count += 1

    def decrement(self):
        synchronize:
            self.count -= 1

If the two statements are synchronized independently, this will not work. I think the right level to synchronize at is a whole class:

    synchronized class Counter:

        def __init__(self):
            self.count = 0

        def increment(self):
            self.count += 1

        def decrement(self):
            self.count -= 1

the semantics being that each instance of Counter gets its own lock object, and the lock is acquired before any method in it is entered and released when it is exited.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
>> I have a multi-threaded XML-RPC server which, among lots of other
>> bits of data maintains some "top 50" data (top 50 cities searched
>> for, top 50 performers searched for, etc). Update time for that data
>> is very fast relative to much of the other data maintained by the
>> server.

Michel> synchronize item:
Michel> would create a (hidden) lock for each item for you. Wouldn't
Michel> this solve your problem of no two threads changing one item? or
Michel> do changes to any one top 50 item *require* locking all 50? If
Michel> they are independent this is exactly the purpose PEP 319
Michel> serves.

I can see I wasn't clear in my original post. Let me be more concrete. I have a class with several attributes which store information about the top 50 searches for musicians and cities on the Mojam and Musi-Cal websites:

    class DBServer(genericserver.GenericServer):
        doratings = 1
        poolsize = 5

        def __init__(self, address, handlerclass):
            genericserver.GenericServer.__init__(self, address, handlerclass)
            self.conn_pool = Queue.Queue(self.poolsize)
            self.init_locks()
            ...
            self.top_50_perfs = {}
            self.top_50_cities = {}
            self.top_50_mojam = {}
            self.top_50_musi_cal = {}
            ...

Instead of creating a lock to protect each of those four objects, I create a single lock for that purpose:

    def init_locks(self):
        self.sql_lock = threading.RLock()
        self.query_lock = threading.RLock()
        self.top_50_lock = threading.RLock()
        self.namemap_lock = threading.RLock()
        self.dump_lock = threading.RLock()

Your proposal would suggest I (implicitly) create a lock for each of those top_50_* dictionaries. I think my code would be more complex. I think this is precisely the sort of case Jack was alluding to with his "one lock, multiple objects" case. In this case it's overkill for me to create separate locks for each object, because access times for those data are fast. There isn't likely to be any contention for that lock.
For my applications I would be more than happy with a more succinct (and safe) way to write:

    lock.acquire()
    try:
        block
    finally:
        lock.release()

I don't really care what the syntax is, but I think implicit per-object locks are unnecessary.

>> Still, I agree with Jack that there are plenty of situations where
>> you use one lock to lock multiple objects. (Consider the Python GIL
>> as another example. ;-)

Michel> Isn't the interpreter one object in this case? Does the GIL
Michel> lock anything else other than the interpreter?

Depends on what level you look at it. Sure, the interpreter is a single object, but it's a very complex object which contains lots of other subobjects. There are lots of places in the Python code where assumptions are made that, because the GIL is being held, the (in)consistency of a particular object at that point in time isn't crucial. You know no other thread can access that object right then, because the GIL is held by the currently executing thread. As long as the object's state is made consistent by the time you release the GIL, you're golden. In essence, the GIL is Jack's "one lock, multiple objects" case taken to the extreme.

Skip
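In hindsight, the succinct form Skip asks for is essentially what the with statement (PEP 343) later provided: lock objects became context managers, so the acquire/try/finally/release dance collapses to one line. For comparison:

```python
import threading

lock = threading.Lock()
data = []

# The verbose pattern Skip describes:
lock.acquire()
try:
    data.append(1)
finally:
    lock.release()

# The later standard spelling: acquire on entry, release on exit,
# exception-safe, with no new keyword beyond 'with'.
with lock:
    data.append(2)

print(data)  # [1, 2]
```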
Missed this on the first read:

Michel> with explicit locks programmers will forget, or become confused,
Michel> with when and how to explicitly lock and unlock. 'synchronize'
Michel> locks at the beginning of the block and unlocks at the end.
Michel> There is no forgetting.

You still need to remember to 'synchronize' access to the data. That's the bigger problem in my mind. It seems to me that the more locks I need to manage, the harder it will be to identify potential deadlock situations. With fewer locks (I use five RLock objects and a Queue in my XML-RPC server), I think it's easier to compartmentalize functionality in my feeble brain and avoid deadlock, with some potential loss of execution overlap.

Skip
Michel Pelletier
I belive the reverse, synchronize will reduce user error and deadlocking code. with explicit locks programmers will forget, or become confused, with when and how to explicitly lock and unlock. 'synchronize' locks at the beginning of the block and unlocks at the end. There is no forgetting.
It may prevent you from forgetting to release a lock, which is probably a very useful thing, but there are other ways that deadlocks can occur, such as when you need to acquire multiple locks and aren't careful what order you acquire them in. Not having to think explicitly about locks may increase the frequency of that kind of problem.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
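One standard defence against the opposite-order deadlock Greg mentions is to always acquire multiple locks in a single global order. A sketch (`acquire_in_order` is an illustrative helper, not a library function; ordering by `id()` is just one possible total order):

```python
import threading

def acquire_in_order(*locks):
    # Sort by id() so every caller takes the same locks in the same
    # global order; two threads can then never hold them in opposite
    # orders, which removes the classic two-lock deadlock.
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release_all(ordered):
    # Release in reverse acquisition order.
    for lock in reversed(ordered):
        lock.release()

a, b = threading.Lock(), threading.Lock()

first = acquire_in_order(a, b)
release_all(first)
second = acquire_in_order(b, a)   # arguments deliberately swapped
release_all(second)
print(first == second)  # True: both calls locked in the same order
```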
Most points have been addressed by Skip and Greg and others already, but there's one I'd like to elaborate a little on: On Monday, Jun 16, 2003, at 17:16 Europe/Amsterdam, Michel Pelletier wrote:
[*] I wondered for a while whether locking only a single object would maybe steer people away from potentially deadlocking code, but I believe it's the other way around: with explicit locks you actually have to think of the locks you need, whereas with implicit locks you don't, so you write deadlocking code more often.
I believe the reverse: synchronize will reduce user error and deadlocking code. With explicit locks, programmers will forget, or become confused about, when and how to explicitly lock and unlock.
The problem is if one piece of code has

    synchronise a:
        ...
        synchronise b:
            ...

and somewhere else you have

    synchronise b:
        ...
        synchronise a:
            ...

If you use fine-grained locking this is something you always have to be aware of. In C-class languages it requires only discipline (don't call any subroutines outside of your module while holding a lock, basically); in Python you can forget it, because every single statement or expression can be calling out all over the place.
Note that this same problem turns up with Greg's "synchronised class"
idea: if the
language makes locking easy people will overuse it and it will come
back to bite you
(or, probably, a user of your module).
--
Jack Jansen,
On Tuesday 17 June 2003 04:01, Jack Jansen wrote:
but there's one I'd like to elaborate a little on:
Michel Pelletier said:
I believe the reverse: synchronize will reduce user error and deadlocking code. With explicit locks, programmers will forget, or become confused about, when and how to explicitly lock and unlock.
The problem is if one piece of code has

    synchronise a:
        ...
        synchronise b:
            ...

and somewhere else you have

    synchronise b:
        ...
        synchronise a:
            ...
As you pointed out, the programmer must be aware of this when synchronizing with any mechanism. Would you prefer the manual say "don't do the above" or "don't do the below":

    lock1 = thread.allocate_lock()
    lock2 = thread.allocate_lock()

    lock1.acquire()
    try:
        lock2.acquire()
        try:
            ...
        finally:
            lock2.release()
    finally:
        lock1.release()

    # and somewhere else you have

    lock2.acquire()
    try:
        lock1.acquire()
        try:
            ...
        finally:
            lock1.release()
    finally:
        lock2.release()

Maybe I did this wrong, but aren't the two (and Greg's "synchronized class") all susceptible to this problem, and isn't it therefore not specifically a failure of the 'synchronize' keyword? Thanks for your comments, Jack; I'm going to add this to the discussion section of the PEP. -Michel
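For comparison, the explicit-lock nesting Michel shows maps directly onto the 'with' statement that later Python versions grew; it guarantees release just as 'synchronize' would, but the ordering hazard is unchanged. A minimal sketch:

```python
import threading

lock1 = threading.Lock()
lock2 = threading.Lock()
log = []

# Equivalent of the nested try/finally pattern: each 'with' releases
# its lock on exit, even if the body raises.  Consistent acquisition
# order across the program is still the programmer's job.
with lock1:
    with lock2:
        log.append("critical section ran")

print(log[0])  # -> critical section ran
```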
On dinsdag, jun 17, 2003, at 21:36 Europe/Amsterdam, Michel Pelletier wrote:
Maybe I did this wrong, but aren't the two (and Greg's "synchronized class") all susceptible to this problem, and isn't it therefore not specifically a failure of the 'synchronize' keyword?
Yes, all mechanisms are susceptible to the same problem; they're probably all three functionally equivalent (i.e. anything you can code with one you can code with the other).

The point I'm trying to make is that designing your locks is hard work, especially if there are many locks. Let's for the sake of argument say that the amount of work to get things right is quadratic in the number of locks. This means that any language construct that invites people to create many locks will make it more difficult to get the code right.

I realise the argument I make sounds pedantic (let's not make it too easy to do locking, so that only people who know what they're doing will use locking), but that's the way I actually do feel about the subject.
--
- Jack Jansen
Note that this same problem turns up with Greg's "synchronised class" idea: if the language makes locking easy people will overuse it and it will come back to bite you
I don't think that should be used as a reason not to provide higher-level facilities such as synchronised classes. It just means we need to keep clearly in mind that they're not a magical solution that enables people to write threaded code without understanding the issues. They will be tools for knowledgable users, not substitutes for said knowledge, and they should be documented as such. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
On Wednesday, Jun 18, 2003, at 01:17 Europe/Amsterdam, Greg Ewing wrote:
Note that this same problem turns up with Greg's "synchronised class" idea: if the language makes locking easy people will overuse it and it will come back to bite you
I don't think that should be used as a reason not to provide higher-level facilities such as synchronised classes. It just means we need to keep clearly in mind that they're not a magical solution that enables people to write threaded code without understanding the issues.
I had a half-baked idea yesterday that I'd like to dump here. I haven't thought it over, so it's probably bogus anyway, but still, here goes :-)

The main deadlock problem is acquiring a second lock while you already hold a lock. But completely disallowing this in the runtime system is overly restrictive. So, what we need is knowledge about which locks we can acquire and which ones we can't. In other words, if there is a statement

    synchronize lock:
        code

then while inside "code" you will not be allowed to do any more synchronize statements. (I'm re-using the synchronize statement here, but the object is really a lock, not a general object, and I'm not interested in syntax right now. Also, this idea would work as well, or maybe even better, with Greg's synchronized class scheme.) Whether an attempt to do so results in an error or warning I'm not sure.

If the author of the code has investigated the interaction between "lock", "otherlock" and "thirdlock" and believes the interaction is safe, then you use the form

    synchronize lock allowing otherlock, thirdlock:
        code

and "code" will be allowed to synchronize on the other two locks.

This scheme puts the annotation at the outer lock; an alternative would be to put it at the inner lock. That has its advantages too, because if you know your code does not block nor acquire locks you can say something like "synchronize lock allow_holding *". A third alternative would be to completely decouple the annotation from the locking code, with a statement that says "Locks lock, otherlock and thirdlock are mutually safe".
--
Jack Jansen,
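Jack's "allowing" annotation can be approximated today as a runtime check. The sketch below is purely illustrative (`GuardedLock` and its `allowing` method are invented names, not an existing API): while a guarded lock is held, the thread may only acquire locks whose names were declared safe.

```python
import threading

_held = threading.local()  # per-thread stack of locks currently held

class GuardedLock:
    """Toy model of 'synchronize lock allowing other, ...': while this
    lock is held, the thread may only acquire locks declared safe."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self.allowed = set()

    def allowing(self, *names):
        self.allowed = set(names)
        return self

    def __enter__(self):
        stack = getattr(_held, "stack", [])
        if stack and self.name not in stack[-1].allowed:
            raise RuntimeError(
                "acquiring %r while holding %r was not declared safe"
                % (self.name, stack[-1].name))
        self._lock.acquire()
        stack.append(self)
        _held.stack = stack
        return self

    def __exit__(self, *exc):
        _held.stack.pop()
        self._lock.release()
        return False

lock = GuardedLock("lock").allowing("otherlock")
other = GuardedLock("otherlock")

with lock:          # the outer lock declares which nestings are safe
    with other:     # permitted: "otherlock" was listed
        print("nested acquisition declared safe")
```

An undeclared nesting raises immediately instead of risking a deadlock later, which matches Jack's "error or warning" question.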
On Mon, Jun 16, 2003 at 02:19:19AM -0500, Michel Pelletier wrote:
    def change(self, delta=1):
        synchronize self.counter:
            self.counter += delta   # typo corrected by jepler
No explicit lock is necessary. Any object may be synchronized upon (except, perhaps, None). The first time an object is synchronized, a lock is invisibly associated with it behind the scenes; you cannot (and should not) access this lock. The lock exists for the life of the object it synchronizes. When a synchronize block is entered, the lock is acquire()d and release()d when the block is exited.
I don't see how this can possibly work. It looks like self.counter is an int, so the statement

    synchronize self.counter:
        ...

must be using some particular int (say, 3) for purposes of synchronization. What sense does this make? (And where can you store the lock, since an int is immutable and can't have new attributes created?)

On the other hand, if the thing you're locking is the counter attribute of the slot (and ignoring for a moment where the lock is stored), then what if self.counter is a list but id(self.counter) == id(globallist)? Then the 'synchronize' will not prevent these two snippets from executing at the same time:

    def change(self, delta=1):
        synchronize self.counter:
            for i in range(delta):
                self.counter.append(i)

    synchronize globallist:
        globallist.pop()

globallist.pop() could now see a different item than delta-1.

My other question concerns the 'asynchronize' portion of your proposal. Is this from Java, or is it your own innovation? I didn't turn up anything about it in several web searches, but I'm not familiar enough with Java to know for sure. Jeff
On Mon, Jun 16, 2003 at 02:19:19AM -0500, Michel Pelletier wrote:
    def change(self, delta=1):
        synchronize self.counter:
            self.counter += delta   # typo corrected by jepler
No explicit lock is necessary. Any object may be synchronized upon (except, perhaps, None). The first time an object is synchronized, a lock is invisibly associated with it behind the scenes; you cannot (and should not) access this lock. The lock exists for the life of the object it synchronizes. When a synchronize block is entered, the lock is acquire()d and release()d when the block is exited.
I don't see how this can possibly work. It looks like self.counter is an int, so the statement

    synchronize self.counter:
        ...

must be using some particular int (say, 3) for purposes of synchronization. What sense does this make?
Hmm good point, integer objects are a special case, they are shared and are thus a bad example. Perhaps only instances should be synchronizable.
(and where can you store the lock, since an int is immutable and can't have new attributes created?)
That's up to the implementation. Lock association does not affect mutability.
On the other hand, if the thing you're locking is the counter attribute of slot (and ignoring for a moment where the lock is stored) then what if self.counter is a list but id(self.counter) == id(globallist)?
If they have the same id() they are the same object and thus have the same associated lock. The below code will be prevented from executing at the same time.
Then the 'synchronize' will not prevent these two snippets from executing at the same time:

    def change(self, delta=1):
        synchronize self.counter:
            for i in range(delta):
                self.counter.append(i)

    synchronize globallist:
        globallist.pop()

globallist.pop() could now see a different item than delta-1.
My other question concerns the 'asynchronize' portion of your proposal. Is this from Java, or is it your own innovation? I didn't turn up anything about it in several web searches, but I'm not familiar enough with java to know for sure.
Yep, that's my idea; though I doubt there's no precedent for it. -Michel
I don't see how this can possibly work. It looks like self.counter is an int, so the statement

    synchronize self.counter:
        ...

must be using some particular int (say, 3) for purposes of synchronization. What sense does this make?
Hmm good point, integer objects are a special case, they are shared and are thus a bad example. Perhaps only instances should be synchronizable.
(and where can you store the lock, since an int is immutable and can't have new attributes created?)
That's up to the implementation. Lock association does not affect mutability.
I should add that I am experimenting with this in Jython, not CPython, which is why I said it's up to the implementation. I imagine CPython would add some uninitialized behind-the-scenes pointer to a lock object and lazily initialize and lock it when the object is first synchronized. Any subsequent asynchronizing or synchronizing would use this lock until the object is garbage collected.
On the other hand, if the thing you're locking is the counter attribute of slot (and ignoring for a moment where the lock is stored) then what if self.counter is a list but id(self.counter) == id(globallist)?
If they have the same id() they are the same object and thus have the same associated lock. The below code will be prevented from executing at the same time.
Ah, I'm going over all the emails I got today for my next revision, and sorry I missed where you said "attribute of the slot" the first time I read it. You meant, I gather, synchronizing on the slot and not the value it stores. Sorry to confuse things. I do not think synchronizing on the slot is the right thing (as I said above). Thanks for everyone's comments; please keep sending them if you have them. -Michel
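The identity-based association Michel describes -- same id(), same hidden lock -- can be sketched as an explicit registry. This is an illustrative toy, not the PEP's proposed implementation (`lock_for` is an invented name, and this version never reclaims entries when objects die):

```python
import threading

_locks = {}
_registry_lock = threading.Lock()

def lock_for(obj):
    # Same object (same id) -> same lock, which is why the two snippets
    # in Jeff's example WOULD serialize when
    # id(self.counter) == id(globallist).
    with _registry_lock:
        return _locks.setdefault(id(obj), threading.Lock())

globallist = []
counter = globallist  # an alias: identical object, identical id
print(lock_for(counter) is lock_for(globallist))  # -> True
```

A real implementation would also tie each lock's lifetime to its object, as the PEP requires ("the lock exists for the life of the object").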
Michel Pelletier
Hmm good point, integer objects are a special case, they are shared and are thus a bad example. Perhaps only instances should be synchronizable.
In the presence of new-style classes, how do you define an "instance"? Anyway, the problem here isn't what kind of object was used, it's the way the programmer used it (i.e. locking on an object that wasn't going to stay around for the duration of the operation). I can't really see a way of preventing this kind of stupidity by restricting what types of objects can be locked -- you can lose a reference to any type of object. Maybe this is an argument against having implicit lock objects? If the programmer has to explicitly create and keep track of the lock object, he might look after it a bit more carefully. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
I'd also like to see how 'asynchronize' works with condition variables, which seem to be the most common use for temporarily unlocking.
Hmm... I think the 'synchronize' keyword would make condition variables simpler, because they would not need to be associated with their own lock, or rather, the lock currently associated with them would not need to be used. Given the example in http://www.python.org/doc/current/lib/condition-objects.html the pseudo-code would become:

    # Consume one item
    synchronize cv:
        while not an_item_is_available():
            cv.wait()
        get_an_available_item()

    # Produce one item
    synchronize cv:
        make_an_item_available()
        cv.notify()

There is a problem here, however, and I *think* this is the question you are asking. The manual states:

"The wait() method releases the lock, and then blocks until it is awakened by a notify() or notifyAll() call for the same condition variable in another thread. Once awakened, it re-acquires the lock and returns. It is also possible to specify a timeout."

Is the question you're asking: how does 'wait()' unlock a hidden lock? (FYI, Java does it by making all objects condition variables; wait, notify, and notifyAll are methods of java.lang.Object.) Perhaps a __x__ method could provide access to the "hidden" lock for wait, and asynchronize would not be used. Or, wait() could be changed to do nothing with the lock and simply wait() inside an asynchronize block:

    # Consume one item
    synchronize cv:
        while not an_item_is_available():
            asynchronize:
                cv.wait()
        get_an_available_item()

-Michel
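For reference, the pattern in the pseudo-code above is what the stdlib's Condition object already provides: wait() releases the condition's underlying lock while blocking and re-acquires it before returning, so no separate 'asynchronize' is needed there. A minimal runnable version of the consume/produce example:

```python
import threading

items = []
cv = threading.Condition()

def consume_one():
    with cv:                 # plays the role of 'synchronize cv:'
        while not items:     # re-check the predicate after each wakeup
            cv.wait()        # releases cv's lock while blocked
        return items.pop(0)

def produce_one(item):
    with cv:
        items.append(item)
        cv.notify()

producer = threading.Thread(target=produce_one, args=("spam",))
producer.start()
got = consume_one()
producer.join()
print(got)  # -> spam
```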
participants (12)
- Aahz
- andrew cooke
- Greg Ewing
- Guido van Rossum
- Jack Jansen
- Jeff Epler
- kbk@shore.net
- martin@v.loewis.de
- Michel Pelletier
- michel@dialnetwork.com
- Roman Suzi
- Skip Montanaro