
This is a blue-sky idea. It can't happen in Python 3.x, and possibly not ever in CPython. I'm mostly hoping to get smarter people than me considering the issue.

Synopsis

Python, as a general rule, tries to be "safe" about things. If something isn't obviously correct, it tends to throw runtime errors to let you know that you need to be explicit about what you want. When there's not an obvious choice for type conversions, it raises an exception. You generally don't have to worry about resource allocation if you stay in Python. And so on.

The one glaring exception is in concurrent programs. While the tools Python has for dealing with concurrency are OK, there isn't anything to warn you when you fail to use those tools but should have. The goal of this proposal is to fix that, and get the Python interpreter to help locate code that isn't safe to use in concurrent programs.

Existence Proof

This is possible. Clojure is a dynamic language in the LISP family that will throw exceptions if you try mutating variables without properly protecting them against concurrent access. This is not to say that the Clojure solution is the solution, or even the right solution for Python. It's just to demonstrate that this can be done.

Object Changes

Object semantics don't need to change very much. The existing immutable types will work well in this environment exactly as is. The mutable types - well, we can no longer go changing them willy-nilly. But any language needs mutable types, and there's nothing wrong with the ones we have.

Since immutable types don't require access protection *at all*, it might be worthwhile to add a new child of object, "immutable". Instances of this type would be immutable after creation. Presumably, the __new__ method of Python classes inheriting from immutable would be used to set the initial attributes, but the __init__ method might also be able to handle that role. However, this is a performance tweak, allowing user-written classes to skip any runtime checks for being mutated.

Binding Changes

One of the ways objects are mutated is by changing their bindings. As such, some of the bindings might need to be protected.

Local variables are fine. We normally can't export those bindings to other functions, just the values bound to them, so changing the binding can stay the same. The bound object can be exported to other threads of execution, but changing it will fall under the rules for changing objects. Ditto for nonlocals.

On the other hand, rebindings of module, class and instance variables can be visible in other threads of execution, so they require protection, just like changing mutable objects.

New Syntax

The protection mechanism is the change to the language. I propose a single new keyword, "locking", that acts similar to the "try" keyword. The syntax is:

    'locking' value [',' value]* ':' suite

The list of values are the objects that can be mutated in this lock. An immutable object showing up in the list of values is a TypeError. It's not clear that function calls should be allowed in the list of values. On the other hand, indexing and attribute access clearly should be, and those can turn into function calls, so it's not clear they shouldn't be allowed, either.

The locked values can be mutated during the body of the locking suite. For the builtin mutable types, this means invoking their mutating methods. For modules, classes and object instances, it means rebinding their attributes. Locked objects stay locked during function invocations in the suite. This means you can write utility functions that expect to be passed locked objects to mutate.

A locking statement can be used inside another locking statement. See the Implementation section for possible restrictions on this.

Any attempt to mutate an object that isn't currently locked will raise an exception - possibly ValueError, possibly a new exception class just for this purpose. This includes rebinding attributes of objects that aren't locked.

Implementation

There are at least two ways this can be implemented, both with different restrictions on the suite. While both of them can probably be optimized if it's known that there are no other threads of execution, checking for attempts to mutate unlocked objects should still happen.

1) Conventional locking

All the objects being locked have locks attached to them, which are acquired before entering the suite. The implementation must order the locked objects in some repeatable way, so that two locking statements that have more than one locked object in common will obtain the locks on those objects in the same order. This will prevent deadlocks.

This method requires that the initial locking statement lock all objects that may be locked during the execution of its suite. This may be a reason for allowing functions as locking values, as a way to get locks on objects that code called in the suite is going to need. Another downside is that the programmer needs to handle exceptions raised during the suite to ensure that a set of related changes leaves the relevant objects in a consistent state. In this case, an optional 'except' clause should be added to the locking statement to hold such code.

2) Software Transactional Memory

In an STM implementation, copies of the locked objects are created by the locking statement, and the originals are "fingerprinted" in some way. The locking suite then runs. When the suite completes, the fingerprints of the originals are checked to see if some other thread of execution has changed them. If they haven't changed, they are replaced by the copies, and execution continues. If the originals have changed, the entire process starts over. In this implementation, the only actual locking is during the original fingerprinting (to ensure that a consistent state is captured) and at the end of the suite. FWIW, this is one of the models provided by Clojure.

The restriction on the suite in this case is that running it twice - except for changes to the locked objects - needs to be acceptable. In exchange, exceptions don't need to be handled by the programmer to ensure consistency: if an exception happens during the execution of the suite, the original values are never replaced.

<mike

--
Mike Meyer <mwm@mired.org>  http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
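[Editorial note: as a concreteness check on implementation option 1, here is a minimal sketch of the ordered-lock-acquisition idea expressed as a plain context manager rather than new syntax (close to the simulation Jim Jewett suggests later in the thread). Everything here is hypothetical: the per-object lock registry, the locking() name, and keying by id() are illustration choices, not part of the proposal, and the sketch only shows the deadlock-avoidance half, not the "error on unlocked mutation" checks.]

    import threading
    from contextlib import contextmanager

    _locks = {}                          # id(obj) -> Lock; a real implementation
    _registry_guard = threading.Lock()   # would attach the lock to the object itself

    def _lock_for(obj):
        with _registry_guard:
            return _locks.setdefault(id(obj), threading.Lock())

    @contextmanager
    def locking(*objects):
        # Acquire locks in a single repeatable order (here: by id()) so that
        # two overlapping locking statements can't deadlock on each other.
        locks = [_lock_for(obj) for obj in sorted(objects, key=id)]
        for lock in locks:
            lock.acquire()
        try:
            yield
        finally:
            for lock in reversed(locks):
                lock.release()

    # Usage sketch: both suites lock the same two objects, always in id()
    # order, so they serialize instead of deadlocking.
    #
    #     with locking(account_a, account_b):
    #         account_a["balance"] -= 10
    #         account_b["balance"] += 10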

On Mon, Oct 31, 2011 at 1:11 PM, Mike Meyer <mwm@mired.org> wrote:
This will basically run into the same problem that free-threading-in-CPython concepts do - the fine grained checks you need to implement it will kill your single-threaded performance. Since Python is a scripting language that sees heavy single-threaded use, that's not an acceptable trade-off. Software transactional memory does offer some hope for a more reasonable alternative, but that has its own problems (mainly I/O related). It will be interesting to see how PyPy's experiments in this space pan out. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Oct 30, 2011 at 8:21 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
This argument seems familiar. Oh, right, it's "lack of performance will kill you." That was given as the reason that all of the following were unacceptable:

- High level languages.
- Byte-compiled languages.
- Structured programming.
- Automatic memory management.
- Dynamic typing.
- Object Oriented languages.

All of those are believed (at least by their proponents) to make programming easier and/or faster at the cost of performance. The performance cost was "too high" for all of them when they were introduced, but they all became broadly accepted as the combination of increasing computing power (especially CPU support for them) and increasingly efficient implementation techniques drove that cost down to the point where it wasn't a problem except in very special cases.
Since Python is a scripting language that sees heavy single-threaded use, that's not an acceptable trade-off.
Right - few languages manage to grow one of those features without a name change of some sort, much less two (with the obvious exception of LISP). Getting them usually requires moving to a new language. That's one reason I said it might never make it into CPython. But the goal is to get people to think about fixing the problems, not dismiss the suggestion because of problems that will go away if we just wait long enough. For instance, the issue of single-threaded performance can be fixed by taking threading out of a library, and giving control of it to the interpreter. This possibility is why I said "thread of execution" instead of just "thread." If the interpreter knows when an application has concurrent execution, it also knows when there aren't any, so it can support an option not to make those checks until the performance issues go away.
Right - you can't do I/O inside a transaction. For writes, this isn't a problem. For reads, it is, since they imply binding and/or rebinding. So an STM solution may require a second mechanism designed for single statements to allow reads to happen. <mike

On Mon, 31 Oct 2011 10:59:56 -0700 Mike Meyer <mwm@mired.org> wrote:
Agreed, but killing performance by double digits in a new release is generally considered quite ugly by users. Also, I'm not convinced that such approaches really bring anything. My opinion is that good multi-threaded programming is achieved through careful abstraction and separation of concerns, rather than advanced language idioms. Regards Antoine.

On Mon, Oct 31, 2011 at 11:25 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
That may be why languages rarely adopt such features. The users won't put up with the cost hit for the development versions. Except for LISP, of course, whose users know the value of everything but the cost of nothing :-).
Doesn't that cover all kinds of good programming? But advanced language features are there because they are supposed to help with either abstraction or separation of concerns. Look at the list I presented again:

- High level languages.
- Byte-compiled languages.
- Structured programming.
- Automatic memory management.
- Dynamic typing.
- Object Oriented languages.

All help with either abstraction or separation of concerns in some way (OK, for byte-compilation the concerns are external, in that it's code portability). So do the features I'd like to see. In particular, they let you separate code in which concurrency is a concern from code where it isn't.

Another aspect of this issue (and yet another possible reason that these features show up in new languages rather than being added to old ones) is that such changes usually require changing the way you think about programming. It takes a different mindset to program with while loops than with if & goto, or OO than procedural, or .... Similarly, it takes a different mindset to program in a language where changing an object requires special consideration. This may be too much of a straitjacket for a multi-paradigm language like Python (though Oz manages it), but making the warnings ignorable defeats the purpose. <mike

On Mon, Oct 31, 2011 at 12:14 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Did you not read the next paragraph, the one where I explained how it helped separate issues? But you're right. Manually listing them isn't all that desirable, it's just better than what we have now. I initially saw a locking keyword with no list as implying that all such things should be locked automatically. I can't see how to do that with the locking implementation, so I dropped it. However, it can be done with an STM implementation. Hmm. That could be the distinguishing feature to deal with IO: If you don't have the list, you get an STM, and it does the copy/fingerprint dance when you mutate something. If you list a value, you get real locks so it won't retry and you can safely do IO. <mike
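[Editorial note: for what it's worth, the copy/fingerprint/retry dance of the STM option can be sketched in a few lines of present-day Python. This is only a toy to make the control flow concrete: the Cell class, the version counter standing in for a "fingerprint", and the atomically() helper are all invented for illustration, and the suite must tolerate running more than once, exactly as the proposal says.]

    import threading

    class Cell:
        """Hypothetical container for one piece of shared, transactionally updated state."""
        def __init__(self, value):
            self.value = value
            self.version = 0                # stands in for the "fingerprint"
            self.commit_lock = threading.Lock()

    def atomically(cell, suite):
        """Run suite(old_value) -> new_value until it commits without interference."""
        while True:
            with cell.commit_lock:          # capture a consistent snapshot
                seen, working = cell.version, cell.value
            result = suite(working)         # note: may execute more than once
            with cell.commit_lock:
                if cell.version == seen:    # nobody else committed meanwhile
                    cell.value = result
                    cell.version += 1
                    return result
            # Someone else committed first: retry against the fresh state.

    # Usage sketch:
    #     balance = Cell(0)
    #     atomically(balance, lambda b: b + deposit)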

Hi, 2011/10/31 Mike Meyer <mwm@mired.org>
PyPy offers a nice platform to play with this kind of concept. For example, even if it's not the best implementation, it's easy to add a __setattr__ to the base W_Object class which will check whether the object is allowed to mutate. But you will certainly open a can of worms here: even immutable objects are modified (e.g. str.__hash__ is cached), and many functions that you call will need to add their own locks; is it possible to avoid deadlocks in this case? -- Amaury Forgeot d'Arc

On Mon, Oct 31, 2011 at 7:33 AM, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
Just what I need - another project. I'll go take a look at PyPy. In theory, things that don't change the externally visible behavior of an object don't need to be checked. The goal isn't to make the language perfectly safe, it's to make you be explicit about when you want to do something that might not be safe in a concurrent environment. This provides a perfect example of where an immutable subclass would be useful. It would add a __setattr__ that throws an exception. Then the string class (which would inherit from the immutable subclass) could cache its hash by doing something like W_Object.__setattr__(self, "_cached_hash", hash(self)) (possibly locked). <mike
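[Editorial note: to make the "immutable subclass" idea a bit more concrete, here is a rough pure-Python sketch. The Immutable name, the Point example, and the _cached_hash attribute are all hypothetical; a real version would presumably use a flag in the object header rather than a Python-level __setattr__.]

    class Immutable(object):
        """Hypothetical base class: instances reject attribute rebinding after creation."""
        def __setattr__(self, name, value):
            raise TypeError("%s instances are immutable" % type(self).__name__)

        def __delattr__(self, name):
            raise TypeError("%s instances are immutable" % type(self).__name__)

    class Point(Immutable):
        def __new__(cls, x, y):
            self = object.__new__(cls)
            # __new__ sets the initial state by bypassing the guard above.
            object.__setattr__(self, "x", x)
            object.__setattr__(self, "y", y)
            return self

        def __hash__(self):
            # Caching the hash doesn't change externally visible behavior,
            # so it bypasses __setattr__ the same way str's C-level cache would.
            try:
                return object.__getattribute__(self, "_cached_hash")
            except AttributeError:
                h = hash((self.x, self.y))
                object.__setattr__(self, "_cached_hash", h)
                return h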

On Sun, Oct 30, 2011 at 11:11 PM, Mike Meyer <mwm@mired.org> wrote:
The one glaring exception is in concurrent programs. ...
Object semantics don't need to change very much. The existing immutable types will work well in this environment exactly as is.
I think a state bit in the object header would be more than justified, if we could define immutability. Are strings immutable? Do they become immutable after the hash is cached and the string is marked Ready? Can a subtype of strings with mutable attributes (that are not involved in comparison?) still be considered immutable?
'locking' value [',' value]*':' suite
... The list of values are the objects that can be mutated in this lock.
You could already simulate this with a context manager ... it won't give you all the benefits (though with a lint-style tool, it might), but it will showcase the costs. (In terms of code beauty, not performance.) Personally, I think those costs would be too high, given the current memory model, but I suppose that is at least partly an empirical question. If there isn't a way to generate this list automatically, it is the equivalent of manual memory management. -jJ

On Mon, Oct 31, 2011 at 1:47 PM, Jim Jewett <jimjjewett@gmail.com> wrote:
No, it isn't. The difference is in how mistakes are handled. With manual memory management, references through unassigned or freed pointers are mistakes, but may not generate an error immediately. In fact, it's possible the program will run fine and pass all your unit tests. This is the situation we have now in concurrent programming: mutating a shared object without an appropriate lock is an error that probably passes silently, and it may well pass all your tests without a problem (constructing a test to reliably trigger such a bug is an interesting problem in and of itself). While you can automatically manage memory, there are other resources that still have to be managed by hand (open files spring to mind). In some cases you might be able to handle them completely automatically, in others not. In either case, Python manages things so that reading from a file that hasn't been opened is impossible, and reading from one that has been closed generates an immediate error. The goal here is to move from where we are to a place similar to where file handling is, so that failing to properly deal with the possibility of concurrent access causes an error when it happens, not at a point distant in both time and space. BTW, regarding the performance issue: I figured out how to implement this so that the run-time cost is zero aside from the lock & unlock steps. <mike

Mike Meyer wrote:
I don't think what you're suggesting would achieve this, though. The locking required for correctness often involves more than one object or more than one operation on an object. Consider

    new_balance = balance + deposit
    lock(balance)
    balance = new_balance
    unlock(balance)

This wouldn't trigger any of your alarms, but it would still be wrong. -- Greg
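[Editorial note: Greg's point can be restated in runnable form with an ordinary threading.Lock standing in for the proposed statement; the deposit_* names and module-level balance are invented for illustration. The read has to sit inside the same critical section as the write, or two concurrent deposits can overwrite each other.]

    import threading

    balance = 0
    balance_lock = threading.Lock()

    def deposit_wrong(amount):
        global balance
        new_balance = balance + amount   # read happens outside the lock...
        with balance_lock:
            balance = new_balance        # ...so this can silently drop a concurrent deposit

    def deposit_right(amount):
        global balance
        with balance_lock:               # read-modify-write as one critical section
            balance = balance + amount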

On Mon, Oct 31, 2011 at 11:00 PM, Greg Ewing <greg.ewing@canterbury.ac.nz>wrote:
You're right - I chose my words poorly. As stated, solving it would involve solving the halting problem. Replace the word "properly" with "at all". I.e. - if you don't think about a concurrent access and should have, it'll cause an error. If you think about it and get it wrong - well, nothing will prevent all bugs. Partially automated resource allocation doesn't prevent the programmer from writing bad code, and this is in that category. <mike

On Sun, Oct 30, 2011 at 8:11 PM, Mike Meyer <mwm@mired.org> wrote:
Do you mean that at any time attempting to mutate an unlocked object throws an exception? That would mean that all of my current code is broken. Do you mean that inside the control of 'locking', you can't mutate an unlocked object? That still breaks lots of code that is safe. You can't use itertools.cycle anymore until it's updated in a completely unnecessary way:

    def cycle(iterable):
        # cycle('ABCD') --> A B C D A B C D A B C D ...
        saved = []
        for element in iterable:
            yield element
            saved.append(element)  # throws an exception when called on a locked iterable
        while saved:
            for element in saved:
                yield element

I think the semantics of this need to be tightened up. Furthermore, merely *reading* an object that isn't locked can cause problems. This code is not thread-safe:

    if element in dictionary:
        return dictionary[element]

so you have to decide how much safety you want and what cost you're willing to pay for it. --- Bruce

On Mon, Oct 31, 2011 at 3:58 PM, Bruce Leban <bruce@leapyear.org> wrote:
Yes, that's the idea. There are some exceptions, but you have to explicitly work around that restriction.
That would mean that all of my current code is broken.
Pretty much, yes. It's like adding garbage collection and removing alloc*/free. It's going to break a *lot* of code. That's why I said "not in 3.x, and possibly never in CPython."
According to what I wrote, yes, it does. Since the list being mutated is only visible inside the function, it doesn't actually need to be locked. It might be possible to figure out that this is the case at compile time and thus allow the code to run unmodified, but that's 1) hard, 2) will miss some cases, and 3) seems like a corner case. This proposal would break enough code that not breaking this case doesn't seem to be worth the effort. That's a question that needs to be answered.
I think the semantics of this need to be tightened up.
That's why I brought it up. I'm trying to get more eyes on the issue.
You're right - it's not thread safe. However, it also doesn't suffer from the problem I'm trying to deal with, where you mutate an object in a way that leaves things broken, but won't be detected at that point. If it breaks because someone mutates the object underneath it, it'll throw an exception at that point. I know you can construct cases where that isn't so. Maybe we need two types of locking - one that allows readers, and one that doesn't. I could live with that, as you'd still have to consider the issue where you mutate the object. <mike

On Mon, Oct 31, 2011 at 4:19 PM, Mike Meyer <mwm@mired.org> wrote:
In order to make concurrent code slightly safer, you're going to break all existing programs that don't use concurrency. That seems to me like a new language, not Python. You've been on this list long enough to see the attention that's paid to backward compatibility.
I think the cases where non-thread-safe code won't throw an exception are numerous. For example, the equally trivial:

    if element not in dictionary:
        dictionary[element] = 0

Heck, even this is not safe:

    dictionary[element] += 1

If you're going to tackle thread safety, it should address more of the problem. These bugs are in many ways worse than mutating "an object in a way that leaves things broken, but won't be detected at that point." The above bugs may *never* be detected. I've come across bugs like that that were in code for many years before I found them (and I'm sure that's happened to others on this list as well). The first thing to do is identify the problems you want to solve and make sure that the problems are well understood. Then design some solutions. Starting with a bad solution to a fraction of the problem isn't a good start. --- Bruce
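[Editorial note: Bruce's two examples are both read-modify-write sequences in disguise. A runnable sketch (the counts dict, its lock, and the tally_* names are invented here) shows the interleaving hazard and one conventional fix:]

    import threading

    counts = {}
    counts_lock = threading.Lock()

    def tally_unsafe(element):
        if element not in counts:    # another thread may insert between this test...
            counts[element] = 0
        counts[element] += 1         # ...and this load/add/store, losing updates

    def tally_safe(element):
        with counts_lock:            # the whole check-then-act is one critical section
            counts[element] = counts.get(element, 0) + 1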

I wouldn't say it necessarily has to be a new language. All too many new languages have arisen when the problem was actually with some specific part of an implementation, or could have been fixed with a careful iteration. It might be slightly off-topic, but one of Python's strengths in my eyes has been very careful incremental improvement. Prior to 2.5, Python was of no interest to me due to various concerns that have now been fixed. Indeed, the language is significantly more consistent and usable than it was in the past. The last thing I (and many others) would want to see is yet another language created because an existing excellent language's implementation couldn't keep up with the times, or surmount an obstacle. Innovations like PyPy, multiprocessing, and the currently debated coroutines are what give Python a chance in the future. I might be beating a dead horse, but the last thing Python should do is roll over and die because "it's too hard". Just wanted to point out that language != implementation.

participants (8): Amaury Forgeot d'Arc, Antoine Pitrou, Bruce Leban, Greg Ewing, Jim Jewett, Matt Joiner, Mike Meyer, Nick Coghlan