[Python-ideas] fork

Wed Aug 5 16:30:24 CEST 2015

On Aug 4, 2015, at 14:03, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 04.08.2015 21:38, Andrew Barnert wrote:
>> I think anyone who finds the complexity of concurrent.futures too daunting to even attempt to learn it should not be working on any code that uses less explicit concurrency.
> 
> I am sorry because I disagree here with you.
> 
>> I have taught concurrent.futures to rank novices in a brief personal session or a single StackOverflow answer and they responded, "Wow, I didn't realize it could be this simple".
> 
> Nobody says that concurrent.futures is not a vast improvement over previous approaches. But it is still not the end of the line of simplifications.
> 
>> Someone who can't grasp it is almost certain to be someone who introduces races all over your code and can't even understand the problem, much less debug it.
> 
> Nobody wants races, yet everybody still talks about them. Don't allow races in the first place and be done with it.

What does that even mean? How would you not allow races? If you let people throw arbitrary tasks at a thread pool, with no restriction on mutable shared state, you've allowed races.
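
A minimal sketch of that situation, assuming a made-up Counter class and run() helper (neither comes from the example under discussion). Tasks thrown at a ThreadPoolExecutor all mutate one shared object; only the version that takes a lock is guaranteed to count every task:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Counter:
    """Hypothetical piece of mutable state shared by every task."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # Read, add, store: another thread can run between these steps,
        # so an update can be lost.
        self.value += 1

    def safe_increment(self):
        # The lock makes the read-add-store sequence indivisible.
        with self._lock:
            self.value += 1


def run(method_name, tasks=10_000, workers=8):
    """Submit `tasks` calls of the named method to a thread pool."""
    counter = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(tasks):
            pool.submit(getattr(counter, method_name))
    # Leaving the with block waits for all submitted tasks to finish.
    return counter.value


safe_total = run("safe_increment")      # always 10000
unsafe_total = run("unsafe_increment")  # may be lower; depends on scheduling
```

The unsafe version will often produce the right answer anyway, which is exactly why such bugs survive testing.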

>> Not true. The language clearly defines when each step happens. The a.__add__ method is called, then the result is assigned to a, then the statement finishes. (Then, in the next statement, nothing happens--except, because this is happening in the interactive interpreter, and it's an expression statement, after the statement finishes doing nothing, the value of the expression is assigned to _ and its repr is printed out.)
> 
> Where can I find this definition in the docs?
> 
> To me, we are talking about class customization as described on reference/datamodel.html. Seems like an implementation detail, not a language detail.

No, the data model is a feature of the language, not one specific implementation. The fact that you can define classes that work the same way as builtin types like int is a fundamental feature. It's something Guido and others worked very hard on making true back in Python 2.2-2.3. It's one of the things that makes Python or C++ more pleasant to use than Tcl or Java. Any implementation that didn't do the same would not be Python, and would not run a good deal of Python code.
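
As a rough illustration of that ordering, here is a sketch using a hypothetical Num class (the class name and the events list are invented for this example). Because Num defines __add__ but not __iadd__, the statement a += 1 calls a.__add__(1) first and only then rebinds the name a to the result:

```python
events = []


class Num:
    """Hypothetical int-like class; it defines __add__ but not __iadd__."""

    def __init__(self, n):
        self.n = n

    def __add__(self, other):
        events.append("__add__ called")
        return Num(self.n + other)


a = Num(1)
before = a

# Step 1: a.__add__(1) runs and returns a new Num.
# Step 2: the name `a` is rebound to that new object.
a += 1
events.append("statement finished")

# `before` still refers to the old object, which was never modified.
old_value, new_value = before.n, a.n
```

Anything that runs between step 1 and step 2, such as another thread rebinding the same variable, will have its update overwritten.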

> I am not saying CPython doesn't do it like that, but I am saying the Python language could support lazy evaluation without disagreeing with the docs.
> 
>> This ordering relationship may be very important if the variable a is shared by multiple threads, especially if more than one thread may modify it, especially if you're using non-atomic operations like += (where another thread can read, use, and assign the variable between the __add__ call and the assignment). If a references a mutable object with an __iadd__ method, the variable doesn't even need to be shared, only the value, for this to matter. The only way to safely ignore these problems is to never share any variables or any mutable values between threads.
> 
> Shared variables are global variables. And these have gone out of style quite some time ago.

No. Shared values include global variables, nonlocal variables used by two closures from the same scope, attributes of objects passed to both functions, members of collections passed to both functions, etc. The existence of all of these other things is why global variables are not necessary. They have many advantages over globals, allowing you to better control how state is shared, to share it reentrantly, to make it more explicit in the code, etc. But because they have all the same benefits, they also have the exact same race problem when used to share state between threads.
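
A small sketch of two of those cases, with no global state in sight. All the names here (make_counter, Box, producer, consumer) are hypothetical, and the calls are made sequentially just to show that the state really is shared; hand the same callables to two threads and you have the same race as with a global:

```python
def make_counter():
    count = 0  # a local of make_counter, not a global

    def bump():
        nonlocal count
        count += 1  # same read, add, rebind steps as on a global

    def peek():
        return count

    return bump, peek


class Box:
    """Hypothetical object whose attribute is shared by its users."""

    def __init__(self):
        self.items = []


def producer(box):
    box.items.append("job")  # mutates the object it was handed


def consumer(box):
    return len(box.items)


bump, peek = make_counter()
bump()
seen_by_peek = peek()  # both closures share `count`

box = Box()
producer(box)
seen_by_consumer = consumer(box)  # both functions share `box.items`
```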

> Btw. this is races again and I thought we agreed on not having them because nobody really can/wants to debug them.

And how do you propose "not having them"?

It's not impossible to write purely functional code that doesn't use any mutable state, in which case it doesn't matter whether your state is shared. But the fact that your example uses += proves that this isn't your intention. If you take the code from your example and run it in two threads simultaneously, you have a race. The fact that you didn't intend to create a race because you don't understand that doesn't mean the problem isn't there, it just means you have no idea you've just written buggy code and no idea how to test for it or debug it.
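
For contrast, a minimal sketch of the purely functional style, assuming a hypothetical square() task. Each task depends only on its argument and returns a value, and the results are combined in the calling thread, so there is nothing for the worker threads to race on:

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Pure function: reads only its argument, mutates nothing shared.
    return x * x


with ThreadPoolExecutor(max_workers=4) as pool:
    # map() yields results in input order; sum() runs in this thread only.
    total = sum(pool.map(square, range(100)))
```

The moment a task does something like total += x * x on a variable visible to other tasks, that guarantee is gone.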

And that's exactly the problem. What makes concurrent code with shared state hard, more than anything else, is people who don't realize what's hard about it and write code that seems to work but doesn't. Making it easier for such people to write broken code without even realizing they're doing so is not a good thing.