[Chicago] Kickstarter Fund to get rid of the GIL

Sun Jul 24 21:56:04 CEST 2011

I also have to quote:

"rather that, for problems for which shared-memory concurrency is 
appropriate (read: the valid cases to complain about the GIL), message 
passing will not be, because of the marshal/unmarshal overhead (plus 
data size/locality ones)."

I have to say this is some of the best discussion in quite a while. 
Dave's passionate response is great, as are the others. I think the 
rudeness, or not, is kinda beside the point.

There is a valid point to be made about marshal/unmarshal overhead in 
situations where data-manipulation-concurrency AND _user expectation_ or 
environmental constraints apply.  I think that's why people have some 
grounds to be unhappy with the GIL concept (for me it's a concept) in 
certain circumstances. Tal is dead on in that "scalability" means 
different things.
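
To put a rough number on that marshal/unmarshal overhead, here is a 
minimal sketch (not from the thread) that times a pickle round trip, 
which is roughly what a message-passing design pays every time a large 
object crosses a process boundary. The helper names and the sample data 
are made up for illustration, and real numbers depend heavily on the 
data's shape and size.

```python
import pickle
import time


def payload_bytes(obj):
    """Size of obj once marshalled with pickle (hypothetical helper)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def roundtrip_seconds(obj, repeats=5):
    """Best-of-N time to marshal and unmarshal obj once (hypothetical helper)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        copy = pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
        best = min(best, time.perf_counter() - start)
    assert copy == obj  # the round trip must preserve the data
    return best


if __name__ == "__main__":
    # Assumed sample payload: a dict of small float lists.
    data = {i: [float(i)] * 10 for i in range(10_000)}
    print("payload bytes:", payload_bytes(data))
    print("round trip s: ", roundtrip_seconds(data))
```

Threads sharing the same dict pay none of this, which is the case the 
quote above is pointing at.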

Oddly, I'm more engaged in this as an abstract comp sci question than a 
specific python question.  The problem set applies across languages.

The question I would raise is whether, given that an engineer understands 
the problem he is facing, both tools are in the toolbox.  Is there an 
alternative to the GIL for the use-cases where it is not the ideal solution?

BTW, I will stand up for IPC as one of the tools in the toolbox to deal 
with scale/volume/speed/concurrency problems.
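
As a minimal sketch of what that tool looks like in Python (an 
illustration, not anything proposed in this thread), the standard 
library's multiprocessing.Pool sends one chunk of work to each worker 
process and collects small results back. The function names and the 
sum-of-squares workload are assumptions chosen for brevity.

```python
from multiprocessing import Pool


def chunked(values, n_chunks):
    """Split values into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Runs inside one worker process; only the small result is sent back."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4):
    """Ship one chunk per worker (pickled by Pool), then combine the replies."""
    with Pool(processes=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunked(values, workers)))


if __name__ == "__main__":
    numbers = list(range(1_000_000))
    print(parallel_sum_of_squares(numbers))
```

Each chunk is pickled on the way out, so this pays off when the work per 
chunk is large relative to the data shipped, which is the decomposition 
question Dave raises below.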

On 7/24/11 1:58 PM, Tal Liron wrote:
> I would say that there's truth in both approaches. "Scalability" means
> different things at different levels of scale. A web example: the
> architecture of Twitter or Facebook is nothing like the architecture of
> even a large Django site. It's not even the same problem field.
>
>
> A good threading model can be extremely efficient at certain scales. For
> data structures that are mostly read, not written, synchronization is
> not a performance issue, and you get the best throughput possible in
> multicore situations. The truly best scalability would be achieved by a
> combined approach: threading on a single node, message passing between
> nodes. Programming for that, though, is a nightmare (unless you had a
> programming language that makes both approaches transparent) and so
> usually at the large scale the latter approach is chosen. One
> significant challenge is to make sure that operations that MIGHT use the
> same data structures are actually performed on the same node, so that
> threading would be put to use.
>
>
> So, what Dave said applies very well to threading, too: "you still need
> to know what you're doing and how to decompose your application to use it."
>
>
> Doing concurrency right is hard. Doing message passing right is hard.
> Functional (persistent data structure) languages are hard, too. Good
> thing we're all such awesome geniuses, bursting with experience and a
> desire to learn.
>
>
> -Tal
>
>
> On 07/23/2011 01:40 PM, David Beazley wrote:
>
>>> "high performance just create multi processes that message" very
>>> rarely have
>>> I heard IPC and high performance in the same sentence.
>>>
>>> Alex
>>>
>> Your youth and inexperience are the only reason you would make a statement
>> that ignorant. Go hang out with some people doing Python and
>> supercomputing for a while and report back---you will find that almost
>> every significant application is based on message passing (e.g., MPI). This
>> is because message passing has proven itself to be about the only sane
>> way of scaling applications up to run across thousands to tens of
>> thousands of CPU cores.
>>
>> I speak from some experience as I was writing such software for large
>> Crays, Connection Machines, and other systems when I first discovered
>> Python back in 1996. As early as 1995, our group had done performance
>> experiments comparing threads vs. message passing on some
>> multiprocessor SMP systems and found that threads just didn't scale or
>> perform as well as message passing even on machines with as few as 4
>> CPUs. This was all highly optimized C code for numerics (i.e., no
>> Python or GIL).
>>
>> That said, in order to code with message passing, you still need to
>> know what you're doing and how to decompose your application to use it.
>>
>> Cheers,
>> Dave
>>
>> _______________________________________________
>> Chicago mailing list
>> Chicago at python.org
>> http://mail.python.org/mailman/listinfo/chicago