[Chicago] Kickstarter Fund to get rid of the GIL

Alex Gaynor alex.gaynor at gmail.com
Mon Jul 25 01:24:54 CEST 2011


On Sun, Jul 24, 2011 at 4:18 PM, John Stoner <johnstoner2 at gmail.com> wrote:

> Jython's not bad. I've used it a lot, and it plays well with lots of Java
> APIs. Pretty slick, actually. I hear Java 1.7 has some new dynamic features
> at the JVM level. I always imagined Jython would run a lot faster if it took
> advantage of them. Tal, do you know if there's any work on that? Googling
> around a bit I'm not seeing much.
>
>
> On Sun, Jul 24, 2011 at 4:32 PM, Joshua Herman <zitterbewegung at gmail.com> wrote:
>
>> At least Erlang works for the use cases. I wasn't aware that Jython
>> was that powerful; I will have to play with it.
>>
>> On Sun, Jul 24, 2011 at 3:46 PM, Tal Liron <tal.liron at threecrickets.com>
>> wrote:
>> > There is an alternative: Jython, which is Python on the JVM, and has no
>> > GIL. It's real, it works, and has a very open community. If you want to do
>> > high-concurrency in Python, it's the way to go. (And it has other
>> > advantages and disadvantages, of course.)
>> >
>> >
>> > I am always a bit frightened by community attempts to create new virtual
>> > machines for favorite languages in order to solve problem X. This shows a
>> > huge underestimation of what it means to create a robust, reliable,
>> > performant generic platform. Consider how many really reliable versions
>> > of the C standard library are out there -- and how many decades they took
>> > to mature, even with thousands of expert eyes poring over the code and
>> > testing it. And this is without duck typing (or ANY typing), data
>> > integrity, scoping (+call/cc), tail recursion, or any of the other huge
>> > (and exciting) challenges required to run a dynamic language like Python.
>> >
>> >
>> > So, it's almost amusing to see projects like Rubinius or Parrot come to
>> > be. Really? This is the best use of our time and effort? I'm equally
>> > impressed by the ballsiness of Erlang to create a new virtual machine
>> > from scratch.
>> >
>> >
>> > But those are rather unique histories. CPython has its own unique
>> > history. Not many people realize this, but Python is about 6 years older
>> > than Java, and the JVM would take another decade before reaching
>> > prominence. JavaScript engines (running in web browsers only) at the time
>> > were terrible, and Perl was entirely interpreted (no VM). So, in fact,
>> > CPython was written when there was no really good platform for dynamic
>> > languages. It wasn't a matter of hubris ("not invented here") to build a
>> > VM from scratch; there was simply no choice.
>> >
>> >
>> > Right now, though, there are many good choices. People like Rich Hickey
>> > (Clojure) and Martin Odersky (Scala) have it right in targeting the JVM,
>> > although both projects are also exploring .NET/Mono. If Python were
>> > invented today, I imagine it also would start with "Jython," instead of
>> > trying to reinvent the wheel (well, reinvent a whole damn car fleet,
>> > really, in terms of the work required).
>> >
>> >
>> > One caveat: I think there is room for "meta-VM" projects like PyPy and
>> > LLVM. These signify real progress in architecture, whereas "yet another
>> > dynamic VM" does not.
>> >
>> >
>> > -Tal
>> >
>> >
>> > On 07/24/2011 02:56 PM, Jason Rexilius wrote:
>> >
>> >> I also have to quote:
>> >>
>> >> "rather that, for problems for which shared-memory concurrency is
>> >> appropriate (read: the valid cases to complain about the GIL), message
>> >> passing will not be, because of the marshal/unmarshal overhead (plus
>> >> data size/locality ones)."
>> >>
>> >>
>> >> I have to say this is some of the best discussion in quite a while.
>> >> Dave's passionate response is great, as are the others'. I think the
>> >> rudeness, or not, is kinda beside the point.
>> >>
>> >> There is a valid point to be made about marshal/unmarshal overhead in
>> >> situations where data-manipulation-concurrency AND _user expectation_
>> >> or environmental constraints apply.  I think that's why people have some
>> >> grounds to be unhappy with the GIL concept (for me it's a concept) in
>> >> certain circumstances. Tal is dead on in that "scalability" means
>> >> different things.
>> >>
>> >> Oddly, I'm more engaged in this as an abstract comp sci question than a
>> >> specific Python question.  The problem set applies across languages.
>> >>
>> >> The question I would raise is whether, given that an engineer
>> >> understands the problem he is facing, both tools are in the toolbox. Is
>> >> there an alternative to the GIL for the use cases where it is not the
>> >> ideal solution?
>> >>
>> >> BTW, I will stand up for IPC as one of the tools in the toolbox to deal
>> >> with scale/volume/speed/concurrency problems.
>> >>
>> >>
>> >> On 7/24/11 1:58 PM, Tal Liron wrote:
>> >>>
>> >>> I would say that there's truth in both approaches. "Scalability" means
>> >>> different things at different levels of scale. A web example: the
>> >>> architecture of Twitter or Facebook is nothing like the architecture of
>> >>> even a large Django site. It's not even the same problem field.
>> >>>
>> >>>
>> >>> A good threading model can be extremely efficient at certain scales.
>> >>> For data structures that are mostly read, not written, synchronization
>> >>> is not a performance issue, and you get the best throughput possible in
>> >>> multicore situations. The truly best scalability would be achieved by a
>> >>> combined approach: threading on a single node, message passing between
>> >>> nodes. Programming for that, though, is a nightmare (unless you have a
>> >>> programming language that makes both approaches transparent) and so
>> >>> usually at the large scale the latter approach is chosen. One
>> >>> significant challenge is to make sure that operations that MIGHT use
>> >>> the same data structures are actually performed on the same node, so
>> >>> that threading would be put to use.
>> >>>
>> >>>
>> >>> So, what Dave said applies very well to threading, too: "you still need
>> >>> to know what you're doing and how to decompose your application to use
>> >>> it."
>> >>>
>> >>>
>> >>> Doing concurrency right is hard. Doing message passing right is hard.
>> >>> Functional (persistent data structure) languages are hard, too. Good
>> >>> thing we're all such awesome geniuses, bursting with experience and a
>> >>> desire to learn.
>> >>>
>> >>>
>> >>> -Tal
>> >>>
>> >>>
>> >>> On 07/23/2011 01:40 PM, David Beazley wrote:
>> >>>
>> >>>>> "high performance just create multi processes that message" very
>> >>>>> rarely have I heard IPC and high performance in the same sentence.
>> >>>>>
>> >>>>> Alex
>> >>>>>
>> >>>> Your youth and inexperience is the only reason you would make a
>> >>>> statement that ignorant. Go hang out with some people doing Python and
>> >>>> supercomputing for a while and report back---you will find that almost
>> >>>> every significant application is based on message passing (e.g., MPI).
>> >>>> This is because message passing has proven itself to be about the only
>> >>>> sane way of scaling applications up to run across thousands to tens of
>> >>>> thousands of CPU cores.
>> >>>>
>> >>>> I speak from some experience as I was writing such software for large
>> >>>> Crays, Connection Machines, and other systems when I first discovered
>> >>>> Python back in 1996. As early as 1995, our group had done performance
>> >>>> experiments comparing threads vs. message passing on some
>> >>>> multiprocessor SMP systems and found that threads just didn't scale or
>> >>>> perform as well as message passing even on machines with as few as 4
>> >>>> CPUs. This was all highly optimized C code for numerics (i.e., no
>> >>>> Python or GIL).
>> >>>>
>> >>>> That said, in order to code with message passing, you still need to
>> >>>> know what you're doing and how to decompose your application to use it.
>> >>>>
>> >>>> Cheers,
>> >>>> Dave
>> >>>>
>> >>>> _______________________________________________
>> >>>> Chicago mailing list
>> >>>> Chicago at python.org
>> >>>> http://mail.python.org/mailman/listinfo/chicago
> --
> blogs:
> http://johnstoner.wordpress.com/
> 'In knowledge is power; in  wisdom, humility.'
>
There's a student at the University of Colorado working on Jython
performance stuff using the new invokedynamic from Java 7.  No idea what his
progress is, but I'm not holding my breath.  Optimizing Jython, JRuby, and
every other JVM language seems to be playing to what the HotSpot team is
doing this week.  The JVM instruction set was not designed for dynamic
languages (or with them in mind); that's why, even though JRuby is much
faster than MRI to my knowledge, PyPy can still smoke it (citation:
http://attractivechaos.github.com/plb/ , this benchmark has 101 problems, not
the least of which is that it's running an out-of-date PyPy, but it
illustrates my point).

Alex

-- 
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero