[Chicago] Kickstarter Fund to get rid of the GIL

Alex Gaynor alex.gaynor at gmail.com
Mon Jul 25 02:57:49 CEST 2011


On Sun, Jul 24, 2011 at 5:38 PM, Tal Liron <tal.liron at threecrickets.com> wrote:

> JVM 7 will have some neat features, but they haven't been stabilized yet,
> and at this point it's mostly experimentation. Fact is, even though JVM 6
> has been out for a few years already, many deployments still stick to JVM 5.
> It does the job, and "upgrades" have their costs, money and otherwise. I
> choose JVM for my project not because of speed, but because of the maturity
> of the platform, which includes administration tools, monitoring, security,
> and several best-in-class 3rd party libraries. It's nice to know that
> performance is very high up there if I really need it (in which case I just
> "drop down" to Java, rather than use a dynamic JVM language).
>
>
> The whole Jython codebase could use some help... it's even messier than
> CPython's, if that's possible. There's a lot of room for optimization, even
> before igniting JVM 7 shortcuts, though it will surely be at the cost of
> regressions and stability. Luckily, there's a decent test suite, which makes
> it easy to experiment. The Jython community would LOVE help, and it
> doesn't have to be just in terms of coding. Their recent big project was to
> move the whole codebase from Subversion to Mercurial. Another big item on
> the todo list is to get up to date with Python 3. (Jython = Python 2.5
> formally, though it has quite a few 2.6 additions.)
>
>
> Jython also has some nice collaboration with JRuby, including people who
> work on both projects. But what would make me happier is if there was
> real code sharing, allowing for a dynamic core that would work well for both
> projects.
>
>
> Anyway. I guess I'm always confused by what people mean by "faster." What
> are you trying to code for, exactly? Where is your bottleneck? What is your
> funding? It's more likely (although not certain) that what you are really
> looking for is "scalability," for which sheer computational performance
> is likely not the real issue. If money is coming, getting more expensive,
> faster machines may do the trick better than any JVM 7 optimization.
>
>
> If you just want a command line tool that starts fast, JVM is *not* where
> you want to go. It has notoriously slow startup, because of exactly those
> mechanisms that make it perform so well once it is running.
>
>
> Another way to look at "faster" is as a way to save money. Weird, huh? But
> consider Facebook's HipHop project. (Sorry that all of my examples are from
> the web arena; it's where I mostly work these days.) The issue was not that
> PHP was "slow," it was that when you have 1,000 machines running at 90% CPU,
> a faster PHP runtime means that you can use 800 machines, instead, for the
> same workload. A few orders of magnitude forward, and savings can be
> enormous.
>
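
To make the arithmetic in that example concrete, here is a minimal sketch.
The function name and the 1.25x speedup figure are my own illustrative
assumptions; the post only gives the 1,000 and 800 machine counts. It also
assumes the workload scales linearly and CPU utilization stays the same.

```python
import math


def machines_needed(current_machines: int, speedup: float) -> int:
    """Machines required for the same workload at the same CPU utilization,
    if each machine gets through `speedup` times as much work per second."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return math.ceil(current_machines / speedup)


# Going from 1,000 machines to 800 corresponds to a runtime that is
# 1.25x faster (an assumed figure chosen to match the quoted numbers).
before = 1000
after = machines_needed(before, 1.25)
saved = before - after
```

The point of the sketch is only that even a modest runtime speedup turns
into a machine count you can multiply by a hosting bill.
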
>
> If you have a project with 1,000 machines running at 90% CPU, please hire
> me! It may be very worthwhile for you to create a more performant Python
> runtime (JVM-based or not), and I'd love to be paid to do that. :) And it
> would also make a lot of irrational Python speed freaks happy.
>
>
> -Tal
>
>
<minor derail>
No offense, but if you want a more performant Python runtime, it's here
today: http://speed.pypy.org/, no need to start from scratch.
</minor derail>

Alex


>
>
> On 07/24/2011 06:18 PM, John Stoner wrote:
>
>> Jython's not bad. I've used it a lot, and it plays well with lots of Java
>> APIs. Pretty slick, actually. I hear Java 1.7 has some new dynamic features
>> at the JVM level. I always imagined Jython would run a lot faster if it took
>> advantage of them. Tal, do you know if there's any work on that? Googling
>> around a bit I'm not seeing much.
>>
>> On Sun, Jul 24, 2011 at 4:32 PM, Joshua Herman <zitterbewegung at gmail.com>
>> wrote:
>>
>>    At least Erlang works for the use cases. I wasn't aware that Jython
>>    was that powerful; I will have to play with it.
>>
>>    On Sun, Jul 24, 2011 at 3:46 PM, Tal Liron
>>    <tal.liron at threecrickets.com> wrote:
>>    > There is an alternative: Jython, which is Python on the JVM, and has
>>    > no GIL. It's real, it works, and has a very open community. If you
>>    > want to do high-concurrency in Python, it's the way to go. (And it
>>    > has other advantages and disadvantages, of course.)
>>    >
>>    >
>>    > I am always a bit frightened by community attempts to create new
>>    > virtual machines for favorite languages in order to solve problem X.
>>    > This shows a huge under-estimation of what it means to create a
>>    > robust, reliable, performant generic platform. Consider how many
>>    > really reliable versions of the C standard library are out there --
>>    > and how many decades they took to mature, even with thousands of
>>    > expert eyes poring over the code and testing it. And this is without
>>    > duck typing (or ANY typing), data integrity, scoping (+call/cc), tail
>>    > recursion, or any of the other huge (and exciting) challenges
>>    > required to run a dynamic language like Python.
>>    >
>>    >
>>    > So, it's almost amusing to see projects like Rubinius or Parrot come
>>    > to be. Really? This is the best use of our time and effort? I'm
>>    > equally impressed by the ballsiness of Erlang to create a new virtual
>>    > machine from scratch.
>>    >
>>    >
>>    > But those are rather unique histories. CPython has its own unique
>>    > history. Not many people realize this, but Python is about 6 years
>>    > older than Java, and the JVM would take another decade before
>>    > reaching prominence. JavaScript engines (running in web browsers
>>    > only) at the time were terrible, and Perl was entirely interpreted
>>    > (no VM). So, in fact, CPython was written when there was no really
>>    > good platform for dynamic languages. It wasn't a matter of hubris
>>    > ("not invented here") to build a VM from scratch; there was simply no
>>    > choice.
>>    >
>>    >
>>    > Right now, though, there are many good choices. People like Rich
>>    > Hickey (Clojure) and Martin Odersky (Scala) have it right in
>>    > targeting the JVM, although both projects are also exploring
>>    > .NET/Mono. If Python were invented today, I imagine it also would
>>    > start with "Jython," instead of trying to reinvent the wheel (well,
>>    > reinvent a whole damn car fleet, really, in terms of the work
>>    > required).
>>    >
>>    >
>>    > One caveat: I think there is room for "meta-VM" projects like PyPy
>>    > and LLVM. These signify real progress in architecture, whereas "yet
>>    > another dynamic VM" does not.
>>    >
>>    >
>>    > -Tal
>>    >
>>    >
>>    > On 07/24/2011 02:56 PM, Jason Rexilius wrote:
>>    >
>>    >> I also have to quote:
>>    >>
>>    >> "rather that, for problems for which shared-memory concurrency is
>>    >> appropriate (read: the valid cases to complain about the GIL),
>>    >> message passing will not be, because of the marshal/unmarshal
>>    >> overhead (plus data size/locality ones)."
>>    >>
>>    >>
>>    >> I have to say this is some of the best discussion in quite a while.
>>    >> Dave's passionate response is great, as are the others. I think the
>>    >> rudeness, or not, is kinda beside the point.
>>    >>
>>    >> There is a valid point to be made about marshal/unmarshal overhead
>>    >> in situations where data-manipulation-concurrency AND _user
>>    >> expectation_ or environmental constraints apply.  I think that's why
>>    >> people have some grounds to be unhappy with the GIL concept (for me
>>    >> it's a concept) in certain circumstances. Tal is dead on in that
>>    >> "scalability" means different things.
>>    >>
>>    >> Oddly, I'm more engaged in this as an abstract comp sci question
>>    >> than a specific Python question.  The problem set applies across
>>    >> languages.
>>    >>
>>    >> The question I would raise is whether, given that an engineer
>>    >> understands the problem he is facing, both tools are in the toolbox.
>>    >> Is there an alternative to the GIL for the use-cases where it is not
>>    >> the ideal solution?
>>    >>
>>    >> BTW, I will stand up for IPC as one of the tools in the toolbox to
>>    >> deal with scale/volume/speed/concurrency problems.
>>    >>
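
The marshal/unmarshal overhead mentioned above can be seen in a minimal
sketch like the one below. The data and variable names are hypothetical,
and pickle stands in for whatever serialization a real IPC layer uses
(multiprocessing queues and pipes do pickle the objects they carry). A
thread gets a reference to the same object; a message receiver gets a
freshly rebuilt copy.

```python
import pickle
import threading

# Hypothetical payload: a 100 x 100 table of integers.
data = {"rows": [list(range(100)) for _ in range(100)]}

# Shared memory: a thread in the same process sees the very same object,
# so nothing is copied or serialized.
seen_by_thread = []
worker = threading.Thread(target=lambda: seen_by_thread.append(data))
worker.start()
worker.join()

# Message passing: the sender marshals the object to bytes and the
# receiver unmarshals those bytes into a brand-new, independent copy.
wire_bytes = pickle.dumps(data)
received = pickle.loads(wire_bytes)

shared_is_same_object = seen_by_thread[0] is data
copy_is_same_object = received is data
```

How much that copy costs depends on the size and shape of the data, which
is why it matters for some workloads and is negligible for others.
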
>>    >>
>>    >> On 7/24/11 1:58 PM, Tal Liron wrote:
>>    >>>
>>    >>> I would say that there's truth in both approaches. "Scalability"
>>    >>> means different things at different levels of scale. A web example:
>>    >>> the architecture of Twitter or Facebook is nothing like the
>>    >>> architecture of even a large Django site. It's not even the same
>>    >>> problem field.
>>    >>>
>>    >>>
>>    >>> A good threading model can be extremely efficient at certain
>>    >>> scales. For data structures that are mostly read, not written,
>>    >>> synchronization is not a performance issue, and you get the best
>>    >>> throughput possible in multicore situations. The truly best
>>    >>> scalability would be achieved by a combined approach: threading on
>>    >>> a single node, message passing between nodes. Programming for that,
>>    >>> though, is a nightmare (unless you had a programming language that
>>    >>> makes both approaches transparent) and so usually at the large
>>    >>> scale the latter approach is chosen. One significant challenge is
>>    >>> to make sure that operations that MIGHT use the same data
>>    >>> structures are actually performed on the same node, so that
>>    >>> threading would be put to use.
>>    >>>
>>    >>>
>>    >>> So, what Dave said applies very well to threading, too: "you still
>>    >>> need to know what you're doing and how to decompose your
>>    >>> application to use it."
>>    >>>
>>    >>>
>>    >>> Doing concurrency right is hard. Doing message passing right is
>>    >>> hard. Functional (persistent data structure) languages are hard,
>>    >>> too. Good thing we're all such awesome geniuses, bursting with
>>    >>> experience and a desire to learn.
>>    >>>
>>    >>>
>>    >>> -Tal
>>    >>>
>>    >>>
>>    >>> On 07/23/2011 01:40 PM, David Beazley wrote:
>>    >>>
>>    >>>>> "high performance just create multi processes that message" very
>>    >>>>> rarely have I heard IPC and high performance in the same sentence.
>>    >>>>>
>>    >>>>> Alex
>>    >>>>>
>>    >>>> Your youth and inexperience are the only reason you would make a
>>    >>>> statement that ignorant. Go hang out with some people doing Python
>>    >>>> and supercomputing for a while and report back---you will find that
>>    >>>> almost every significant application is based on message passing
>>    >>>> (e.g., MPI). This is because message passing has proven itself to
>>    >>>> be about the only sane way of scaling applications up to run across
>>    >>>> thousands to tens of thousands of CPU cores.
>>    >>>>
>>    >>>> I speak from some experience as I was writing such software for
>>    >>>> large Crays, Connection Machines, and other systems when I first
>>    >>>> discovered Python back in 1996. As early as 1995, our group had done
>>    >>>> performance experiments comparing threads vs. message passing on
>>    >>>> some multiprocessor SMP systems and found that threads just didn't
>>    >>>> scale or perform as well as message passing even on machines with
>>    >>>> as few as 4 CPUs. This was all highly optimized C code for numerics
>>    >>>> (i.e., no Python or GIL).
>>    >>>>
>>    >>>> That said, in order to code with message passing, you still need
>>    >>>> to know what you're doing and how to decompose your application to
>>    >>>> use it.
>>    >>>>
>>    >>>> Cheers,
>>    >>>> Dave
>>    >>>>
>>    >>>> _______________________________________________
>>    >>>> Chicago mailing list
>>    >>>> Chicago at python.org
>>    >>>> http://mail.python.org/mailman/listinfo/chicago
>>
>>
>>
>>
>> --
>> blogs:
>> http://johnstoner.wordpress.com/
>> 'In knowledge is power; in  wisdom, humility.'
>>
>>
>>
>
>



-- 
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero