[Chicago] PyPy

Eric Stein toba at des.truct.org
Tue Jul 26 06:54:38 CEST 2011


Who would you sell chocolate ice cream with chocolate chips to?
Assume someone who doesn't really care for chocolate. Go!

Eric

On 07/24/2011 10:46 PM, Tal Liron wrote:
> For the people recommending PyPy right now, a serious question:
>
>
> Who would you recommend PyPy to? Assume a user or dev who does not
> care about speed benchmarks.
>
>
> On 07/24/2011 10:13 PM, Brian Herman wrote:
>
>> +1 for PYPY
>>
>>
>> Arigatou gozaimasu,
>> (Thank you very much)
>> Brian Herman
>>
>> brianjherman.com <http://brianjherman.com>
>> brianherman at acm.org <mailto:brianherman at acm.org>
>>
>> On Sun, Jul 24, 2011 at 7:57 PM, Alex Gaynor
>> <alex.gaynor at gmail.com> wrote:
>>
>>
>>
>> On Sun, Jul 24, 2011 at 5:38 PM, Tal Liron
>> <tal.liron at threecrickets.com> wrote:
>>
>> JVM 7 will have some neat features, but they haven't been
>> stabilized yet, and at this point it's mostly experimentation.
>> Fact is, even though JVM 6 has been out for a few years
>> already, many deployments still stick to JVM 5. It does the
>> job, and "upgrades" have their costs, money and otherwise. I
>> choose JVM for my project not because of speed, but because of
>> the maturity of the platform, which includes administration
>> tools, monitoring, security, and several best-in-class 3rd
>> party libraries. It's nice to know that performance is very
>> high up there if I really need it (in which case I just "drop
>> down" to Java, rather than use a dynamic JVM language).
>>
>>
>> The whole Jython codebase could use some help... it's even
>> messier than CPython's, if that's possible. There's a lot of
>> room for optimization, even before igniting JVM 7 shortcuts,
>> though it will surely be at the cost of regressions and
>> stability. Luckily, there's a decent test suite, which makes it
>> easy to experiment. The Jython community would LOVE help,
>> and it doesn't have to be just in terms of coding. Their
>> recent big project was to move the whole codebase from
>> Subversion to Mercurial. Another big item on the todo list is
>> to get up to date with Python 3. (Jython = Python 2.5
>> formally, though it has quite a few 2.6 additions.)
>>
>>
>> Jython also has some nice collaboration with JRuby, including
>> people who work on both projects. But what would make me
>> happier is if there were real code sharing, allowing for a
>> dynamic core that would work well for both projects.
>>
>>
>> Anyway. I guess I'm always confused by what people mean by
>> "faster." What are you trying to code for, exactly? Where is
>> your bottleneck? What is your funding? It's more likely
>> (although not certain) that what you are really looking for is
>> "scalability," for which sheer computational performance is
>> likely not the real issue. If money is coming, getting more
>> expensive, faster machines may do the trick better than any
>> JVM 7 optimization.
>>
>>
>> If you just want a command line tool that starts fast, JVM is
>> *not* where you want to go. It has notoriously slow startup,
>> because of exactly those mechanisms that make it perform so well
>> once it is running.
>>
>>
>> Another way to look at "faster" is as a way to save money.
>> Weird, huh? But consider Facebook's HipHop project. (Sorry
>> that all of my examples are from the web arena; it's where I
>> mostly work these days.) The issue was not that PHP was
>> "slow," it was that when you have 1,000 machines running at
>> 90% CPU, a faster PHP runtime means that you can use 800
>> machines, instead, for the same workload. A few orders of
>> magnitude forward, and savings can be enormous.
>>
>>
>> If you have a project with 1,000 machines running at 90% CPU,
>> please hire me! It may be very worthwhile for you to create a
>> more performant Python runtime (JVM-based or not), and I'd
>> love to be paid to do that. :) And it would also make a lot of
>> irrational Python speed freaks happy.
>>
>>
>> -Tal
>>
>>
>> <minor derail>
>> No offense, but if you want a more performant Python runtime, it's
>> here today: http://speed.pypy.org/, no need to start from scratch.
>> </minor derail>
>>
>> Alex
>>
>>
>>
>> On 07/24/2011 06:18 PM, John Stoner wrote:
>>
>> Jython's not bad. I've used it a lot, and it plays well
>> with lots of Java APIs. Pretty slick, actually. I hear
>> Java 1.7 has some new dynamic features at the JVM level. I
>> always imagined Jython would run a lot faster if it took
>> advantage of them. Tal, do you know if there's any work on
>> that? Googling around a bit I'm not seeing much.
>>
>> On Sun, Jul 24, 2011 at 4:32 PM, Joshua Herman
>> <zitterbewegung at gmail.com> wrote:
>>
>> At least Erlang works for the use cases. I wasn't aware that
>> Jython was that powerful; I will have to play with it.
>>
>> On Sun, Jul 24, 2011 at 3:46 PM, Tal Liron
>> <tal.liron at threecrickets.com> wrote:
>> > There is an alternative: Jython, which is Python on the JVM, and
>> > has no GIL. It's real, it works, and has a very open community.
>> > If you want to do high-concurrency in Python, it's the way to go.
>> > (And it has other advantages and disadvantages, of course.)
>> >
>> >
>> > I am always a bit frightened by community attempts to create new
>> > virtual machines for favorite languages in order to solve problem
>> > X. This shows a huge under-estimation of what it means to create a
>> > robust, reliable, performant generic platform. Consider how many
>> > really reliable versions of the C standard library are out there --
>> > and how many decades they took to mature, even with thousands of
>> > expert eyes poring over the code and testing it. And this is
>> > without duck typing (or ANY typing), data integrity, scoping
>> > (+call/cc), tail recursion, or any of the other huge (and
>> > exciting) challenges required to run a dynamic language like
>> > Python.
>> >
>> >
>> > So, it's almost amusing to see projects like Rubinius or Parrot
>> > come to be. Really? This is the best use of our time and effort?
>> > I'm equally impressed by the ballsiness of Erlang to create a new
>> > virtual machine from scratch.
>> >
>> >
>> > But those are rather unique histories. CPython has its own unique
>> > history. Not many people realize this, but Python is about 6 years
>> > older than Java, and the JVM would take another decade before
>> > reaching prominence. JavaScript engines (running in web browsers
>> > only) at the time were terrible, and Perl was entirely interpreted
>> > (no VM). So, in fact, CPython was written when there was no really
>> > good platform for dynamic languages. It wasn't a matter of hubris
>> > ("not invented here") to build a VM from scratch; there was simply
>> > no choice.
>> >
>> >
>> > Right now, though, there are many good choices. People like Rich
>> > Hickey (Clojure) and Martin Odersky (Scala) have it right in
>> > targeting the JVM, although both projects are also exploring
>> > .NET/Mono. If Python were invented today, I imagine it also would
>> > start with "Jython," instead of trying to reinvent the wheel (well,
>> > reinvent a whole damn car fleet, really, in terms of the work
>> > required).
>> >
>> >
>> > One caveat: I think there is room for "meta-VM" projects like PyPy
>> > and LLVM. These signify real progress in architecture, whereas
>> > "yet another dynamic VM" does not.
>> >
>> >
>> > -Tal
>> >
>> >
>> > On 07/24/2011 02:56 PM, Jason Rexilius wrote:
>> >
>> >> I also have to quote:
>> >>
>> >> "rather that, for problems for which shared-memory concurrency is
>> >> appropriate (read: the valid cases to complain about the GIL),
>> >> message passing will not be, because of the marshal/unmarshal
>> >> overhead (plus data size/locality ones)."
>> >>
>> >>
>> >> I have to say this is some of the best discussion in quite a
>> >> while. Dave's passionate response is great, as are the others. I
>> >> think the rudeness, or not, is kinda beside the point.
>> >>
>> >> There is a valid point to be made about marshal/unmarshal overhead
>> >> in situations where data-manipulation-concurrency AND _user
>> >> expectation_ or environmental constraints apply. I think that's
>> >> why people have some grounds to be unhappy with the GIL concept
>> >> (for me it's a concept) in certain circumstances. Tal is dead on
>> >> in that "scalability" means different things.
>> >>
>> >> Oddly, I'm more engaged in this as an abstract comp sci question
>> >> than a specific Python question. The problem set applies across
>> >> languages.
>> >>
>> >> The question I would raise is whether, given that an engineer
>> >> understands the problem he is facing, both tools are in the
>> >> toolbox. Is there an alternative to the GIL for the use cases
>> >> where it is not the ideal solution?
>> >>
>> >> BTW, I will stand up for IPC as one of the tools in the toolbox to
>> >> deal with scale/volume/speed/concurrency problems.
>> >>
>> >>
>> >> On 7/24/11 1:58 PM, Tal Liron wrote:
>> >>>
>> >>> I would say that there's truth in both approaches. "Scalability"
>> >>> means different things at different levels of scale. A web
>> >>> example: the architecture of Twitter or Facebook is nothing like
>> >>> the architecture of even a large Django site. It's not even the
>> >>> same problem field.
>> >>>
>> >>>
>> >>> A good threading model can be extremely efficient at certain
>> >>> scales. For data structures that are mostly read, not written,
>> >>> synchronization is not a performance issue, and you get the best
>> >>> throughput possible in multicore situations. The truly best
>> >>> scalability would be achieved by a combined approach: threading
>> >>> on a single node, message passing between nodes. Programming for
>> >>> that, though, is a nightmare (unless you had a programming
>> >>> language that makes both approaches transparent) and so usually
>> >>> at the large scale the latter approach is chosen. One significant
>> >>> challenge is to make sure that operations that MIGHT use the same
>> >>> data structures are actually performed on the same node, so that
>> >>> threading would be put to use.
>> >>>
>> >>>
>> >>> So, what Dave said applies very well to threading, too: "you
>> >>> still need to know what you're doing and how to decompose your
>> >>> application to use it."
>> >>>
>> >>>
>> >>> Doing concurrency right is hard. Doing message passing right is
>> >>> hard. Functional (persistent data structure) languages are hard,
>> >>> too. Good thing we're all such awesome geniuses, bursting with
>> >>> experience and a desire to learn.
>> >>>
>> >>>
>> >>> -Tal
>> >>>
>> >>>
>> >>> On 07/23/2011 01:40 PM, David Beazley wrote:
>> >>>
>> >>>>> "high performance just create multi processes that message"
>> >>>>> very rarely have I heard IPC and high performance in the same
>> >>>>> sentence.
>> >>>>>
>> >>>>> Alex
>> >>>>>
>> >>>> Your youth and inexperience is the only reason you would make a
>> >>>> statement that ignorant. Go hang out with some people doing
>> >>>> Python and supercomputing for a while and report back---you will
>> >>>> find that almost every significant application is based on
>> >>>> message passing (e.g., MPI). This is because message passing has
>> >>>> proven itself to be about the only sane way of scaling
>> >>>> applications up to run across thousands to tens of thousands of
>> >>>> CPU cores.
>> >>>>
>> >>>> I speak from some experience as I was writing such software for
>> >>>> large Crays, Connection Machines, and other systems when I first
>> >>>> discovered Python back in 1996. As early as 1995, our group had
>> >>>> done performance experiments comparing threads vs. message
>> >>>> passing on some multiprocessor SMP systems and found that
>> >>>> threads just didn't scale or perform as well as message passing
>> >>>> even on machines with as few as 4 CPUs. This was all highly
>> >>>> optimized C code for numerics (i.e., no Python or GIL).
>> >>>>
>> >>>> That said, in order to code with message passing, you still need
>> >>>> to know what you're doing and how to decompose your application
>> >>>> to use it.
>> >>>> Cheers,
>> >>>> Dave
>> >>>>
>> >>>> _______________________________________________
>> >>>> Chicago mailing list
>> >>>> Chicago at python.org
>> >>>> http://mail.python.org/mailman/listinfo/chicago
>>
>> -- blogs:
>> http://johnstoner.wordpress.com/
>> 'In knowledge is power; in wisdom, humility.'
>>
>> -- "I disapprove of what you say, but I will defend to the death your
>> right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
>> "The people's good is the highest law." -- Cicero
>>


