[Python-ideas] Type Hinting - Performance booster ?

Mon Dec 22 00:38:17 CET 2014

On 22 December 2014 at 05:23, Ludovic Gasc <gmludo at gmail.com> wrote:

> Hi Nick,
>
> Thanks for your answer. I understand the primary goal, and I'm not
> completely naive on this question: a long time ago, I used type hinting a
> lot with PHP 5.
>
> Nevertheless, in the Python community, you can find a lot of libraries
> that improve performance based on type handling, with different
> optimization strategies (numba, pypy, pythran, cython, llvm-py, shedskin...).
> To my knowledge, no other dynamic language has as many libraries for
> doing that.
> It means that this is a real concern in the Python community.
>
> It's like the async pattern: in Python you have plenty of libraries
> (Twisted, eventlet, gevent, stackless, tornado...) and now, with AsyncIO,
> the community should converge on it.
>
> And yes, I understand that it's almost impossible to create a silver
> bullet that automagically improves performance, but, through my simple dev
> eyes, the common pattern I see with pythran, cython... is the type handling.
> It isn't the only strategy they use to improve performance, but it's the
> most visible part of the example code I've seen.
>
> Guido: "optimizers have been quite successful without type hints"  <=
> Certainly, but instead of losing time trying to guess the right data
> structure, maybe it could be faster if the developer directly states what
> he wants.
>
> To be honest, I'm a little bit tired of hearing biased claims like "Python
> is slow", "not good for performance", "you must use C/Java/Erlang/Go..."
> For me, Python strikes the right compromise between quickly writing
> readable source code and having options to speed that code up.
>
> The more primitives we have in CPython to build performant applications,
> the easier it will be to convince people to use Python.
>
Perhaps the most effective thing anyone could do to make significant
progress in the area of CPython performance is to actually get CodeSpeed
working properly with anything other than PyPy, as automated creation of
clear metrics like that can be incredibly powerful as a motivational tool
(think about the competition on JavaScript benchmarks between different
browser vendors, or the way the PyPy team use speed.pypy.org as a measure
of their success in making new versions faster). Work on speed.python.org
(as a cross-implementation counterpart to speed.pypy.org) was started years
ago, but no leader ever emerged to drive the effort to completion (and even
a funded development effort by the PSF failed to produce a running instance
of the service).

Another possible approach would be to create a JavaScript front end for
PyPy (along the lines of the PyPy-based Topaz interpreter for Ruby, or the
HippyVM interpreter for PHP), and make a serious attempt at displacing V8
at the heart of Node.js. (The Node.js build system for binary extensions is
already written in Python, so why not the core interpreter as well? There's
also the fact that Node.js regularly ends up running on no longer supported
versions of V8, as V8 is written to meet the needs of Chrome, not those of
the server-side JavaScript community).

One key advantage of the latter approach is the strength of the evidence it
would provide. The more general purpose PyPy infrastructure being
competitive with the heavily optimised JavaScript interpreters created by
the browser vendors, on a set of industry standard performance benchmarks,
is much, much stronger evidence of PyPy's raw speed than being faster than
the not-known-for-its-speed CPython interpreter on a set of benchmarks
originally chosen specifically by Google for the Unladen Swallow project.
Even with Topaz being one of the fastest Ruby
interpreters, to the point of Oracle Labs using it as a relative benchmark
for comparison of JRuby's performance in
http://www.slideshare.net/ThomasWuerthinger/graal-truffle-ethdec2013,
that's still relatively weak evidence for raw speed, since Ruby in general
is also not well known for being fast. (Likewise, HippyVM being faster than
Facebook's HHVM is impressive, but vulnerable to the same counter-argument
that people make for Python and Ruby, "If you care about raw speed, why are
you still using PHP?")

Objective benchmarks and real world success stories are the kinds of things
that people find genuinely persuasive - otherwise we're just yet another
programming language community making self-promoting claims on the internet
without adequate supporting evidence.
(http://economics.sas.upenn.edu/~jesusfv/comparison_languages.pdf is an
example of providing good supporting evidence: it compares Numba and
Cython, amongst many other alternatives, to the speed of raw C++ and
FORTRAN for evaluation of a particular numeric model. Given the benefits
they offer in maintainability relative to the lower level languages, they
fare extremely well on the speed front.)

As things stand, we have lots of folks wanting *someone else* to do the
work to counter the inaccurate logic of "CPython-the-implementation tends
to prioritise portability and maintainability over raw speed, therefore
Python-the-language is inherently slow", yet very few volunteering to
actually do the work needed to counter it effectively in a global sense
(rather than within the specific niches currently targeted by the PyPy,
Numba, and Cython development teams - those teams do some extraordinarily
fine work that doesn't get the credit it deserves due to a mindset amongst
many users that only CPython performance counts in cross-language
comparisons).

Regards,
Nick.

P.S. As noted earlier, a profiling and optimising HOWTO in the standard
documentation set would also make a lot of sense as a way of making these
alternatives more discoverable, but again, it needs a volunteer to write it
(or at least an initial draft which could then be polished in review on
Rietveld).

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia