
On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
2016-01-09 13:48 GMT+01:00 Neil Girdhar <mistersheik@gmail.com>:
How is this not just a poorer version of PyPy's optimizations?
This is a very good question :-) There are a lot of optimizers in the wild, mostly JIT compilers. The problem is that most of them are specific to numerical computations, and the remaining ones are generic but not widely used. The most advanced and complete fast implementation of Python is obviously PyPy. I haven't heard of many deployments of PyPy. For example, PyPy is not used to install OpenStack (a very large project with a large number of dependencies). I'm not even sure that PyPy is the favorite Python implementation for running Django, to give another example of a popular Python application.
PyPy is just amazing in terms of performance, but for some unknown reason it hasn't replaced CPython yet. PyPy has some drawbacks: it only supports Python 2.7 and 3.2 (CPython is at version 3.5), it performs badly with the C API, and I heard that performance is not as amazing as expected on some applications. PyPy also has a worse startup time and uses more memory. IMHO the major issue of Python is the backward compatibility on the C API.
In short, almost all users are stuck with CPython, and CPython implements close to 0 optimizations (come on, constant folding and dead code elimination are not what I would call "optimization" ;-)).
My goal is to fill the hole between CPython (0 optimization) and PyPy (the reference for best performance).
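For readers unfamiliar with the constant folding mentioned above, here is a minimal sketch of how to observe it. The function name `seconds_per_day` is made up for illustration, and the check assumes a CPython interpreter that folds constant arithmetic at compile time; other implementations may store constants differently.

```python
# Hypothetical example function: the arithmetic involves only literals,
# so the compiler can compute the product once, at compile time.
def seconds_per_day():
    return 24 * 60 * 60

# The compiled code object keeps its constants in co_consts.
# If the expression was folded, the final product appears there.
code = seconds_per_day.__code__
folded = 86400 in code.co_consts

print("constants:", code.co_consts)
print("folded at compile time:", folded)
```

Running `dis.dis(seconds_per_day)` from the standard `dis` module shows the same thing at the bytecode level: a single constant load instead of two multiplications.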
I wrote a whole website to explain the status of the Python optimizers and why I want to write my own optimizer: https://faster-cpython.readthedocs.org/index.html
I think this is admirable. I also dream of faster Python. However, we have a fundamental disagreement about how to get there. You can spend your whole life adding one or two optimizations a year and Python may only end up twice as fast as it is now, which would still be dog slow. A meaningful speedup requires a JIT. So, I question the value of this kind of change.
If what you want is optimization, it would be much better to devote time to a solution that can potentially yield orders of magnitude worth of speedup like PyPy rather than increasing language complexity for a minor payoff.
I disagree that my proposed changes increase the "language complexity". According to early benchmarks, my changes have a negligible impact on performance. I don't see how adding a read-only __version__ property to dict makes the Python *language* more complex?
It makes it more complex because you're adding a user-facing property. Every little property adds to the cognitive load of a language. It also means that all of the other Python implementations need to follow suit even if their optimizations work differently. What is the point of making __version__ an exposed property? Why can't it be a hidden variable in CPython's underlying implementation of dict? If some code needs to query __version__ to see if it's changed then CPython should be the one trying to discover this pattern and automatically generate the right code. Ultimately, this is just a piece of a JIT, which is the way this is going to end up.
My whole design is based on the idea that my optimizer will be optional. You will be free to not use it ;-)
And sorry, I'm not interested in contributing to PyPy.
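To make the __version__ idea discussed above concrete, here is a rough sketch of a version-based guard in pure Python. Everything in it is an assumption for illustration: `VersionedDict`, its `version` property, and `GuardedLookup` are hypothetical names, not the proposed API, and the built-in dict exposes no such attribute. The idea is only that a cached lookup stays valid while the dict's version is unchanged.

```python
class VersionedDict(dict):
    """Hypothetical dict subclass: a counter is bumped on each mutation.

    Only item assignment and deletion are tracked here; a real
    implementation would have to cover update(), pop(), clear(), etc.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._version = 0

    @property
    def version(self):
        # Read-only from the outside: there is no setter.
        return self._version

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self._version += 1


class GuardedLookup:
    """Hypothetical guard: cache namespace[name] until the version changes."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.cached_version = namespace.version
        self.cached_value = namespace[name]
        self.slow_lookups = 1  # counts real dict lookups, for demonstration

    def get(self):
        # Fast path: one integer comparison instead of a dict lookup.
        if self.namespace.version != self.cached_version:
            # Slow path: the dict was modified, so refresh the cache.
            self.cached_value = self.namespace[self.name]
            self.cached_version = self.namespace.version
            self.slow_lookups += 1
        return self.cached_value


ns = VersionedDict(len=len)
guard = GuardedLookup(ns, "len")
guard.get()       # served from the cache, version unchanged
ns["len"] = max   # mutation bumps the version
guard.get()       # guard notices the change and re-reads the dict
```

Whether such a counter is visible to Python code or kept private to the interpreter is exactly the point under dispute in this exchange; the sketch works the same way in either case.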
That's fine, but I think you are probably wasting your time then :) The "hole between CPython and PyPy" disappears as soon as PyPy catches up to CPython 3.5 with numpy, and then all of this work goes with it.
Victor