Hi, 2014-06-13 6:07 GMT+02:00 Neil Girdhar <mistersheik@gmail.com>:
I was wondering what work is being done on Python to make it faster.
PyPy is 100% compatible with CPython and it is much faster. Numba is also fast, maybe faster than PyPy in some cases (I read that it can use a GPU), but it's more specialized to numerical computation.
I understand that CPython is incrementally improved. I'm not sure, but I think that PyPy's acceleration works by compiling a restricted subset of Python. And I think I heard something about Guido working on a different model for accelerating Python. I apologize in advance that I didn't look into these projects in much detail. My number one dream about computer languages is to be able to write in a language as easy as Python and have it run as quickly as if it were written in C++.
I started to take notes about how CPython can be made faster: http://haypo-notes.readthedocs.org/faster_cpython.html See for example my section "Why Python is slow?": http://haypo-notes.readthedocs.org/faster_cpython.html#why-python-is-slow In short: because Python is a dynamic language (the code can be modified at runtime, a single variable can store different types, almost everything can be modified at runtime), the compiler cannot make many assumptions about the Python code, and so it's very hard to emit fast code (bytecode).
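To make the point concrete, here is a small illustration of my own (not from the notes) of why the compiler cannot assume much: even "+" and the builtin "len" can mean different things at runtime.

```python
# Hypothetical illustration: a trivial function cannot be compiled to
# type-specialized code, because its behaviour depends on runtime types.

def double(x):
    return x + x          # "+" may mean int addition, str concatenation, ...

print(double(2))          # 4
print(double("ab"))       # abab

# Even builtins can be rebound, so a call to len() cannot be inlined:
import builtins
original_len = builtins.len
builtins.len = lambda obj: 42
assert len("abc") == 42   # the module-level lookup finds the new "len"
builtins.len = original_len
```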
I do believe that this is possible (since in theory someone could look at my Python code and port it to C++).
There are projects to compile Python to C++. See for example pythran: http://pythonhosted.org/pythran/ But these projects only support a subset of Python. The C++ language is less dynamic than Python.
What I'm suggesting instead is that for every iteration of a "code block", the runtime stochastically decides whether to collect statistics about that iteration. Those statistics include the time spent running the block, the time spent performing attribute accesses (including type and method lookups), and so on. Basically, the runtime is trying to guess the potential savings of optimizing this block.
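A toy sketch of how I imagine this sampling working (my own illustration, hypothetical names): measure only a small random fraction of executions, so the average profiling overhead stays low.

```python
# Stochastic profiling sketch: sample each execution of a block with a
# small probability instead of instrumenting every run.
import random
import time

SAMPLE_PROBABILITY = 0.01     # illustrative value

stats = {"samples": 0, "total_time": 0.0}

def run_block(block, *args):
    if random.random() < SAMPLE_PROBABILITY:
        # Sampled run: pay the measurement cost and record statistics.
        start = time.perf_counter()
        result = block(*args)
        stats["samples"] += 1
        stats["total_time"] += time.perf_counter() - start
        return result
    # Unsampled run: no overhead beyond one random() call.
    return block(*args)

for _ in range(10_000):
    run_block(sum, range(100))

# Only about 1% of the 10,000 runs were actually measured.
print(stats["samples"])
```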
You should really take a look at PyPy. It implements a *very efficient* tracing JIT. The problem is to not make the program slower when you trace it. PyPy makes some compromises to avoid this overhead: for example, it only optimizes loops with more than N iterations (1000?).
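The "only optimize hot loops" compromise can be sketched like this (my own toy model, not PyPy's actual implementation; the real threshold and mechanism differ):

```python
# Hot-loop detection sketch: count executions and only start paying the
# tracing cost once a threshold is crossed.

HOT_THRESHOLD = 1000          # illustrative; PyPy's real value differs

class LoopProfile:
    def __init__(self):
        self.count = 0
        self.traced = False

    def enter(self):
        """Called on each loop iteration."""
        self.count += 1
        if not self.traced and self.count >= HOT_THRESHOLD:
            # A real tracing JIT would start recording a trace here.
            self.traced = True

profile = LoopProfile()
for _ in range(1500):
    profile.enter()

print(profile.traced)         # True: the loop became "hot"
```

Cold loops never cross the threshold, so they never pay for tracing at all.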
If the block is run many times and the potential savings are large, then stochastically again, the block is promoted to a second-level statistics collection. This level collects statistics about all of the external couplings of the block, like the types and values of the passed-in and returned values.
Sorry, this is not the real technical problem :-) No, the real problem is to detect environment changes, remove the specialized code (optimized for the old environment), and maybe re-optimize the code later. Environment: modules, classes (types), functions, "constants", etc. If anything is modified, the code must be regenerated. A specialized code is a compiled version of your Python code which is based on different assumptions to run faster. For example, if your function calls the builtin function "len", you can make the assumption that the len function returns an int. But if the builtin "len" function is replaced by something else, you must call the new len function. With a JIT, you can detect changes of the environment and regenerate optimized functions during the execution of the application. You can for example add a "timestamp" (counter incremented at each change) in dictionaries and check if the timestamp changed. My notes about that: http://haypo-notes.readthedocs.org/faster_cpython.html#learn-types
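The "timestamp" guard idea can be sketched in pure Python (a simplified model of my own, not CPython internals): specialized code stays valid only while the namespace it depends on is unchanged.

```python
# Versioned-dict guard sketch: a counter is bumped on every mutation, so
# specialized code can cheaply check "did my environment change?".

class VersionedDict(dict):
    def __init__(self, *args, **kwargs):
        self.version = 0
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.version += 1            # any change invalidates old guards
        super().__setitem__(key, value)

namespace = VersionedDict(len=len)
guard_version = namespace.version    # recorded when the code was specialized

def call_len_fast(obj):
    if namespace.version == guard_version:
        # Specialized path: the environment is unchanged, so the
        # assumption "len is the builtin len" still holds.
        return len(obj)
    # Deoptimized path: look "len" up again in the namespace.
    return namespace["len"](obj)

assert call_len_fast("abc") == 3     # fast path
namespace["len"] = lambda obj: 42    # environment change bumps the version
assert call_len_fast("abc") == 42    # guard fails, the new len is called
```

Checking one integer per call is much cheaper than re-doing the full name lookup every time.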
The saving is that the code block * can be transformed into a faster bytecode, which includes straight assembly instructions in some sections since types or values can now be assumed,
My plan is to add the infrastructure to support specialized code in CPython:
- support multiple codes in a single function
- each code has an environment to decide if it can be used or not
- notify (or at least detect) changes of the environment (notify when the Python code is changed: modules, classes, functions)
It should work well for functions, but I don't yet see how to implement these things for instances of classes, because you can also override methods on an instance.
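A rough sketch of "multiple codes in a single function" (hypothetical API of my own, not the actual design): each specialized code carries a guard describing the environment assumptions under which it may be used, with the generic code as fallback.

```python
# Multi-code function sketch: try each specialized code whose guard still
# holds; fall back to the always-correct generic code otherwise.

class SpecializedCode:
    def __init__(self, func, guard):
        self.func = func      # the specialized implementation
        self.guard = guard    # returns True while its assumptions hold

class MultiCodeFunction:
    def __init__(self, generic):
        self.generic = generic        # always-correct fallback code
        self.specialized = []

    def add_specialized(self, func, guard):
        self.specialized.append(SpecializedCode(func, guard))

    def __call__(self, *args):
        for code in self.specialized:
            if code.guard():
                return code.func(*args)
        return self.generic(*args)

# Generic code handles anything; the specialized code assumes ints only.
f = MultiCodeFunction(lambda x: x + x)
int_only = True                       # stand-in for a real environment check
f.add_specialized(lambda x: x << 1, lambda: int_only)

assert f(21) == 42                    # fast path taken: shift instead of add
int_only = False                      # "the environment changed"
assert f("ab") == "abab"              # guard fails, generic code runs
```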
* can use data structures that make type or access assumptions (for example, a list that always contains ints can use a flattened representation; a large set that repeatedly has membership checked with many negative results might benefit from an auxiliary Bloom filter, etc.)
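The "flattened representation" point can be illustrated with the stdlib (my addition): an array of machine ints stores the values directly, while a list stores pointers to boxed int objects.

```python
# Flattened int storage: array("q") holds contiguous 64-bit machine
# integers; a list holds pointers to separately allocated int objects.
import sys
from array import array

boxed = list(range(1000))           # list of references to int objects
flat = array("q", range(1000))      # contiguous signed 64-bit integers

list_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(n) for n in boxed)
flat_bytes = sys.getsizeof(flat)

# Boxing costs memory (and cache misses when iterating).
print(list_bytes > flat_bytes)      # True
```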
Again, please see PyPy: it has very efficient data structures. I don't think that such changes can be made in CPython: its code is too old, and too many users rely on the current implementation, i.e. on the "C API". There is for example a PyList_GET_ITEM() macro to access an item of a list directly. This macro is not part of the stable API, but I guess that most C modules use such macros (which depend on C structures).