On Thu, Dec 19, 2013 at 2:07 AM, Maciej Fijalkowski <fijall@gmail.com> wrote:
On Thu, Dec 19, 2013 at 3:17 AM, Gregory P. Smith <greg@krypto.org> wrote:
On Tue, Dec 17, 2013 at 8:43 AM, Stefan Krah <stefan@bytereef.org> wrote:
Maciej Fijalkowski <fijall@gmail.com> wrote:
I would like to discuss at the language summit a potential inclusion of cffi[1] into the stdlib. This is a project Armin Rigo has been working on for a while, with some input from other developers.
I've tried cffi (admittedly only in a toy script) and find it very nice to use.
Here's a comparison (pi benchmark) between wrapping libmpdec using a C-extension (_decimal), cffi and ctypes:
+-------------------------------+----------+----------+---------+
|                               | _decimal | ctypes   | cffi    |
+===============================+==========+==========+=========+
| cpython-tip (with-system-ffi) | 0.19s    | 5.40s    | 5.14s   |
+-------------------------------+----------+----------+---------+
| cpython-2.7 (with-system-ffi) | n/a      | 4.46s    | 5.18s   |
+-------------------------------+----------+----------+---------+
| Ubuntu-cpython-2.7            | n/a      | 3.63s    | -       |
+-------------------------------+----------+----------+---------+
| pypy-2.2.1-linux64            | n/a      | 125.9s   | 0.94s   |
+-------------------------------+----------+----------+---------+
| pypy3-2.1-beta1-linux64       | n/a      | 264.9s   | 2.93s   |
+-------------------------------+----------+----------+---------+
I guess the key points are that C-extensions are hard to beat and that cffi performance on pypy-2 is outstanding. Additionally it's worth noting that Ubuntu does something in their Python build that we should do, too.
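To put rough numbers on those key points, here is a small sketch that computes timing ratios from the table above. The `timings` dict and the `ratio` helper are illustrative names only, not part of the benchmark script; the values are simply copied from the table, and the Ubuntu row is left out because it has no cffi figure.

```python
# Timings in seconds, copied from the pi benchmark table above.
# Builds without a measurement for a given wrapper omit that key.
timings = {
    "cpython-tip": {"_decimal": 0.19, "ctypes": 5.40, "cffi": 5.14},
    "cpython-2.7": {"ctypes": 4.46, "cffi": 5.18},
    "pypy-2.2.1-linux64": {"ctypes": 125.9, "cffi": 0.94},
    "pypy3-2.1-beta1-linux64": {"ctypes": 264.9, "cffi": 2.93},
}


def ratio(build, slower, faster):
    """Return how many times longer `slower` takes than `faster` on `build`."""
    row = timings[build]
    return row[slower] / row[faster]


if __name__ == "__main__":
    # How ctypes compares with cffi on each build.
    for build in timings:
        print(f"{build}: ctypes/cffi = {ratio(build, 'ctypes', 'cffi'):.1f}x")

    # How far cffi is from the C-extension on cpython-tip.
    gap = ratio("cpython-tip", "cffi", "_decimal")
    print(f"cpython-tip: cffi/_decimal = {gap:.1f}x")
```

On these figures cffi and ctypes are within about 15% of each other on both CPython builds, while on PyPy the ctypes run takes roughly two orders of magnitude longer than the cffi run.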
Ubuntu compiles their Python with FDO (feedback directed optimization / profile guided optimization) enabled. All distros should do this if they don't already. It's generally a 20% interpreter speedup. Our makefile already supports it, but it isn't the default build because it takes a long time: it needs to compile everything twice and do a profiled benchmark run between compilations.
-gps
Hey Greg.
We found out that this only speeds up the benchmarks that you ran during profiling and not others, so we disabled it for the default pypy build. Can you provide me with a more detailed study of how it speeds up interpreters in general and CPython in particular?
That's a common concern for profile based builds, but it turns out not to matter a whole lot which workloads you choose for the CPython interpreter to collect profiles for an FDO build. I believe Ubuntu's packages just use the test suite. In our own tests at work this produced good results. Interpreter loops and other common code paths in the interpreter have a *lot* of low hanging fruit in terms of more optimal code generation. Link time optimization adds additional benefits IF you can get it working (not always easy or reliable right now, as Matthias mentions in issue17781).
-gps