Re: [Python-Dev] Python 3 optimizations, continued, continued again...

28 Jan 2012

Hi,

On Tue, Nov 8, 2011 at 10:36, Benjamin Peterson wrote:
> ...
> 2011/11/8 stefan brunthaler:
>> ...
>> How does that sound?
> I think I can hear real patches and benchmarks most clearly.
I spent the better part of my -20% time on implementing the work as
"suggested". Please find the benchmarks attached to this email; I just
ran them on my system (i7-920, Linux 3.0.0-15, GCC 4.6.1). I branched
off the regular 3.3a0 default tip, changeset 73977, shortly after your
email. I do not have an official patch yet, but I will create one
if there is interest. Changes to the existing interpreter are minimal;
the biggest chunk is a new interpreter dispatch loop.

Merging dispatch loops eliminates some of my optimizations, but my
inline caching technique enables inlining some functionality, which
results in visible speedups. The results are normalized to the
non-threaded-code version of the CPython interpreter (named
"vanilla"), so that I can relate them to my preceding results. I
anticipate *no* compatibility issues, and the interpreter requires less
than 100 KiB of extra memory at run-time. Since my interpreter is
using 215 of a maximum of 255 instructions, there is room for adding
additional derivatives, e.g., for popular Python libraries, too.
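
For readers unfamiliar with the idea, here is a toy sketch of what an
instruction "derivative" means. It is not taken from the patch: the
opcode names (BINARY_ADD, INT_ADD, FLOAT_ADD) and the helper
execute_add are hypothetical, and the real work happens in C inside
the dispatch loop. The sketch only shows the principle: a generic
instruction rewrites itself in place to a type-specialized derivative,
and a guard falls back to the generic form when the operand types
change.

```python
# Toy illustration only; every name here is hypothetical, not from the patch.
BINARY_ADD = 0   # generic instruction
INT_ADD = 1      # derivative specialized for int + int
FLOAT_ADD = 2    # derivative specialized for float + float

# Which derivative to install for a given operand type, and the
# operand type each derivative expects (its guard).
SPECIALIZED = {int: INT_ADD, float: FLOAT_ADD}
GUARD = {INT_ADD: int, FLOAT_ADD: float}


def execute_add(code, pc, left, right):
    """Run the add instruction at code[pc], rewriting it in place."""
    op = code[pc]
    if op != BINARY_ADD:
        expected = GUARD[op]
        if type(left) is expected and type(right) is expected:
            # Fast path: call the type's add directly, no generic lookup.
            return expected.__add__(left, right)
        # Guard failed: restore the generic instruction.
        code[pc] = BINARY_ADD
    # Generic path: install a derivative if the operand types allow it.
    if type(left) is type(right) and type(left) in SPECIALIZED:
        code[pc] = SPECIALIZED[type(left)]
    return left + right


code = [BINARY_ADD]
execute_add(code, 0, 1, 2)       # first run installs INT_ADD at code[0]
execute_add(code, 0, 3, 4)       # second run takes the specialized path
execute_add(code, 0, "a", "b")   # guard fails, code[0] reverts to BINARY_ADD
```

Because the specialization lives in the instruction stream itself, each
derivative consumes one opcode number, which is why the number of free
instructions limits how many derivatives can be added.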

Let me know what python-dev thinks of this and have a nice weekend,
--stefan

PS: AFAIR the version without partial stack frame caching also passes
all regression tests modulo the ones that test against specific
bytecodes.