[Python-ideas] More compact bytecode

Sun Feb 7 14:22:03 EST 2016

On 2016-02-07 07:18, Serhiy Storchaka wrote:
> On 06.02.16 21:18, Antoine Pitrou wrote:
>> It sounds like, by 16-bit opcodes, you mean combine the opcode and the
>> argument in a single 16-bit word.  But that doesn't solve the issue you
>> want to solve: you still have to decode the argument encoded in the 16-bit
>> word.  I don't see where the benefit is.
>
> Current code uses 3 read operations:
>
>       opcode = *next_instr++;
>       next_instr += 2;
>       oparg = (next_instr[-1]<<8) + next_instr[-2];
>
> Even combining the latter two operations into one read operation gives a
> 10% gain in the microbenchmark (see http://bugs.python.org/issue25823):
>
>       opcode = *next_instr++;
>       oparg = *(unsigned short *)next_instr;
>       next_instr += 2;
>
The previous code is little-endian (the first argument byte is the low-order
byte), whereas this code's endianness is processor-dependent.
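
To make the difference concrete, here is a minimal sketch of the two decoding
styles. The function names are made up for illustration and do not come from
ceval.c, and memcpy stands in for the pointer cast in the quoted snippet so
that the sketch does not depend on alignment or aliasing behaviour. The
byte-wise version always treats the first argument byte (next_instr[-2] in the
quoted code) as the low 8 bits; the single 16-bit load returns whatever the
host's byte order dictates.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise decode, as in the current code: p[0] is the low-order byte and
   p[1] the high-order byte, so the result is the same on every host. */
static unsigned int decode_bytewise(const unsigned char *p)
{
    return ((unsigned int)p[1] << 8) + p[0];
}

/* Single 16-bit load: the result follows the host's byte order.
   memcpy is an assumption of this sketch, replacing the pointer cast. */
static unsigned int decode_native(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Returns 1 on a little-endian host, 0 otherwise (hypothetical helper). */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

For the argument bytes {0x34, 0x12}, decode_bytewise() gives 0x1234 on any
host, while decode_native() gives 0x1234 only where host_is_little_endian()
returns 1.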

> By combining the opcode and the argument in an always-aligned 16-bit word,
> I expect a larger gain.
>
>       word = *(unsigned short *)next_instr;
>       next_instr += 2;
>       opcode = word & 0xff;
>       oparg = word >> 8;
>
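
The combined-word scheme quoted above can be sketched as follows. This is only
an illustration under stated assumptions: pack_word() and fetch() are
hypothetical names, the instruction pointer is typed as a pointer to uint16_t
rather than a byte pointer advanced by 2, and the argument is limited to 8
bits (larger arguments would need some extension mechanism, which is not
covered here).

```c
#include <stdint.h>

/* Hypothetical helper: put an 8-bit opcode in the low byte of the word and
   an 8-bit argument in the high byte. */
static uint16_t pack_word(unsigned int opcode, unsigned int oparg)
{
    return (uint16_t)((opcode & 0xff) | ((oparg & 0xff) << 8));
}

/* Decode step from the quoted snippet: one aligned 16-bit read, then a mask
   for the opcode and a shift for the argument. Advances *next_instr by one
   word. */
static void fetch(const uint16_t **next_instr,
                  unsigned int *opcode, unsigned int *oparg)
{
    uint16_t word = *(*next_instr)++;
    *opcode = word & 0xff;
    *oparg = word >> 8;
}
```

Because the mask and shift operate on the loaded value rather than on
individual bytes in memory, the decoded opcode and argument are the same on
any host as long as the words were written by the same process with
pack_word().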
>> It is generally estimated that the overhead of bytecode dispatch and
>> decoding is around 10-30% for CPython. You cannot hope to eliminate that
>> overhead entirely without writing a (JIT or AOT) compiler, so any heroic
>> effort to restructure the current opcode space and structure will at best
>> win 5 to 20% on select benchmarks.
>
> This would be an awesome result.
>