[Python-ideas] More compact bytecode

Sun Feb 7 02:18:18 EST 2016

On 06.02.16 21:18, Antoine Pitrou wrote:
> It sounds like, by 16-bit opcodes, you mean combine the opcode and the argument
> in a single 16-bit word.  But that doesn't solve the issue you want to solve:
> you still have to decode the argument encoded in the 16-bit word.  I don't see
> where the benefit is.

The current code uses 3 read operations:

     opcode = *next_instr++;
     next_instr += 2;
     oparg = (next_instr[-1]<<8) + next_instr[-2];

Even combining the latter two reads into a single read operation gives a
10% gain in the microbenchmark (see http://bugs.python.org/issue25823):

     opcode = *next_instr++;
     oparg = *(unsigned short *)next_instr;
     next_instr += 2;
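
A note on the cast above: it gives the same value as the byte-by-byte
version only on a little-endian host that tolerates unaligned 16-bit reads.
Below is a minimal sketch of a more portable single read using memcpy, which
compilers commonly turn into one load. The helper names (read_arg_bytes,
read_arg_memcpy, host_is_little_endian) are hypothetical and are not taken
from the CPython sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the argument as two separate byte reads,
   low byte first (little-endian order), as in the current code. */
static int
read_arg_bytes(const unsigned char *p)
{
    return (p[1] << 8) + p[0];
}

/* Hypothetical helper: the argument as one 16-bit read in host byte
   order. memcpy avoids the alignment and aliasing problems of
   *(unsigned short *)p. */
static int
read_arg_memcpy(const unsigned char *p)
{
    uint16_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int
host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

On a big-endian host the two helpers disagree, so the single read would
need a byte swap there.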

By combining the opcode and the argument in an always-aligned 16-bit word,
I expect a larger gain.

     word = *(unsigned short *)next_instr;
     next_instr += 2;
     opcode = word & 0xff;
     oparg = word >> 8;
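
To make the comparison concrete, here is a minimal sketch of both layouts as
standalone functions. It assumes the code is stored as an array of uint16_t
units, so every read is aligned and independent of host byte order. The names
decode_bytes, decode_word and pack_word are hypothetical. The argument is
limited to 8 bits here; larger arguments would need some extension mechanism,
which this sketch omits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode one instruction in the current layout,
   one opcode byte followed by a 2-byte little-endian argument.
   Returns the pointer advanced past the instruction. */
static const uint8_t *
decode_bytes(const uint8_t *next_instr, int *opcode, int *oparg)
{
    *opcode = *next_instr++;
    next_instr += 2;
    *oparg = (next_instr[-1] << 8) + next_instr[-2];
    return next_instr;
}

/* Hypothetical helper: decode one instruction in the proposed layout,
   a single aligned 16-bit word with the opcode in the low 8 bits and
   the argument in the high 8 bits. */
static const uint16_t *
decode_word(const uint16_t *next_instr, int *opcode, int *oparg)
{
    uint16_t word = *next_instr++;   /* the only memory read */
    *opcode = word & 0xff;
    *oparg = word >> 8;
    return next_instr;
}

/* Hypothetical helper: pack an opcode and an 8-bit argument into one word. */
static uint16_t
pack_word(int opcode, int oparg)
{
    return (uint16_t)((opcode & 0xff) | ((oparg & 0xff) << 8));
}
```

The word-based decoder does one read followed by a mask and a shift, both
of which operate on a register.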

> It is generally estimated the overhead of bytecode dispatch and decoding is
> around 10-30% for CPython. You cannot hope to eliminate that overhead entirely
> without writing a (JIT or AOT) compiler, so any heroic effort to restructure
> the current opcode space and structure will at best win 5 to 20% on select
> benchmarks.

This would be an awesome result.