[Python-ideas] More compact bytecode

Thu Feb 4 13:26:50 EST 2016

On 2016-02-04 17:28:45, "Andrew Barnert via Python-ideas" 
<python-ideas at python.org> wrote:

>On Thursday, February 4, 2016 8:55 AM, Serhiy Storchaka 
><storchaka at gmail.com> wrote:
>
>> On 04.02.16 18:16, Victor Stinner wrote:
>
>>>>   2. Use 16-bit opcodes as in WPython.
>>>
>>>   IMHO this is a much more interesting option.
>>>
>>>   It would be great to remove
>>>   "if (HAS_ARG(opcode)) oparg = NEXTARG();"
>>>   and always get the argument from the 16-bit opcode. This if() adds
>>>   more work to the CPU which has to flush the pipeline on branch
>>>   misprediction. Usually, it doesn't matter. But this if() is really
>>>   part of the hot path code of Python; it's the most important loop,
>>>   the one running the bytecode. So any instruction matters here ;-)
>>
>>  Actually, in the common case this "if" is executed at compile time.
>>  But I expect a benefit from using a simple read of a 16-bit value
>>  instead of 3 reads of 8-bit values.
>
>Only if the 16-bit reads are aligned. Can that be guaranteed somehow?
>
Aren't allocated blocks usually aligned to n-byte boundaries (or 
multiples thereof) anyway, where n is the number of bytes needed to hold 
an address?
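
One rough way to check this from Python is sketched below. It is 
CPython-specific and relies on ctypes to obtain the address of a bytes 
object's buffer; the helper names buffer_address and is_aligned are my 
own, and sample() is just a throwaway function to get a code object 
from. Treat it as an experiment, not a guarantee about any allocator.

```python
import ctypes


def buffer_address(data: bytes) -> int:
    """Return the address of the first byte of data's internal buffer.

    CPython-specific: c_char_p built from a bytes object points at the
    object's own buffer, so the cast exposes that address as an int.
    """
    return ctypes.cast(ctypes.c_char_p(data), ctypes.c_void_p).value


def is_aligned(data: bytes, width: int = 2) -> bool:
    """True if the buffer of data starts on a multiple of width bytes."""
    return buffer_address(data) % width == 0


def sample(x):
    # Throwaway function; only its bytecode is of interest here.
    return x + 1


code_bytes = sample.__code__.co_code
print("2-byte aligned:", is_aligned(code_bytes, 2))
print("8-byte aligned:", is_aligned(code_bytes, 8))
```

This only looks at bytes objects created the ordinary way; it says 
nothing about a buffer that points into the middle of some larger 
block.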

>If you change co_code to some type that's like bytes, but an immutable 
>array of shorts instead of bytes, that would certainly do it. And it 
>would be easier to inspect/hack 16-bit bytecodes from Python if you 
>didn't have to look at them as bytes. But that implies creating the new 
>type, exposing it to Python, and changing what co_code returns.
>
>Or it could just be a simple switch when constructing a code object: if 
>the code argument is aligned, use it as-is; if not, copy its contents 
>into an aligned array, build a new bytes around that, and store that 
>instead. (Would that significantly slow down .pyc reads and other 
>marshaling uses, if half of all code objects have to copy their 
>contents?)
>
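
For what it's worth, the "copy into an aligned array" variant can be 
mimicked at the Python level with the array module, as a sketch only. 
The function name as_code_units is hypothetical, and I am assuming a 
platform where array's 'H' typecode is exactly 2 bytes wide; the real 
change would of course live in C, in the code object constructor.

```python
import array


def as_code_units(code: bytes) -> array.array:
    """Copy raw bytecode into a freshly allocated array of 16-bit units.

    The array owns its own storage, so its items can be read as whole
    unsigned shorts (in native byte order) wherever the original bytes
    happened to live.
    """
    units = array.array("H")
    if units.itemsize != 2:
        # Assumption of this sketch: unsigned short is 16 bits wide.
        raise RuntimeError("array('H') is not 16 bits on this platform")
    if len(code) % 2:
        raise ValueError("bytecode length must be a multiple of 2")
    units.frombytes(code)  # this is the copy being discussed
    return units


raw = bytes([0x64, 0x01, 0x53, 0x00])  # made-up bytes, not real opcodes
units = as_code_units(raw)
print(len(units), "code units")
```

The copy is the cost in question: it happens once per code object, so 
what matters is how it compares with the rest of unmarshaling.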