[Python-Dev] Improving the bytecode

Sun Jun 5 15:16:57 EDT 2016

On 05.06.16 21:24, Raymond Hettinger wrote:
>> On Jun 4, 2016, at 1:08 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>> 1. http://bugs.python.org/issue27129
>> Make the bytecode more 16-bit oriented.
>
> I don't think this should be done.  Adding the /2 and *2 just complicates the code and messes with my ability to reason about jumps.
>
> With VM opcodes, there is always a tension between being close to implementation (what byte address are we jumping to) and being high level (what is the word offset).  In this case, I think we should stay with the former because they are primarily used in ceval.c and peephole.c which are close to the implementation.  At the higher level, there isn't any real benefit either (because dis.py already does a nice job of translating the jump targets).
>
> Here is one example of the parts of the diff that cause concern that future maintenance will be made more difficult by the change:
>
> -                j = blocks[j + i + 2] - blocks[i] - 2;
> +                j = (blocks[j * 2 + i + 2] - blocks[i] - 2) / 2;
>
> Reviewing the original line only gives me a mild headache while the second one really makes me want to avert my eyes ;-)

The /2 and *2 are added just because Victor wants to keep f_lineno 
counting bytes. Please look at my first patch. It doesn't contain /2 and 
*2. It even contains far fewer +2 and -2. For example, the above change 
looks like this:

-                j = blocks[j + i + 2] - blocks[i] - 2;
+                j = blocks[j + i + 1] - blocks[i] - 1;

Doesn't this give you less of a headache?
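
To make the difference concrete, here is a minimal Python sketch of the two ways of counting a relative jump. It is only an illustration, not code from either patch: the helper names are hypothetical, and it assumes every instruction occupies exactly one 2-byte code unit.

```python
# Illustration only; helper names are hypothetical and not taken from the patches.
# Assumption: every instruction is one 2-byte code unit (opcode byte + argument byte).
CODE_UNIT_SIZE = 2


def target_from_byte_arg(instr_offset, arg):
    """Relative jump whose argument counts bytes.

    The jump is relative to the next instruction, hence + CODE_UNIT_SIZE.
    Both the input offset and the result are byte offsets.
    """
    return instr_offset + CODE_UNIT_SIZE + arg


def target_from_unit_arg(instr_index, arg):
    """Relative jump whose argument counts code units.

    The next instruction is simply index + 1.
    Both the input index and the result are code unit indices.
    """
    return instr_index + 1 + arg


def bytes_to_units(offset):
    """Convert a byte offset to a code unit index (offset must be aligned)."""
    if offset % CODE_UNIT_SIZE:
        raise ValueError("byte offset is not aligned to a code unit")
    return offset // CODE_UNIT_SIZE


# The same jump expressed both ways lands on the same instruction.
byte_target = target_from_byte_arg(10, 6)
unit_target = target_from_unit_arg(bytes_to_units(10), bytes_to_units(6))
```

When everything is counted in code units, the +2 and -2 become +1 and -1 and no scaling is needed. The /2 and *2 only show up when one side of the arithmetic is kept in bytes.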

>> 2. http://bugs.python.org/issue27140
>> Add new opcode BUILD_CONST_KEY_MAP for building a dict with constant keys. This optimizes the common case and is especially helpful for the two following issues (creating and calling functions).
>
> This shows promise.
>
> The proposed name BUILD_CONST_KEY_MAP is much more clear than BUILD_MAP_EX.

If you accept this patch, I'll commit it. At least two other issues are 
waiting on it.
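
For anyone who wants to see which opcodes their interpreter emits for a dict display with constant keys, a small sketch using the standard dis module follows. The function make_point is a made-up example, and the exact opcode names depend on the Python version, so no particular output is claimed here.

```python
import dis


def make_point(x, y):
    # A dict display whose keys are all constants: the case this issue targets.
    return {"x": x, "y": y}


def opnames(func):
    """Return the opcode names of a function, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


# The result varies between Python versions; inspect it on your own build.
names = opnames(make_point)
```

Printing names (or calling dis.dis(make_point)) shows whether the keys are loaded one by one or as a single constant tuple.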

>> 5. http://bugs.python.org/issue27127
>> Rework the for loop implementation.
>
> I'm unclear what problem is being solved by requiring that GET_ITER always be followed immediately by FOR_ITER.

As I understand it, the purpose was to decrease the number of executed 
opcodes. It looks to me that the existing patch is not acceptable, because 
there is a reason for using two opcodes at the start of the for loop. But I 
think that we can use another optimization here. I'll try to write a patch.
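
As a rough illustration of the two roles, here is a hand-written Python equivalent of a for loop, plus a helper to list the opcodes actually emitted. This is a sketch of the general idea, not of any patch: iter() is called once before the loop (roughly what GET_ITER does), while next() runs on every iteration (roughly what FOR_ITER does). The function names are made up for the example.

```python
import dis


def total(items):
    # Ordinary for loop; the compiler emits the iterator setup and the
    # per-iteration step for us.
    result = 0
    for item in items:
        result += item
    return result


def total_by_hand(items):
    # Hand-written equivalent of the loop above.
    result = 0
    iterator = iter(items)          # done once, before the loop starts
    while True:
        try:
            item = next(iterator)   # done on every iteration
        except StopIteration:
            break                   # iterator exhausted: leave the loop
        result += item
    return result


def opnames(func):
    """Return the opcode names of a function, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


loop_ops = opnames(total)
```

Since the loop jumps back to the per-iteration step and not to the setup step, the two steps do different jobs even though they are adjacent at the start of the loop.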