[Python-Dev] Wordcode: new regular bytecode using 16-bit units
rymg19 at gmail.com
Wed Apr 13 17:29:05 EDT 2016
What is the value of HAS_ARG going to be now?
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something’s wrong.
On Apr 13, 2016 11:26 AM, "Victor Stinner" <victor.stinner at gmail.com> wrote:
> During the recent discussions about Python performance, changing the
> Python bytecode was proposed. Serhiy suggested reusing the
> MicroPython short bytecode to reduce disk space and
> memory footprint.
> Demur Rumed proposes a different change: a regular bytecode
> using 16-bit units. An instruction always has one 8-bit argument, which
> is zero if the instruction doesn't take an argument:
> According to benchmarks, it looks faster:
> IMHO it's a nice enhancement: it makes the code simpler. The most
> interesting change is made in Python/ceval.c:
> -    if (HAS_ARG(opcode))
> -        oparg = NEXTARG();
> +    oparg = NEXTARG();
> This code is the very hot loop evaluating Python bytecode. I expect
> that removing a conditional branch here can reduce CPU branch
> mispredictions.
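The difference between the two fetch strategies in the diff above can be sketched in Python. This is only an illustration, not the ceval.c code: the function names are hypothetical, HAVE_ARGUMENT = 90 and the opcode numbers 124 and 83 are taken from CPython 3.5 as assumptions, and EXTENDED_ARG handling is left out to keep the sketch short.

```python
# Assumption: threshold used by HAS_ARG() in CPython 3.5 (opcodes >= 90 take an argument).
HAVE_ARGUMENT = 90


def decode_old(code):
    """Decode the old variable-length format: 1 byte, or 3 bytes with an argument."""
    result = []
    i = 0
    while i < len(code):
        opcode = code[i]
        i += 1
        oparg = None
        if opcode >= HAVE_ARGUMENT:  # the conditional branch the patch removes
            oparg = code[i] | (code[i + 1] << 8)  # 16-bit little-endian argument
            i += 2
        result.append((opcode, oparg))
    return result


def decode_wordcode(code):
    """Decode fixed 16-bit units: the argument byte is always read."""
    result = []
    for i in range(0, len(code), 2):
        opcode = code[i]
        oparg = code[i + 1]  # zero when the instruction takes no argument
        result.append((opcode, oparg))
    return result


# Illustrative opcode numbers: 124 (LOAD_FAST) and 83 (RETURN_VALUE) in CPython 3.5.
old_bytes = bytes([124, 1, 0, 83])
new_bytes = bytes([124, 1, 83, 0])
print(decode_old(old_bytes))
print(decode_wordcode(new_bytes))
```

In the wordcode version the loop body has no data-dependent branch, which is the property the quoted paragraph is about.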
> I reviewed the first versions of the change, and IMHO it's almost ready to
> be merged. But I would prefer to have a review from at least a second
> core reviewer.
> Can someone please review the change?
> A side effect of wordcode is that arguments in 0..255 now use 2
> bytes per instruction instead of 3, so it also reduces the size of
> bytecode for the most common case.
> Larger arguments cost more: a 16-bit argument (0..65,535) now uses 4
> bytes instead of 3. Arguments are supported up to 32 bits: a 24-bit
> argument uses 3 units (6 bytes), a 32-bit argument uses 4 units (8 bytes).
> MAKE_FUNCTION uses a 16-bit argument for keyword defaults and a 24-bit
> argument for annotations. Other common instructions known to use large
> arguments are jumps in bytecode longer than 256 bytes.
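The size rule described above (one 16-bit unit per 8 bits of argument, up to 32 bits) can be written as a small calculation. This is a sketch with hypothetical helper names, based only on the sizes quoted in the message, not code from the patch.

```python
def wordcode_units(arg):
    """Number of 16-bit units needed to encode an instruction with this argument."""
    if not 0 <= arg <= 0xFFFFFFFF:
        raise ValueError("wordcode arguments are limited to 32 bits")
    units = 1  # the instruction itself carries the low 8 bits
    while arg > 0xFF:
        arg >>= 8  # each extra 8 bits of argument costs one more unit
        units += 1
    return units


def wordcode_size(arg):
    """Size in bytes of the encoded instruction (2 bytes per unit)."""
    return 2 * wordcode_units(arg)


for value in (0, 255, 256, 65535, 65536, 0xFFFFFF, 0x1000000, 0xFFFFFFFF):
    print(hex(value), wordcode_size(value), "bytes")
```

A jump target of 300, for example, no longer fits in 8 bits and takes 4 bytes, which matches the remark about jumps in bytecode longer than 256 bytes.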
> Right now, ceval.c still fetches the opcode and then the oparg with two
> 8-bit instructions. Later, we can discuss whether it would be possible to
> ensure that the bytecode is always aligned to 16 bits in memory, so that
> the two bytes can be fetched using a uint16_t* pointer.
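The idea of reading opcode and oparg as a single 16-bit unit can be mimicked in Python with a memoryview cast to native unsigned shorts. This is a rough sketch with a hypothetical function name. Python hides the alignment question entirely, so it only shows how one unit splits into the two bytes and why byte order matters.

```python
import sys


def fetch_units(code):
    """Read bytecode as native-endian 16-bit units and split each into (opcode, oparg)."""
    if len(code) % 2:
        raise ValueError("wordcode length must be a multiple of 2 bytes")
    units = memoryview(code).cast("H")  # roughly what a uint16_t* view would give in C
    little_endian = sys.byteorder == "little"
    result = []
    for unit in units:
        if little_endian:
            opcode, oparg = unit & 0xFF, unit >> 8  # first byte is the low byte
        else:
            opcode, oparg = unit >> 8, unit & 0xFF  # first byte is the high byte
        result.append((opcode, oparg))
    return result


# Same illustrative opcode numbers as CPython 3.5: 124 (LOAD_FAST), 83 (RETURN_VALUE).
print(fetch_units(bytes([124, 1, 83, 0])))
```

In C the same split would depend on the platform's endianness in the same way, in addition to the alignment concerns raised below.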
> Maybe we can overallocate 1 byte in codeobject.c and manually align
> the memory block if needed. Or maybe ceval.c should copy the code if
> it's not aligned?
> Raymond Hettinger proposes something like that, but it looks like
> there are concerns about non-aligned memory accesses:
> The cost of non-aligned memory accesses depends on the CPU
> architecture, but it can raise a SIGBUS on some arch (MIPS and