On 3 July 2013 20:06, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,

For my registervm project (fork of CPython using register-based
bytecode, instead of stack-based bytecode), I implemented a
Instruction.use_stack() method which just checks if the stack is
"used": read the stack, exchange values in the stack (like "ROT"
instruction), push or pop a value.

Instruction.use_stack():
http://hg.python.org/sandbox/registervm/file/ff24dfecc27d/Lib/registervm.py#l546

The method uses a dummy heuristic, just because it was quick to
implement it, and it's enough for my use case.

To answer your question: yes, I would like a opcode_stack_effect() function.

registervm has its own disassembler which creates objects, rather than
just generating a text representation of the bytecode. I'm using
objects because I rewrite most instructions and I need more
information and functions:

* disassembler (raw bytes => list of instructions)
* assembler (list of instructions => raw bytes)
* format an instruction (human readable assembler)
* is_terminal(): last instruction of a block
* is_cond_jump(): the instruction is a conditional jump? hesitate to
move this disassembler to CPython directly. I'm not sure that it would
be useful, its API is maybe too specific to my registervm project.
* is_reg_used(), is_reg_replaced(), is_reg_modified(), etc.: checks on registers
* etc.

Would it be useful to have such high-level API in Python?

I finally committed a longstanding patch to add something like that a while ago for 3.4: http://docs.python.org/dev/library/dis#bytecode-analysis

It's still fairly low level, but already far more programmatically useful than the old disassembler text.

I'm still inclined to push higher level stuff out to external libraries - this patch was mostly about making some of our compiler tests a bit more maintainable, as well as giving third party libraries better building blocks without changing the dis module too much.

To get back to Larry's question, though, I think exposing the stack effects through dis.Instruction would be a good idea (since that will have access to the oparg to calculate the variable effects).

As far as how to expose the data to Python goes, I suggest adding an _opcode C module to back opcode.py and eliminate the manual duplication of the opcode values while you're at it.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia