[Python-ideas] Incorporating something like byteplay into the stdlib

Fri Feb 12 17:27:56 EST 2016

On Feb 12, 2016, at 13:45, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> 
>> On 2016-02-12 4:13 PM, Andrew Barnert wrote:
>> On Feb 12, 2016, at 12:36, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>>>> On 2016-02-11 10:58 PM, Andrew Barnert via Python-ideas wrote:
>>>> tl;dr: We should turn dis.Bytecode into a builtin mutable structure similar to byteplay.Code, to make PEP 511 bytecode transformers implementable.
>>> Big -1 on the idea, sorry.
>>> 
>>> CPython's bytecode is an implementation detail of CPython.  PyPy has some opcodes that CPython doesn't have, for example.  Who knows, maybe in CPython 4.0 we won't have code objects at all :)
>>> 
>>> Adding something to the standard library means that it will be supported for years to come.  It means that the code is safe to use.  Which, in turn, guarantees that there will be plenty of code that depends on this new functionality.  At first some of that code will be bytecode optimizers, later someone implements a LINQ-like extension, and in no time we lose our freedom to work with opcodes.
>>> 
>>> If this "new functionality" is something that depends on CPython's internals, it will only fracture the ecosystem.  PyPy, or Pyston, or IronPython developers will either have to support byteplay-like stuff (which might be impossible), or explain to their users why some libraries don't work on their platform.
>> This sounds like potentially an argument against adding bytecode processors in PEP 511.[^1] But if PEP 511 *does* add bytecode processors, I don't see how my proposal makes things any worse.
> 
> The main (and only?) motivation behind PEP 511 is the optimization of CPython.  Maybe the new APIs will only be exposed at C level.

Have you read PEP 511? It exposes an API for adding bytecode processors, explicitly explains that the reason for this API is to allow people to write new bytecode optimizers in Python, and includes a toy example of a bytecode transformer. I'm not imagining some far-fetched idea that someone might suggest in the future; I'm responding to what's actually written in the PEP.

>> Having dis (and inspect, and types.CodeType, and so on) be part of the stdlib makes it easier, not harder, to change CPython without breaking code that may need to introspect it for some reason.
> 
> You don't need mutability for introspection.

Of course. When you split an analogy in half and only reply to the first half of it like this, the half-analogy has no content. So what?

>> In the same way, having a mutable dis would make it easier, not harder, to change CPython without breaking bytecode processors.
> 
> PEP 492 added a bunch of new opcodes.  Serhiy is exploring the possibility of adding a few more LOAD_CONST_N opcodes.  How would a mutable byteplay-code-like object in the dis module help that?

If someone had written a bytecode processor on top of the dis module, and wanted to update it to take advantage of LOAD_CONST_N, it would be easy to do so--even on a local copy of CPython patched with Serhiy's changes. If they'd instead written it on top of a third-party module, they'd have to wait for that module to be updated (probably after the next major version of Python comes out), or update it locally. Which one of those sounds easiest to you?

> Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough.

That's an argument against the PEP 511 API for adding bytecode processors--and, again, also possibly an argument against mutable function.__code__ and so on. But how is it an argument against my proposal?

>>   [^1]: Then again, it's just as good an argument against import hooks, and exposing the __code__ member on function objects so decorators can change it, and so on, and years with those features haven't created a catastrophe...
> 
> Import hooks (and even AST/parse/compile) are a much higher-level API.  I'm not sure we can compare them to byteplay.

You're responding selectively here. Your argument is that people shouldn't mess with bytecode. If we don't want people to mess with bytecode, we shouldn't expose bytecode to be messed with. But you can write a decorator that sets f.__code__ = types.CodeType(...) with a replaced bytecode string, and all of the details on how to do that are fully documented in the dis and inspect modules. Making it tedious and error-prone is not a good way to discourage something.
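
To make that concrete, here is a minimal sketch of such a decorator. The names replace_const and greet are hypothetical, made up for illustration. The sketch swaps an entry in co_consts rather than editing the raw bytecode string, because the byte layout differs between CPython versions, and it assumes Python 3.8 or later for code.replace() (on older versions you would have to call types.CodeType(...) with every field spelled out).

```python
def replace_const(old, new):
    """Hypothetical decorator: rebuild a function's code object with one constant swapped."""
    def decorator(func):
        code = func.__code__
        # Build a new constants tuple; match on type too, so that
        # True is never confused with 1, for example.
        new_consts = tuple(
            new if type(c) is type(old) and c == old else c
            for c in code.co_consts
        )
        # code.replace() (Python 3.8+) returns a copy of the code object
        # with the given fields changed; code objects are immutable.
        func.__code__ = code.replace(co_consts=new_consts)
        return func
    return decorator


@replace_const("hello", "goodbye")
def greet():
    return "hello"


print(greet())
```

Nothing here goes beyond what is already exposed on function and code objects; the awkward part is only that every change means rebuilding an immutable code object by hand.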

Meanwhile, the "low-level" part of this already exists: the dis module lists all the opcodes, disassembles bytecode, represents that disassembled form, etc.
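
For illustration, a short sketch of what the dis module already exposes. The function add is made up for the example, and the exact opcode names printed will differ between CPython versions.

```python
import dis


def add(a, b):
    return a + b


# dis.opmap maps every opcode name to its numeric value.
opcode_count = len(dis.opmap)

# dis.get_instructions() yields one Instruction per disassembled operation.
instructions = list(dis.get_instructions(add))
opnames = [ins.opname for ins in instructions]

# dis.Bytecode wraps the same data; .dis() returns the formatted listing.
listing = dis.Bytecode(add).dis()

print(opcode_count, "opcodes known to this interpreter")
print(opnames)
print(listing)
```

All of this is read-only: the Instruction objects describe the bytecode, but there is no supported way to edit them and assemble the result back into a code object.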