Incorporating something like byteplay into the stdlib

tl;dr: We should turn dis.Bytecode into a builtin mutable structure similar to byteplay.Code, to make PEP 511 bytecode transformers implementable.

Why?
----

The first problem is that the API for bytecode transformers is incomplete. You don't get, e.g., varnames, so you can't even write a transformer that looks up or modifies locals. But that part is easy to fix, so I won't dwell on it.

The second problem is that bytecode is just painful to work with. The peephole optimizer deals with this by just punting and returning the original code whenever it sees anything remotely complicated (which I don't think we want to encourage for all bytecode transformers), and it's _still_ pretty hairy code. And it's the kind of code that can easily harbor hard-to-spot and harder-to-debug bugs (line numbers occasionally off on tracebacks, segfaults on about 1/255 programs that do something uncommon, that kind of fun stuff).

The compiler is already doing this work. There's no reason every bytecode processor should have to repeat all of it. I'm not so worried about performance here--technically, fixup is worst-case quadratic, but practically I doubt we spend much time on it--but simplicity. Why should everyone have to repeat dozens of lines of complicated code to write a simple 10-line transformer?

What?
-----

I played around with a few possibilities, from fixed-width bytecode and uncompressed lnotab to a public version of the internal assembler structs, but I think the best one is a flat sequence of instructions, with pseudo-instructions for labels and line numbers, and jump targets that are just references to those label instructions. The compiler, instead of generating and fixing up bytecode and then calling PyCode_Optimize, would generate this structure, call PyCode_Optimize, and then generate and fix up the result.

The object should look a lot like dis.Bytecode, which is basically an iterable of Instruction objects with some extra attributes. But not identical to dis.Bytecode:

* It needs a C API, and probably a C implementation.
* It should be a mutable sequence, not just an iterable.
* Instruction objects should be namedtuples.
* It should be possible to replace one with a plain (op, arg) tuple (as with tokenizer.Token objects).
* It should be possible to return any iterable of such tuples instead of a Bytecode (again like tokenizer.untokenize).
* It doesn't need the "low-level" stuff that dis carries around, because that low level doesn't exist yet. For example, a dis.Instruction for a LOAD_CONST has the actual const value in argval, but it also has the co_consts index in arg. We don't need the latter.
* Instead of separate opname and opval fields (and opmap and opname to map between them), we should just make the opcodes an enum. (Actually, this one could probably be pulled out into a standalone suggestion for the dis module.)

I think we should just modify dis.Bytecode to handle this use case as well as the existing one (which means it would retain things like Instruction.arg and Bytecode.first_line; they'll just be None if the object wasn't created from a live code object--just like the existing Bytecode.current_offset is None if it's created from a code or function instead of a frame or traceback).

The reason we want a mutable sequence is that it makes the most sense for processors that do random-access-y stuff. (Also, as Serhiy suggested, if the fixup code does NOP removal, this can be good for some simple processors, too--you can do bytecode[i] = (dis.NOP, None) while in the middle of iterating bytecode.)
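To make that concrete, here is a purely hypothetical sketch of what such a flat, mutable instruction sequence could look like. The list-of-tuples form, the "LABEL" pseudo-instruction, and the opcode names as plain strings are all illustrative assumptions, not the proposed API itself:

    # Hypothetical sketch only: labels are pseudo-instructions, and jumps
    # reference the label object rather than a byte offset, so inserting or
    # deleting instructions never requires recomputing jump targets.
    label = ("LABEL", None)
    instructions = [
        ("LOAD_FAST", "x"),
        ("POP_JUMP_IF_FALSE", label),   # target is the label reference
        ("LOAD_CONST", 1),
        ("RETURN_VALUE", None),
        label,
        ("LOAD_CONST", 2),
        ("RETURN_VALUE", None),
    ]
    # Mutation in place, as described above: replace an instruction with a NOP
    # while iterating, and let a later fixup pass strip it out.
    instructions[2] = ("NOP", None)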
But for simpler one-pass processors, yielding instructions is really handy:

    def constify_globals(bytecode: dis.Bytecode):
        for op, arg in bytecode:
            if op == dis.LOAD_GLOBAL:
                yield (dis.LOAD_CONST, eval(arg, globals()))
            else:
                yield (op, arg)

    def eliminate_double_jumps(bytecode: dis.Bytecode):
        for instr in bytecode:
            if dis.hasjump(instr.opcode):
                target = bytecode[instr.argval.offset + 1]
                if target.opcode in {dis.JUMP_FORWARD, dis.JUMP_ABSOLUTE}:
                    yield (instr.opcode, target.argval)
                    continue
            yield instr

Anyway, I realize this API is still a little vague, but if people don't hate the idea, I'll try to finish up both the design and the proof of concept this weekend. But obviously, any suggestions or holes poked in the idea would be even better before I do that work than after.

Byteplay?
---------

Once I started building a proof of concept, I realized what I was building is almost exactly the functionality of byteplay, but with a more dis-like interface, except without the to_code method. Of course that method is the hard part--but it's pretty much the same code I already wrote in the compiler; it's just a matter of separating it out and exposing it nicely.

Byteplay's to_code does one additional thing the compiler doesn't do: it verifies that the stack effects of the assembled code are balanced. And, while this used to be a huge pain, the 3.4+ version (which leans on the dis module) is a lot simpler. So, maybe that's worth adding too.

One thing Byteplay's from_code does that I _don't_ think we want is recursively transforming any code consts into Byteplay objects. (Because the compiler works inside-out; your nested functions are already optimized by the time you're being called.)

Anyway, the advantage here would be that import hooks and decorators can use the exact same API as PEP 511 transformers, except they have to call to_code at the end instead of just returning an object and letting the compiler call it for them. (They'll also probably have to recurse manually on (dis.Bytecode(const) for const in co_consts if isinstance(const, types.CodeType)), while PEP 511 transformers won't.) See the sketch after the Alternatives section below.

Alternatives
------------

We could just pass code objects (or all the separate pieces, instead of some of them), and then the docs could suggest using byteplay for non-trivial bytecode transformers, and then everyone will just end up using byteplay. So, what's wrong with that?

The biggest problem is that, after each new Python release, anyone using a bytecode transformer will have to wait until byteplay is updated before they can update Python.

Also, some people are going to try to do it from scratch instead of adding a dependency, and get it wrong, and publish processors that work on most code but in weird edge cases cause a SystemError or even a segfault, which will be a nightmare to debug.

And it's annoying that anyone who wants to hack on the bytecode representation pretty much has to disable the peephole optimizer.

Or we could just remove bytecode transformers from PEP 511. PEP 511 still seems worth doing to me, even if it only has AST transformers, especially since all or nearly all of the examples anyone's come up with for it are implementable (and easier to implement) at the AST level.
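Picking up the decorator idea from the "Byteplay?" section above, here is a rough, hedged sketch of how a decorator might reuse the same transformer. A mutable dis.Bytecode that accepts an iterable of (op, arg) tuples and grows a byteplay-style to_code() method is exactly the *proposed* API, so none of this runs against the current dis module; the decorator name is made up for illustration:

    import dis

    def transformed(transform):
        """Apply a PEP 511-style bytecode transformer to a single function."""
        def deco(func):
            bc = dis.Bytecode(func.__code__)       # proposed: mutable sequence
            new_bc = dis.Bytecode(transform(bc))   # proposed: accepts (op, arg) iterable
            func.__code__ = new_bc.to_code()       # proposed: byteplay-style to_code()
            # (Recursion into nested code objects in co_consts is omitted for brevity.)
            return func
        return deco

    @transformed(eliminate_double_jumps)           # transformer defined above
    def example(x):
        return x + 1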

On 02/12/2016 04:58 AM, Andrew Barnert via Python-ideas wrote:
tl;dr: Why would they, indeed? Couldn't they just import a library that does it for them? That's what libraries are for.
Are you sure the API you come up with will be good enough to be immortalized in CPython itself? Why not put it on PyPI, and only look into immortalizing it after it survives some real use (and competition)? Sure, your ideas look great now, but any new API needs some validation.
Why can't Byteplay be fixed? (Or modularized, or rewritten?)
That's a problem common to any library. You always need to wait until your dependencies are ported to a new version of a language. Byteplay looks special because it explicitly depends on an interface with no backwards compatibility guarantees. But once everyone starts writing bytecode transformers, it won't be special any more: *any* bytecode utility library will have the same problem. Do we add them all to force them to follow Python's release cycle?
Also, some people are going to try to do it from scratch instead of adding a dependency, and get it wrong, and publish processors that work on most code but in weird edge cases cause a SystemError or even a segfault which will be a nightmare to debug.
Shame on them. But is this really a reason to include one blessed convenience interface in the language itself?
And it's annoying that anyone who wants to hack on the bytecode representation pretty much has to disable the peephole optimizer.
I'm sure this individual problem can be solved in a less invasive way.
Or we could just remove bytecode transformers from PEP 511. PEP 511 still seems worth doing to me, even if it only has AST transformers, especially since all or nearly all of the examples anyone's come up with for it are implementable (and easier to implement) at the AST level.
This looks like an OK solution. It can always be added back.

On Feb 12, 2016, at 02:57, Petr Viktorin <encukou@gmail.com> wrote:
Are you sure the API you come up with will be good enough to be immortalized in CPython itself?
That's exactly why I'm sticking as close as possible to the existing dis module--which is already immortalized in CPython--and leaning on the byteplay module--which has clearly won the competition in the wild--for things that dis doesn't cover.
Why not put it on PyPI, and only look into immortalizing it after it survives some real use (and competition)?
You want me to put a module on PyPI that patches the compiler and the peephole optimizer? I doubt that's even possible, and it would be a very bad idea if it were. If you weren't reading the proposal, just skimming for individual things to disagree with, you might have missed that the main idea is to provide a useful interface for the compiler to pass to PyCode_Optimize; exposing functions to convert back and forth to that format a la byteplay is a secondary idea that adds the benefit that import hooks and decorators could then use the same interface as PEP 511 optimizers. You can't do the secondary thing without the main thing.
Anyway, the advantage here would be that import hooks and decorators can use the exact same API as PEP 511 transformers, except they have to call to_code at the end instead of just returning an object and letting the compiler call it for them. (They'll also probably have to recurse manually on (dis.Bytecode(const) for const in co_consts if isinstance(const, types.CodeType)), while PEP 511 transformers won't.)
Why can't Byteplay be fixed? (Or modularized, or rewritten?)
If the problem is the interface to the peephole optimizer and PEP 511 optimizers, fixing byteplay doesn't help that. If we did fix the PyCode_Optimize interface, then sure, instead of exposing the same code, we could keep it hidden and separately change byteplay to duplicate the same interface we could have exposed. But why? Also, byteplay is actually a very small module, and the only hard part of it is the part that keeps in sync with the dis module across each new Python version and does the same things backward-compatibly for pre-3.4 versions. There's really nothing to "modularize" there. If 3.6 had an extended dis module and C API as I've suggested, in 3.6+ it could just become a shim around what's already in the stdlib, which would then be useful for backward compat for anyone who wants to write bytecode processing import hooks or decorators that work with 3.5 and 3.6, or maybe even with 2.7 and 3.6. I think that probably _is_ worth doing, but that's not a proposal for stdlib, it's a proposal for byteplay (and only if this proposal--and PEP 511, of course--goes through).
In theory, yes. In practice, when Python 3.6 comes out, every library in my site-packages from 3.5 will continue to work, most of them without even needing a recompile, except for byteplay, and that's been the same for almost every release in Python's history. So that does make it special in practice.
Since byteplay is the only one that's still alive and being updated, and "all of them" is just one, yes, that's exactly what we do. And "forcing it to follow Python's release cycle" is not a problem. The only time there are significant updates to the library is when there's a new version of Python to deal with, or when someone finds a bug in the way it deals with the latest version of Python that wasn't discovered immediately. At any rate, if everyone begins writing bytecode transformers, that makes the existing problem worse, it doesn't solve it.
Yes. Expecting people to update the lnotab, etc., as currently designed is a bad idea, especially given the way things go wrong if they get it wrong. If we're providing an interface that exposes this stuff to normal developers and expects them to deal with it, we should have tools to make it tractable.
And it's annoying that anyone who wants to hack on the bytecode representation pretty much has to disable the peephole optimizer.
I'm sure this individual problem can be solved in a less invasive way.
I'm not. Short of rewriting the peephole optimizer to a new interface, how would you solve it? I think this is a pretty minor problem in the first place (how often do people want to hack on the bytecode representation? and how often do people do so who aren't pretty experienced with Python and CPython?), which is why I tossed it in at the end, after the real problems.
Or we could just remove bytecode transformers from PEP 511. PEP 511 still seems worth doing to me, even if it only has AST transformers, especially since all or nearly all of the examples anyone's come up with for it are implementable (and easier to implement) at the AST level.
This looks like an OK solution. It can always be added back.
Well, I assume Victor has an argument for why it's not OK, but I think it would make things simpler.

Hi, 2016-02-12 4:58 GMT+01:00 Andrew Barnert via Python-ideas <python-ideas@python.org>:
tl;dr: We should turn dis.Bytecode into a builtin mutable structure similar to byteplay.Code, to make PEP 511 bytecode transformers implementable.
Hum, it looks like your email is highly coupled to the PEP 511. First of all, I really want to support bytecode transformers because I would like to be able to disable the peephole optimizer. Having the peephole optimizer registered in code transformers, like AST transformers, makes the whole PEP more consistent. There is no longer a special case for the peephole optimizer.
I agree that we can enhance the Python stdlib to ease manipulation of bytecode, but I disagree that it's a requirement. It's ok to use an external library (like byteplay) for that.
The second problem is that bytecode is just painful to work with. The peephole optimizer deals with this by just punting and returning the original code whenever it sees anything remotely complicated (which I don't think we want to encourage for all bytecode transformers), and it's _still_ pretty hairy code. And it's the kind of code that can easily harbor hard-to-spot and harder-to-debug bugs (line numbers occasionally off on tracebacks, segfaults on about 1/255 programs that do something uncommon, that kind of fun stuff).
Sorry, I don't understand. Are you writing that the CPython peephole optimizer produces invalid code? Or are you talking about bugs in your own code? I'm not aware of bugs in the peephole optimizer.
The compiler is already doing this work. There's no reason every bytecode processor should have to repeat all of it. I'm not so worried about performance here--technically, fixup is worst-case quadratic, but practically I doubt we spend much time on it--but simplicity. Why should everyone have to repeat dozens of lines of complicated code to write a simple 10-line transformer?
Hum, are you talking about the API proposed in the PEP 511? I understand that you are saying the API only takes a whole code object as input and produces a code object as output. An optimizer usually needs a different structure to be able to modify the code. If you have multiple bytecode transformers, you have to repeat these "disassemble" and "assemble" steps, right?

I don't think that we will have plenty of bytecode optimizers in the wild. Even if two or three major bytecode optimizers become popular, are you sure that we will want to combine them? I expect that a single optimizer implements *all* optimizations. I don't see the point of running multiple optimizers to implement multiple optimization steps. For example, my fatoptimizer AST optimizer implements multiple steps, but *internally*. It is only called once on the AST.

I don't think that the performance of importing modules really matters. My PEP 511 is mostly written for compilation ahead of time, to support complex and expensive optimizers. The real problem is running a script: "python script.py" always has to execute all code transformers. For scripts, I hesitate to simply disable expensive optimizers, or maybe even disable all optimizers. For example, if a script runs in less than 50 ms, is it worth spending 10 ms to optimize it to get a speedup of 1 ms? (no)

The problem of using a specific format for bytecode rather than a code object is that we will have to maintain it. I'm not sure that all bytecode optimizers want the same internal structures. For some kinds of optimizations, a sequential list of instructions is enough. For other optimizations, you need to split the code into blocks to have a representation of the exact "control flow". I'm not sure that one structure is enough to cover all cases. So I prefer to let optimizers "disassemble" and "assemble" the bytecode themselves.

Last point: the PEP 511 has to take into account the existing peephole optimizer implemented in C. If you really want to use a different structure, you will have to reimplement the peephole optimizer with your new API. Since my target is AST, I'm not really interested by that :-)

What do you think?
I played around with a few possibilities from fixed-width bytecode and uncompressed lnotab to a public version of the internal assembler structs, but I think the best one is a flat sequence of instructions, with pseudo-instructions for labels and line numbers, and jump targets just references to those label instructions.
Could you try to reimplement the whole peephole optimizer to see if it benefits from your design? I played with bytecode in the past. In the end, I started to implement optimizations which can be implemented more simply at the AST level. Why do you prefer bytecode over AST? Your example of converting globals to constants became trivial to implement using my new ast.Constant node.
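For comparison, a minimal sketch of that AST-level approach using the ast module. The KNOWN_CONSTANTS mapping is an assumption for illustration; a real optimizer would also have to prove the names are never rebound:

    import ast

    KNOWN_CONSTANTS = {"THRESHOLD": 42}   # hypothetical "globals we may constify"

    class GlobalToConst(ast.NodeTransformer):
        def visit_Name(self, node):
            # Replace reads of known globals with a constant node.
            if isinstance(node.ctx, ast.Load) and node.id in KNOWN_CONSTANTS:
                return ast.copy_location(ast.Constant(KNOWN_CONSTANTS[node.id]), node)
            return node

    tree = ast.parse("def f():\n    return THRESHOLD * 2\n")
    tree = ast.fix_missing_locations(GlobalToConst().visit(tree))
    code = compile(tree, "<example>", "exec")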
* It needs a C API, and probably a C implementation.
I don't like extending the Python C API; it is already very large, we have too many functions :-p A C API is more expensive to maintain than a Python API. I would prefer to continue to play with an external module (hosted on PyPI) to not pay the price of maintenance!

By the way, what is the final goal? Do you plan to implement a new ultra-optimized bytecode optimizer? If yes, do you plan to integrate it into CPython? If no, I don't think that we have to pay the price of maintenance for such a "toy" project. The design of my PEP 511 is to allow support for pluggable and *external* optimizers. I don't think that any code optimizer in the wild is mature enough to enter into CPython directly.
Anyway, I realize this API is still a little vague, (...)
It doesn't fit the requirements for putting something into the Python stdlib. Usually, we experiment with stuff on PyPI, wait until it becomes mature, and then propose to integrate it. It looks like you are talking about creating a new API and directly putting it into the stdlib, right? Are you sure that it will not change in the next 2 years? Not any single minor change?
We could just pass code objects (or all the separate pieces, instead of some of them), and then the docs could suggest using byteplay for non-trivial bytecode transformers, and then everyone will just end up using byteplay.
Again, you need to elaborate your rationale. What are your use cases? Which kind of optimizations do you want to implement?
So, what's wrong with that? The biggest problem is that, after each new Python release, anyone using a bytecode transformer will have to wait until byteplay is updated before they can update Python.
Why not contribute to byteplay to support the next CPython release? I don't understand your problem here.
Or we could just remove bytecode transformers from PEP 511. PEP 511 still seems worth doing to me, even if it only has AST transformers, especially since all or nearly all of the examples anyone's come up with for it are implementable (and easier to implement) at the AST level.
I want to plug the existing peephole optimizer into my PEP 511 since it is an obvious *code* transformer. Even if changes are minor, some users want to disable it because it really changes the code. It would help code coverage for example. Victor

On Feb 12, 2016, at 04:05, Victor Stinner <victor.stinner@gmail.com> wrote:
Very much so. The basic idea could be done without PEP 511, but there wouldn't be any urgency to doing it if the only client is the peephole optimizer...
OK. But if the peephole optimizer will almost always be the only bytecode processor present, with everything else being AST processors, then it already is special, and making it more generic in a way that still doesn't encompass anything else doesn't change that, it just adds complexity. Personally, I think there probably are useful optimizations that can only be done by bytecode. But the fact that nobody has come up with any compelling examples (while there are lots of compelling examples for AST processors) might imply that we should put it off until someone needs it.
You're either answering out of order, or you missed the point of this paragraph. Your API passes only the bytecode string, consts, lnotab, and global names. There is no way to write a bytecode processor that looks up locals with that interface, because you didn't pass the locals. As I said, that's easy to fix--we could pass all 12 or so necessary parameters, or wrap them up with the 5 or so unnecessary ones in a code object one step earlier in the process and pass that, etc. But that just gets us to the second problem, which is harder to fix.
No. As I said, the peephole optimizer deals with this by punting and returning the original code whenever things get remotely complicated (e.g., long functions). I don't think we want third-party optimizers to similarly punt on anything complicated. So, they _will_ have bugs like this, unless there's some way of making it tractable for them.
Well, the API currently doesn't even take a whole code object--but other than that, yes, that's exactly what I'm saying.
You seem to be focusing on the performance issue here, while I think that's a minor consideration at best. The problem (probably) isn't that it's too costly for the CPU to run the disassembly and assembly code multiple times, but that it's too costly for the Python community to write and maintain disassembly and assembly code multiple times. I suppose if you're expecting that everyone experimenting with a new optimizer will do so by writing it as a patch to one of the three major optimizer projects rather than a standalone project, that might still solve much of the real problem. But do you really think (a) that's the way things should go, and (b) that's the way things probably _will_ go?
I didn't think the performance of optimizers would matter _at all_, which is why I was most focused on the simplicity and DRY argument. If it really is an issue for startup time for small scripts, then repeating the same disassemble/assemble steps multiple times may make that issue worse, but I don't think we'd know that without profiling. So I still think this is probably the least important of my arguments for the idea.
The problem of using a specific format for bytecode rather than a code object is that we will have to maintain it.
We're already maintaining such a format in the stdlib today, and have been for years: the dis module. The problem is that it's an immutable format, and has only a disassembler but no assembler, and no way to pass code into the assembler we already have under the covers. My proposal is essentially just to fix that. Everything else is just gravy. For example, for very little extra work, we could make the mutable dis module replace _all_ need for byteplay, so I think we should do that very little extra work--but if we don't, the core idea is still useful on its own.
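For reference, the read-only half of this already works today; this snippet uses only the existing dis API:

    import dis

    def f(x):
        return x + 1

    # dis.Bytecode is already an iterable of Instruction namedtuples; the
    # proposal is essentially to make this structure mutable and assemblable.
    for instr in dis.Bytecode(f):
        print(instr.offset, instr.opname, instr.argrepr)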
Even then, it's far easier to build a tree of blocks from the dis structure than from the bytecode-and-lnotab-and-arrays structure--and even more so to do the assembly and fixups if you're just assembling the dis structure rather than the bytecode-and-etc. structure. (And, if optimizer performance is really an issue--again, I doubt it is, but just in case--it should also be much less expensive to chain your third-party optimizer and the peephole optimizer if you don't need to do the fixups in between.)
Yes. But that's more part of the motivation than a problem. From my exploratory steps, rewriting the peephole optimizer around the dis API is less work than almost any nontrivial change to the peephole optimizer as-is (far less work than changing it to understand wordcode, which I was playing with last weekend). Also, I'm curious if there would be any measurable performance benefit from the peephole optimizer not punting on large functions. No huge benefits here, but as long as someone's willing to do the work--and I am willing, unless Serhiy (who definitely knows CPython better than me, and seems just as motivated here) wants to do it first.
Since my target is AST, I'm not really interested by that :-)
And that brings us back to the alternative idea of just not dealing with bytecode in PEP 511.
Yes, if I work on this over the weekend, reimplementing the peephole optimizer will definitely be part of the PoC. After all, we don't have a large code base of other bytecode optimizers used in the wild to play with yet. :)
I don't prefer bytecode over AST. As I've said multiple times, including in the same email you're responding to, I think 90% of the optimizations you could do in bytecode could instead be done, usually more simply (even with something like byteplay), at the AST level. The only question is whether that last 10% matters.
Imagine you hadn't thought of that optimization (not likely, since it's the first example everyone uses, but you know what I mean), and we'd released Python 3.6 with PEP 511 and without ast.Constant, and now I thought of it. How would I implement it in a third-party optimizer? I can't patch the compiler to add a new AST node. (Nobody's going to install my optimizer if the steps are "get the CPython source, apply this patch, rebuild with the same config as your existing compiler, ...") And there's no way to implement it with the AST with the existing nodes. So I'd have to write it as a bytecode transformer.

Of course once I released that to PyPI and everyone loved it and we had benchmarks to back it up, then I could come back and say, "We should add an ast.Constant node in Python 3.7, because that would let me rewrite my global optimizer as a simpler AST processor." And a few years later Python 3.6 would stop being relevant enough to keep maintaining the bytecode version.

Meanwhile, here's a completely different example: the Python compiler emits a lot of jumps to jump instructions. The peephole optimizer takes care of some, but not all, of these. Maybe with an ast.Goto node, or extra fields in the various for/try/etc. nodes that don't get generated by parse but can be added by an AST processor, etc., they could all be handled at the AST level--but I'm not sure about that, and I think it would be significantly more complicated than doing it at the bytecode level. So, it's still one of the 10% cases.

Again, I think it's possible that the Python community could live without that 10%. If the globals idea had to wait a year and a half for a new AST node to be added, and the double-jump fixes in the existing peephole optimizer were all we ever had, I think CPython would do just fine. (In fact, I think the whole mania for micro-optimizing CPython is mostly misdirected energy in the first place... But I could be wrong about that, and if other people think it's useful, I think it's fun.:))

Which is why I think maybe removing bytecode processors from PEP 511 is a real alternative. It's only if we think we _do_ need bytecode processors that we need a way to write them.
Well, if PEP 511 only supports optimizers written in Python (or written in C, but using the Python API), then the only real need for the C API is the peephole optimizer. So (following my own advice about not overgeneralizing a special case, which I should have thought of before...), I think you're right here. Leave the dis API as pure Python, and keep anything that has to be done in C private to the compiler (maybe shared with the peephole optimizer if necessary, but not exposed as a public and documented C API). So, scratch that line--which makes the proposed change a lot smaller.
I would prefer to continue to play with an external module (hosted on PyPI) to not pay the price of maintenance!
The dis module is already in the stdlib. The byteplay module is 8 years old in its current form, even older in its original form, and it's become abundantly clear that the only problem with maintaining byteplay is syncing up with changes in dis and the rest of CPython. (The dis module, on the other hand, actually _does_ change, far more often and more substantially than byteplay, and it fits into the CPython release schedule pretty nicely.)
I don't think it's a "toy" project any more than the existing dis module is. At any rate, the end goals are, in rapidly-descending order of importance:

1. Make it feasible for people to write bytecode transformers.
2. Encourage experimentation by allowing people to move mostly the same code between decorators and PEP 511 optimizers and even built in to CPython.
3. Make the compiler a little simpler, and the peephole optimizer a lot simpler.
4. Possibly improve the performance of the optimization step, and possibly improve the performance benefits of the peephole optimizer (by having it not punt on complicated functions).
5. Enable experimentation on low-level parts of CPython (e.g., the only hard part of the wordcode patch in 3.5 is the peephole optimizer--if that went away, we could have more experiments in that vein).
6. Remove the maintenance headaches for byteplay and modules that depend on it.

As I said, the importance decreases rapidly, and if we decide that bytecode optimizers aren't that important, and leave them out of PEP 511, that pretty much knocks out the top 2, which I think is more than enough to kill my proposal.
I'm talking about making the smallest possible changes to the _existing_ dis API that's _already_ in the stdlib, and basing those changes on a very mature third-party library that's long since defeated all of its competition. Meanwhile, "just put it on PyPI" isn't an option. A third-party module could monkeypatch dis, but it can't change the order in which the builtin compiler does the assembly, optimizer, and fixup steps, or change the interface to PEP 511 processors and the peephole optimizer. As I said, there _is_ the alternative of making a smaller change (like handing complete code objects to processors and then recommending byteplay), which solves not all but some of the problems for less work, and there's also the alternative of just dropping the bytecode processor API, which means most of the problems don't need to be solved, for no work.
Any kind of optimizations that need to work on bytecode. My rationale for that is that PEP 511 has a rationale for such optimizations. If that rationale isn't good enough, PEP 511 should only handle AST processors, in which case most of the problem I'm trying to solve won't exist in the first place.
So, what's wrong with that? The biggest problem is that, after each new Python release, anyone using a bytecode transformer will have to wait until byteplay is updated before they can update Python.
Why not contributing to byteplay to support the next CPython release?
I've done that before. So have other people. The problem is that someone has to notice that something about CPython bytecode and/or the dis module has changed, and figure out how byteplay needs to change to handle that, and then figure out how to make the change backward-compatibly. If it were integrated in the dis module, then every change that anyone makes would automatically be handled, because you already have to update the dis module (which is almost always very simple--and would continue to be so). And it would be the person who best understands the change making the update, instead of someone who just discovered that 3.7 makes his code raise a SystemError if run through a bytecode processor that he's been using for the last two years. Essentially, integrating this into the existing dis module means we track changes for free, and it's hard to beat free.
I don't understand your problem here.
I think that's because you've never tried to write a non-trivial bytecode processor.
If it really is a special case that needs to be handled, don't overgeneralize it to cover other cases that you're never going to need.

I think we might want those other cases. If so, we need to make them writable. If not, we shouldn't enable them half-way.

Hi,

I understand that you have 3 major points:

(1) byteplay lags behind CPython, it's difficult to maintain it
(2) you want to integrate the code features of byteplay into the dis module
(3) you want to use a new API of the dis module in the PEP 511 for bytecode transformers

For the point (1), it may be fixed by the point (2). Otherwise, I'm not interested in adding byteplay into CPython. I prefer to not promote too much the usage of bytecode transformers, since the bytecode is very low-level: it depends on the Python minor version and is not portable across implementations of Python. We want to be free to modify deeply the bytecode in CPython. Yury explained that before me ;-)

IMHO the point (2) is just fine. Go ahead!

I'm opposed to the point (3) because it would couple too much the exact implementation of bytecodes to the PEP 511 API. I tried to write an API which can be implemented by all Python implementations, not only CPython. See: https://www.python.org/dev/peps/pep-0511/#other-python-implementations IMHO a "black-box" bytecode transformer API is the best we can do to support all Python implementations.

By the way, (3) requires reimplementing the dis module in C to bootstrap Python. IMHO it will be boring to write the C code, and much more annoying to maintain it. So quickly, we will refuse any enhancement on the API. I don't think that you want that, since you want to experiment with new things, right?

It is already possible to build the API you described *on top of the PEP 511*. Such an API would probably be specific to CPython and to byteplay (or the dis module if you enhance it). Example:
---
import dis
import sys

# PEP 511 code transformer
class ByteplayRegistry:
    name = "bytecode"

    def __init__(self):
        self._transformers = []

    def register(self, name, transformer):
        # FIXME: update self.name?
        self._transformers.append(transformer)

    def disassemble(self, code):
        # FIXME: optional support for byteplay?
        return dis.dis(code)

    def assemble(self, asm, code):
        # FIXME: implement assembler
        return ...

    def code_transformer(self, code, context):
        # disassemble() and assemble() are only called once for all transformers
        asm = self.disassemble(code)
        for transformer in self._transformers:
            asm = transformer(asm, code, context)
        return self.assemble(asm, code)

def global_to_const(asm, code, context):
    # FIXME: implement optimization
    return asm

byteplay_registry = ByteplayRegistry()
byteplay_registry.register("global_to_const", global_to_const)
sys.set_code_transformers([byteplay_registry])
---

2016-02-12 20:16 GMT+01:00 Andrew Barnert <abarnert@yahoo.com>:
Your API passes only the bytecode string, consts, lnotab, and global names. There is no way to write a bytecode processor that looks up locals with that interface, because you didn't pass the locals.
Oh, it looks like you are referring to an old version of the PEP 511 which passed 5 parameters to code_transformer(). The latest version now passes a whole code object to the transformer, and the transformer must return a new code object: https://www.python.org/dev/peps/pep-0511/#code-transformer-method

I just saw that I forgot to update the example of bytecode transformer. It's now fixed. https://www.python.org/dev/peps/pep-0511/#bytecode-transformer

I also updated the implementation: https://bugs.python.org/issue26145

Victor
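For readers following along, here is a minimal sketch of a transformer under that revised interface. sys.set_code_transformers() is the hook *proposed* by PEP 511 (it does not exist in released CPython), and the no-op body is just a placeholder:

    import sys

    class NoopBytecodeTransformer:
        name = "noop"

        def code_transformer(self, code, context):
            # Receives a whole code object and must return a code object;
            # a real transformer would disassemble, rewrite, and reassemble here.
            return code

    sys.set_code_transformers([NoopBytecodeTransformer()])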

On 2016-02-11 10:58 PM, Andrew Barnert via Python-ideas wrote:
tl;dr: We should turn dis.Bytecode into a builtin mutable structure similar to byteplay.Code, to make PEP 511 bytecode transformers implementable.
Big -1 on the idea, sorry.

CPython's bytecode is an implementation detail of CPython. PyPy has some opcodes that CPython doesn't have, for example. Who knows, maybe in CPython 4.0 we won't have code objects at all :)

Adding something to the standard library means that it will be supported for years to come. It means that the code is safe to use. Which, in turn, guarantees that there will be plenty of code that depends on this new functionality. At first some of that code will be bytecode optimizers, later someone implements a LINQ-like extension, and in no time we lose our freedom to work with opcodes.

If this "new functionality" is something that depends on CPython's internals, it will only fracture the ecosystem. PyPy, or Pyston, or IronPython developers will either have to support byteplay-like stuff (which might be impossible), or explain to their users why some libraries don't work on their platform.

Yury

On Feb 12, 2016, at 12:36, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
This sounds like potentially an argument against adding bytecode processors in PEP 511.[^1] But if PEP 511 *does* add bytecode processors, I don't see how my proposal makes things any worse. Having dis (and inspect, and types.CodeType, and so on) be part of the stdlib makes it easier, not harder, to change CPython without breaking code that may need to introspect it for some reason. In the same way, having a mutable dis would make it easier, not harder, to change CPython without breaking bytecode processors. [^1]: Then again, it's just as good an argument against import hooks, and exposing the __code__ member on function objects so decorators can change it, and so on, and years with those features hasn't created a catastrophe...

On 2016-02-12 4:13 PM, Andrew Barnert wrote:
The main (and only?) motivation behind PEP 511 is the optimization of CPython. Maybe the new APIs will only be exposed at C level.
Having dis (and inspect, and types.CodeType, and so on) be part of the stdlib makes it easier, not harder, to change CPython without breaking code that may need to introspect it for some reason.
You don't need mutability for introspection.
In the same way, having a mutable dis would make it easier, not harder, to change CPython without breaking bytecode processors.
PEP 492 added a bunch of new opcodes. Serhiy is exploring an opportunity of adding a few more LOAD_CONST_N opcodes. How would a mutable byteplay-code-like object in the dis module help that?

Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough.

FWIW I have lots of experience working with bytecodes. For example, I have code in production systems in which decorators patch functions' code objects: they guard 'yield' expressions in 'finally' blocks with some function calls. So far that code has only caused me maintenance pain. It's the only code that I have to upgrade each time a new version of Python is released.
[^1]: Then again, it's just as good an argument against import hooks, and exposing the __code__ member on function objects so decorators can change it, and so on, and years with those features hasn't created a catastrophe...
Import hooks (and even AST/parse/compile) is a much more high-level API. I'm not sure we can compare them to byteplay. Yury

On 2016-02-12 4:45 PM, Yury Selivanov wrote:
The key point here is not the API of your mutable code-object abstraction. The problem is that bytecode modifiers rely on:

1. the existence of certain bytecodes;
2. knowing their precise behaviour and side-effects;
3. matching/patching/analyzing exact sequences of bytecodes.

High-level abstractions won't help with the above. Say we want to remove a few opcodes in favour of adding a few new ones. If we do that, most code optimizers will break. That's why our low-level peephole optimizer is private - we can update it ourselves when we need it. It's completely in our control.

Also, AFAIK, FAT Python analyzes/transforms AST. I'm not sure how byteplay could help FAT Python specifically.

Yury

On Feb 12, 2016, at 13:57, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Also, AFAIK, FAT Python analyzes/transforms AST. I'm not sure how byteplay could help FAT Python specifically.
So far, all three people who've responded have acted like I invented the idea that PEP 511 might include bytecode optimizers. Even Victor, who wrote the PEP. Am I going crazy here?

I'm looking right at http://www.python.org/dev/peps/pep-0511/. It has a "Usage 4" section that has a rationale for why we should allow writing bytecode optimizers in Python, and an example of something that can't be done by an AST optimizer. It has a "code_transformer() method" section, showing the API designed for writing those optimizers. It has an "API to get/set code transformers" section that explains when and how those transformers get run. It has a "Bytecode transformer" section that gives a toy example.

Assuming I'm not imagining all that, why are people demanding that I provide a rationale for why we should add bytecode optimizers, or telling me that adding bytecode optimizers isn't going to help FAT Python, etc.?

My proposal is that if we add bytecode optimizers, we should make it possible to write them. I don't need to justify that "if". If nobody thinks we should add bytecode optimizers, not even the author of the PEP that suggests them, then the answer is simple: just remove them from the PEP, and then my proposal becomes void.

On 12.02.2016 23:55, Ethan Furman wrote:
+1 from me. Despite all of Yury's concerns, I find that step logical and not overly problematic. Yury seems to be anxious about bytecode optimizers being used as one of the first-used features. However, without some proper docs this kind of feature is useless anyway. Thus, the docs just need to make it explicit what use-cases are supported (research, experimentation, no production, etc.). Furthermore, after you've gained experience with Python, you already know (aka feel) that you cannot solve most of your (production) problems by tweaking bytecode.

I, moreover, don't agree with Yury about the number of potential optimizers. Sure, in the first years there will only be a few, but long-term the Python ecosystem can only benefit from competition among bytecode optimizers. There are a lot of smart people out there who have some time to spare (as can be seen from the current dynamic debate about optimizing CPython).

Best,
Sven

On 2016-02-12 5:38 PM, Andrew Barnert wrote:
Assuming I'm not imagining all that, why are people demanding that I provide a rationale for why we should add bytecode optimizers, or telling me that adding bytecode optimizers isn't going to help FAT Python, etc.?
My proposal is that if we add bytecode optimizers, we should make it possible to write them. I don't need to justify that "if". If nobody things we should add bytecode optimizers, not even the author of the PEP that suggests them, then the answer is simple: just remove them from the PEP, and then my proposal becomes void.
Perhaps I and other people don't understand the "if we add bytecode optimizers, we should make it possible to write them" part. I don't expect to see more than 1 or 2 optimizers out there. Writing a bytecode optimizer that can yield significant speedups is an extremely challenging task. Adding a lot of new functionality to the stdlib *just* for those few optimizers doesn't make a lot of sense. Moreover, having such tools in the stdlib might cause people to start using them for things other than optimization -- something I don't like at all. We already have import hooks, AST, etc. for most such purposes. Yury

On Feb 12, 2016, at 13:45, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Have you read PEP 511? It exposes an API for adding bytecode processors, explicitly explains that the reason for this API is to allow people to write new bytecode optimizers in Python, and includes a toy example of a bytecode transformer. I'm not imagining some far-fetched idea that someone might suggest in the future, I'm responding to what's actually written in the PEP.
Having dis (and inspect, and types.CodeType, and so on) be part of the stdlib makes it easier, not harder, to change CPython without breaking code that may need to introspect it for some reason.
You don't need mutability for introspection.
Of course. When you split an analogy in half and only reply to the first half of it like this, the half-analogy has no content. So what?
In the same way, having a mutable dis would make it easier, not harder, to change CPython without breaking bytecode processors.
PEP 492 added a bunch of new opcodes. Serhiy is exploring an opportunity of adding few more LOAD_CONST_N opcodes. How would a mutable byteplay-code-like object in the dis module help that?
If someone had written a bytecode processor on top of the dis module, and wanted to update it to take advantage of LOAD_CONST_N, it would be easy to do so--even on a local copy of CPython patched with Serhiy's changes. If they'd instead written it on top of a third-party module, they'd have to wait for that module to be updated (probably after the next major version of Python comes out), or update it locally. Which one of those sounds easiest to you?
Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough.
That's an argument against the PEP 511 API for adding bytecode processors--and, again, also possibly an argument against mutable function.__code__ and so on. But how is it an argument against my proposal?
[^1]: Then again, it's just as good an argument against import hooks, and exposing the __code__ member on function objects so decorators can change it, and so on, and years with those features hasn't created a catastrophe...
Import hooks (and even AST/parse/compile) is a much more high-level API. I'm not sure we can compare them to byteplay.
You're responding selectively here. Your argument is that people shouldn't mess with bytecode. If we don't want people to mess with bytecode, we shouldn't expose bytecode to be messed with. But you can write a decorator that sets f.__code__ = types.CodeType(...) with a replaced bytecode string, and all of the details on how to do that are fully documented in the dis and inspect modules. Making it tedious and error-prone is not a good way to discourage something. Meanwhile, the "low-level" part of this already exists: the dis module lists all the opcodes, disassembles bytecode, represents that disassembled form, etc.
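As a toy illustration of that point (nothing here needs any new API; the function names are made up for the example):

    def original(x):
        return x + 1

    def replacement(x):
        return x * 2

    # __code__ is an ordinary writable attribute: swap in another function's
    # code object wholesale.  This is the blunt end of what decorators that
    # rewrite bytecode already rely on today.
    original.__code__ = replacement.__code__
    assert original(10) == 20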

On 2016-02-12 5:27 PM, Andrew Barnert wrote:
I guess I read an earlier version which was focused only on AST transformations. Maybe PEP 511 should be focused on just one thing.
Sorry, somehow I failed to read that paragraph in one piece. My bad.
My point (which is the *key* point) is that if we decide to have only LOAD_CONST_N opcodes and remove plain old LOAD_CONST -- all optimizers will break, no matter what library they use. That's just a sad reality of working on the bytecode level. For instance, PEP 492 split the WITH_CLEANUP opcode into WITH_CLEANUP_START and WITH_CLEANUP_FINISH. *Any* bytecode manipulation code that expected to see WITH_CLEANUP after SETUP_WITH *was* broken. In short: I don't want to add more stuff to CPython that can make it harder for us to modify its low-level internals.
Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough. That's an argument against the PEP 511 API for adding bytecode processors--and, again, also possibly an argument against mutable function.__code__ and so on. But how is it an argument against my proposal?
function.__code__ exists and is mutable regardless of PEP 511 and byteplay :) Let's not add it to the mix. You're right, I guess this is a common argument for both PEP 511's code_transformer and a byteplay in the stdlib.
Although function.__code__ is mutable, almost nobody actually mutates it. We have the dis module primarily for introspection and research purposes; view it as a handy tool to see how CPython really works. I'm OK if PEP 511 adds some AST transformation hooks (because AST is a higher-level abstraction). Adding code-object transformation hooks and a library to mutate (or produce new) code objects seems very wrong to me. Yury

On 2016-02-12 5:27 PM, Andrew Barnert wrote:
I guess I read an earlier version which was focused only on AST transformations. Maybe PEP 511 should be focused just on just one thing.
Sorry, somehow I failed to read that paragraph in one piece. My bad.
My point (which is the *key* point) is that if we decide to have only LOAD_CONST_N opcodes and remove plain old LOAD_CONST -- all optimizers will break, no matter what library they use. That's just a sad reality of working on the bytecode level. For instance, PEP 492 split WITH_CLEANUP opcode into WITH_CLEANUP_START and WITH_CLEANUP_FINISH. *Any* bytecode manipulation code that expected to see WITH_CLEANUP after SETUP_WITH *was* broken. In short: I don't want to add more stuff to CPython that can make it harder for us to modify its low-level internals.
Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough. That's an argument against the PEP 511 API for adding bytecode processors--and, again, also possibly an argument against mutable function.__code__ and so on. But how is it an argument against my proposal?
function.__code__ exists and mutable regardless of PEP 511 and byteplay :) Let's not add it to the mix. You're right, I guess this is a common argument for both PEP511's code_transformer and a byteplay in stdlib.
Although function.__code__ is mutable, almost nobody actually mutates it. We have dis module primarily for introspection and research purposes, view it as a handy tool to see how CPython really works. I'm OK if PEP 511 adds some AST transformation hooks (because AST is a higher-level abstraction). Adding code-object transformation hooks and a library to mutate (or produce new) code objects seems very wrong to me. Yury

On 02/12/2016 04:58 AM, Andrew Barnert via Python-ideas wrote:
tl;dr: Why would they, indeed? Couldln't just import a library that does it for them? That's what libraries are for.
Are you sure the API you come up with will be good enough to be immortalized in CPython itself? Why not put it on PyPI, and only look into immortalizing it after it survives some real use (and competition)? Sure, your ideas look great now, but any new API needs some validation.
Why can't Byteplay be fixed? (Or modularized, or rewritten?)
That's a problem common to any library. You always need to wait until your dependencies are ported to a new version of a language. Byteplay looks special because it explicitly depends on an interface with no backwards compatibility guarantees. But once everyone starts writing bytecode transformers, it won't be special any more: *any* bytecode utility library will have the same problem. Do we add them all to force them to follow Python's release cycle?
Also, some people are going to try to do it from scratch instead of adding a dependency, and get it wrong, and publish processors that work on most code but in weird edge cases cause a SystemError or even a segfault which will be a nightmare to debug.
Shame on them. But is this really a reason to include one blessed convenience interface in the language itself?
And it's annoying that anyone who wants to hack on the bytecode representation pretty much has to disable the peephole optimizer.
I'm sure this individual problem can be solved in a less invasive way.
Or we could just remove bytecode transformers from PEP 511. PEP 511 still seems worth doing to me, even if it only has AST transformers, especially since all or nearly all of the examples anyone's come up with for it are implementable (and easier to implement) at the AST level.
This looks like an OK solution. It can always be added back.

On Feb 12, 2016, at 02:57, Petr Viktorin <encukou@gmail.com> wrote:
Are you sure the API you come up with will be good enough to be immortalized in CPython itself?
That's exactly why I'm sticking as close as possible to the existing dis module--which is already immortalized in CPython--and leaning on the byteplay module--which has clearly won the competition in the wild--for things that dis doesn't cover.
Why not put it on PyPI, and only look into immortalizing it after it survives some real use (and competition)?
You want me to put a module on PyPI that patches the compiler and the peephole optimizer? I doubt that's even possible, and it would be a very bad idea if it were. If you weren't reading the proposal, just skimming for individual things to disagree with, you might have missed that the main idea is to provide a useful interface for the compiler to pass to PyCode_Optimize, and exposing functions to convert back and forth to that format a la byteplay is a secondary idea that adds the benefits that import hooks and decorators could then use the same interface as PEP 511 optimizers. You can't do the secondary thing without the main thing.
Anyway, the advantage here would be that import hooks and decorators can use the exact same API as PEP 511 transformers, except they have to call to_code at the end instead of just returning an object and letting the compiler call it for them. (They'll also probably have to recurse manually on (dis.Bytecode(const) for const in co_consts if isinstance(const, types.CodeType)), while PEP 511 transformers won't.)
Why can't Byteplay be fixed? (Or modularized, or rewritten?)
If the problem is the interface to the peephole optimizer and PEP 511 optimizers, fixing byteplay doesn't help that. If we did fix the PyCode_Optimize interface, then sure, instead of exposing the same code, we could keep it hidden and separately change byteplay to duplicate the same interface we could have exposed. But why? Also, byteplay is actually a very small module, and the only hard part of it is the part that keeps in sync with the dis module across each new Python version and does the same things backward-compatibly for pre-3.4 versions. There's really nothing to "modularize" there. If 3.6 had an extended dis module and C API as I've suggested, in 3.6+ it could just become a shim around what's already in the stdlib, which would then be useful for backward compat for anyone who wants to write bytecode processing import hooks or decorators that work with 3.5 and 3.6, or maybe even with 2.7 and 3.6. I think that probably _is_ worth doing, but that's not a proposal for stdlib, it's a proposal for byteplay (and only if this proposal--and PEP 511, of course--goes through).
In theory, yes. In practice, when Python 3.6 comes out, every library in my site-packages from 3.5 will continue to work, most of them without even needing a recompile, except for byteplay, and that's been the same for almost every release in Python's history. So that does make it special in practice.
Since byteplay is the only one that's still alive and being updated, and "all of them" is just one, yes, that's exactly what we do. And "forcing it to follow Python's release cycle" is not a problem. The only time there are significant updates to the library is when there's a new version of Python to deal with, or when someone finds a bug in the way it deals with the latest version of Python that wasn't discovered immediately. At any rate, if everyone begins writing bytecode transformers, that makes the existing problem worse, it doesn't solve it.
Yes. Expecting people to update the lnotab, etc., as currently designed is a bad idea, especially given the way things go wrong if they get it wrong. If we're providing an interface that exposes this stuff to normal developers and expects them to deal with it, we should have tools to make it tractable.
And it's annoying that anyone who wants to hack on the bytecode representation pretty much has to disable the peephole optimizer.
I'm sure this individual problem can be solved in a less invasive way.
I'm not. Short of rewriting the peephole optimizer to a new interface, how would you solve it? I think this is a pretty minor problem in the first place (how often do people want to hack on the bytecode representation? and how often do people do so who aren't pretty experienced with Python and CPython?), which is why I tossed it in at the end, after the real problems.
Or we could just remove bytecode transformers from PEP 511. PEP 511 still seems worth doing to me, even if it only has AST transformers, especially since all or nearly all of the examples anyone's come up with for it are implementable (and easier to implement) at the AST level.
This looks like an OK solution. It can always be added back.
Well, I assume Victor has an argument for why it's not OK, but I think it would make things simpler.

Hi, 2016-02-12 4:58 GMT+01:00 Andrew Barnert via Python-ideas <python-ideas@python.org>:
tl;dr: We should turn dis.Bytecode into a builtin mutable structure similar to byteplay.Code, to make PEP 511 bytecode transformers implementable.
Hum, it looks like your email is highly coupled to the PEP 511. First of all, I really want to support bytecode transformers because I would like to be able to disable the peephole optimizer. Having the peephole optimizer registered as a code transformer, like AST transformers, makes the whole PEP more consistent. There is no longer a special case for the peephole optimizer.
I agree that we can enhance the Python stdlib to ease manipulation of bytecode, but I disagree that it's a requirement. It's ok to use an external library (like byteplay) for that.
The second problem is that bytecode is just painful to work with. The peephole optimizer deals with this by just punting and returning the original code whenever it sees anything remotely complicated (which I don't think we want to encourage for all bytecode transformers), and it's _still_ pretty hairy code. And it's the kind of code that can easily harbor hard-to-spot and harder-to-debug bugs (line numbers occasionally off on tracebacks, segfaults on about 1/255 programs that do something uncommon, that kind of fun stuff).
Sorry, I don't understand. Are you saying that the CPython peephole optimizer produces invalid code? Or are you talking about bugs in your own code? I'm not aware of any bugs in the peephole optimizer.
The compiler is already doing this work. There's no reason every bytecode processor should have to repeat all of it. I'm not so worried about performance here--technically, fixup is worst-case quadratic, but practically I doubt we spend much time on it--but simplicity. Why should everyone have to repeat dozens of lines of complicated code to write a simple 10-line transformer?
Hum, are you talking about the API proposed in the PEP 511? I understand that you are saying the API only takes a whole code object as input and produces a code object as output. An optimizer usually needs a different structure to be able to modify the code. If you have multiple bytecode transformers, you have to repeat these "disassemble" and "assemble" steps, right?

I don't think that we will have plenty of bytecode optimizers in the wild. Even if two or three major bytecode optimizers become popular, are you sure that we will want to combine them? I expect that a single optimizer implements *all* optimizations. I don't see the point of running multiple optimizers to implement multiple optimization steps. For example, my fatoptimizer AST optimizer implements multiple steps, but *internally*. It is only called once on the AST.

I don't think that the performance of importing modules really matters. My PEP 511 is mostly written for compilation ahead of time, to support complex and expensive optimizers. The real problem is running a script: "python script.py" always has to execute all code transformers. For scripts, I hesitate to simply disable expensive optimizers, or maybe even disable all optimizers. For example, if a script runs in less than 50 ms, is it worth spending 10 ms to optimize it for a speedup of 1 ms? (no)

The problem of using a specific format for bytecode rather than a code object is that we will have to maintain it. I'm not sure that all bytecode optimizers want the same internal structures. For some kinds of optimizations, a sequential list of instructions is enough. For other optimizations, you need to split the code into blocks to get a representation of the exact "control flow". I'm not sure that one structure is enough to cover all cases. So I prefer to let optimizers "disassemble" and "assemble" the bytecode themselves.

Last point: the PEP 511 has to take into account the existing peephole optimizer implemented in C. If you really want to use a different structure, you will have to reimplement the peephole optimizer with your new API. Since my target is AST, I'm not really interested in that :-) What do you think?
I played around with a few possibilities from fixed-width bytecode and uncompressed lnotab to a public version of the internal assembler structs, but I think the best one is a flat sequence of instructions, with pseudo-instructions for labels and line numbers, and jump targets just references to those label instructions.
Could you try to reimplement the whole peephole optimizer to see if it benefits from your design? I played with bytecode in the past. In the end, I started implementing optimizations which can be implemented more simply at the AST level. Why do you prefer bytecode over AST? Your example of converting globals to constants became trivial to implement using my new ast.Constant node.
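For example, a toy sketch of what the globals-to-constants idea looks like at the AST level, assuming the ast.Constant node and a dict of globals you have somehow decided are safe to inline (the real fatoptimizer adds guards, invalidation, and much more):
---
import ast

class GlobalToConst(ast.NodeTransformer):
    """Replace loads of known global names with constants."""
    def __init__(self, known_globals):
        self.known_globals = known_globals

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load) and node.id in self.known_globals:
            const = ast.Constant(self.known_globals[node.id])
            return ast.copy_location(const, node)
        return node

tree = ast.parse("def area(r): return PI * r * r")
tree = ast.fix_missing_locations(GlobalToConst({"PI": 3.14159}).visit(tree))
code = compile(tree, "<example>", "exec")
---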
* It needs a C API, and probably a C implementation.
I don't like extending the Python C API: it is already very large, and we have too many functions :-p A C API is more expensive to maintain than a Python API. I would prefer to continue to play with an external module (hosted on PyPI), so as not to pay the price of maintenance!

By the way, what is the final goal? Do you plan to implement a new, ultra-optimized bytecode optimizer? If yes, do you plan to integrate it into CPython? If no, I don't think that we have to pay the price of maintenance for such a "toy" project. The design of my PEP 511 is to support pluggable and *external* optimizers. I don't think that any code optimizer in the wild is mature enough to enter CPython directly.
Anyway, I realize this API is still a little vague, (...)
It doesn't fit the requirements for putting something into the Python stdlib. Usually, we experiment with stuff on PyPI, wait until it becomes mature, and then propose to integrate it. It looks like you are talking about creating a new API and putting it directly into the stdlib, right? Are you sure that it will not change in the next 2 years? Not even a single minor change?
We could just pass code objects (or all the separate pieces, instead of some of them), and then the docs could suggest using byteplay for non-trivial bytecode transformers, and then everyone will just end up using byteplay.
Again, you need to elaborate your rationale. What are your use cases? Which kind of optimizations do you want to implement?
So, what's wrong with that? The biggest problem is that, after each new Python release, anyone using a bytecode transformer will have to wait until byteplay is updated before they can update Python.
Why not contribute to byteplay to support the next CPython release? I don't understand your problem here.
Or we could just remove bytecode transformers from PEP 511. PEP 511 still seems worth doing to me, even if it only has AST transformers, especially since all or nearly all of the examples anyone's come up with for it are implementable (and easier to implement) at the AST level.
I want to plug the existing peephole optimizer into my PEP 511 since it is an obvious *code* transformer. Even if its changes are minor, some users want to disable it because it really does change the code. It would help code coverage, for example. Victor

On Feb 12, 2016, at 04:05, Victor Stinner <victor.stinner@gmail.com> wrote:
Very much so. The basic idea could be done without PEP 511, but there wouldn't be any urgency to doing it if the only client is the peephole optimizer...
OK. But if the peephole optimizer will almost always be the only bytecode processor present, with everything else being AST processors, then it already is special, and making it more generic in a way that still doesn't encompass anything else doesn't change that, it just adds complexity. Personally, I think there probably are useful optimizations that can only be done by bytecode. But the fact that nobody has come up with any compelling examples (while there are lots of compelling examples for AST processors) might imply that we should put it off until someone needs it.
You're either answering out of order, or you missed the point of this paragraph. Your API passes only the bytecode string, consts, lnotab, and global names. There is no way to write a bytecode processor that looks up locals with that interface, because you didn't pass the locals. As I said, that's easy to fix--we could pass all 12 or so necessary parameters, or wrap them up with the 5 or so unnecessary ones in a code object one step earlier in the process and pass that, etc. But that just gets us to the second problem, which is harder to fix.
No. As I said, the peephole optimizer deals with this by punting and returning the original code whenever things get remotely complicated (e.g., long functions). I don't think we want third-party optimizers to similarly punt on anything complicated. So, they _will_ have bugs like this, unless there's some way of making it tractable for them.
Well, the API currently doesn't even take a whole code object--but other than that, yes, that's exactly what I'm saying.
You seem to be focusing on the performance issue here, while I think that's a minor consideration at best. The problem (probably) isn't that it's too costly for the CPU to run the disassembly and assembly code multiple times, but that it's too costly for the Python community to write and maintain disassembly and assembly code multiple times. I suppose if you're expecting that everyone experimenting with a new optimizer will do so by writing it as a patch to one of the three major optimizer projects rather than a standalone project, that might still solve much of the real problem. But do you really think (a) that's the way things should go, and (b) that's the way things probably _will_ go?
I didn't think the performance of optimizers would matter _at all_, which is why I was most focused on the simplicity and DRY argument. If it really is an issue for startup time for small scripts, then repeating the same disassemble/assemble steps multiple times may make that issue worse, but I don't think we'd know that without profiling. So I still think this is probably the least important of my arguments for the idea.
The problem of using a specific format for bytecode rather than a code object is that we will have to maintain it.
We're already maintaining such a format in the stdlib today, and have been for years: the dis module. The problem is that it's an immutable format, and has only a disassembler but no assembler, and no way to pass code into the assembler we already have under the covers. My proposal is essentially just to fix that. Everything else is just gravy. For example, for very little extra work, we could make the mutable dis module replace _all_ need for byteplay, so I think we should do that very little extra work--but if we don't, the core idea is still useful on its own.
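To spell out what's already there and what's missing, using only the dis module as it exists today:
---
import dis

def f(x):
    return x + 1

bc = dis.Bytecode(f)             # exists today: a read-only, iterable view
for instr in bc:
    print(instr.opname, instr.argval)

# What this proposal adds (hypothetical; neither line works today):
# bc[i] = (dis.NOP, None)        # mutate the sequence in place
# new_code = bc.to_code()        # assemble + fix up back into a code object
---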
Even then, it's far easier to build a tree of blocks from the dis structure than from the bytecode-and-lnotab-and-arrays structure--and even more so to do the assembly and fixups if you're just assembling the dis structure rather than the bytecode-and-etc. structure. (And, if optimizer performance is really an issue--again, I doubt it is, but just in case--it should also be much less expensive to chain your third-party optimizer and the peephole optimizer if you don't need to do the fixups in between.)
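For example, here's a rough sketch of block splitting on top of today's dis output; a real CFG builder would also track exception blocks and fallthrough edges, but the shape is the same:
---
import dis

def basic_blocks(code):
    """Split a code object's instructions into basic blocks, using only
    public dis APIs: a new block starts at every jump target, and a block
    ends after every jump, return, or raise."""
    blocks, current = [], []
    jump_ops = set(dis.hasjabs) | set(dis.hasjrel)
    for instr in dis.get_instructions(code):
        if instr.is_jump_target and current:
            blocks.append(current)
            current = []
        current.append(instr)
        if instr.opcode in jump_ops or instr.opname in ("RETURN_VALUE", "RAISE_VARARGS"):
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks
---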
Yes. But that's more part of the motivation than a problem. From my exploratory steps, rewriting the peephole optimizer around the dis API is less work than almost any nontrivial change to the peephole optimizer as-is (far less work than changing it to understand wordcode, which I was playing with last weekend). Also, I'm curious if there would be any measurable performance benefit from the peephole optimizer not punting on large functions. No huge benefits here, but as long as someone's willing to do the work--and I am willing, unless Serhiy (who definitely knows CPython better than me, and seems just as motivated here) wants to do it first.
Since my target is AST, I'm not really interested by that :-)
And that brings us back to the alternative idea of just not dealing with bytecode in PEP 511.
Yes, if I work on this over the weekend, reimplementing the peephole optimizer will definitely be part of the PoC. After all, we don't have a large code base of other bytecode optimizers used in the wild to play with yet. :)
I don't prefer bytecode over AST. As I've said multiple times, including in the same email you're responding to, I think 90% of the optimizations you could do in bytecode could instead be done, usually more simply (even with something like byteplay), at the AST level. The only question is whether that last 10% matters.
Imagine you hadn't thought of that optimization (not likely, since it's the first example everyone uses, but you know what I mean), and we'd released Python 3.6 with PEP 511 and without ast.Constant, and now I thought of it. How would I implement it in a third-party optimizer? I can't patch the compiler to add a new AST node. (Nobody's going to install my optimizer if the steps are "get the CPython source, apply this patch, rebuild with the same config as your existing compiler, ...") And there's no way to implement it with the AST with the existing nodes. So I'd have to write it as a bytecode transformer.

Of course once I released that to PyPI and everyone loved it and we had benchmarks to back it up, then I could come back and say, "We should add an ast.Constant node in Python 3.7, because that would let me rewrite my global optimizer as a simpler AST processor." And a few years later Python 3.6 would stop being relevant enough to keep maintaining the bytecode version.

Meanwhile, here's a completely different example: the Python compiler emits a lot of jumps to jump instructions. The peephole optimizer takes care of some, but not all, of these. Maybe with an ast.Goto node, or extra fields in the various for/try/etc. nodes that don't get generated by parse but can be added by an AST processor, etc., they could all be handled at the AST level--but I'm not sure about that, and I think it would be significantly more complicated than doing it at the bytecode level. So, it's still one of the 10% cases.

Again, I think it's possible that the Python community could live without that 10%. If the globals idea had to wait a year and a half for a new AST node to be added, and the double-jump fixes in the existing peephole optimizer were all we ever had, I think CPython would do just fine. (In fact, I think the whole mania for micro-optimizing CPython is mostly misdirected energy in the first place... But I could be wrong about that, and if other people think it's useful, I think it's fun.:))

Which is why I think maybe removing bytecode processors from PEP 511 is a real alternative. It's only if we think we _do_ need bytecode processors that we need a way to write them.
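As a concrete illustration of the jump-to-jump case above, here's a small sketch using only public dis APIs to find jumps whose target is itself another jump--the kind of pattern a bytecode-level pass can see trivially and an AST-level pass can't:
---
import dis

def jumps_to_jumps(code):
    """Return (jump, target) pairs where a jump instruction lands directly
    on another jump instruction. For jump instructions, dis resolves argval
    to the absolute target offset, so we can look the target up by offset."""
    instrs = list(dis.get_instructions(code))
    by_offset = {instr.offset: instr for instr in instrs}
    jump_ops = set(dis.hasjabs) | set(dis.hasjrel)
    pairs = []
    for instr in instrs:
        if instr.opcode in jump_ops:
            target = by_offset.get(instr.argval)
            if target is not None and target.opcode in jump_ops:
                pairs.append((instr, target))
    return pairs
---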
Well, if PEP 511 only supports optimizers written in Python (or written in C, but using the Python API), then the only real need for the C API is the peephole optimizer. So (following my own advice about not overgeneralizing a special case, which I should have thought of before...), I think you're right here. Leave the dis API as pure Python, and keep anything that has to be done in C private to the compiler (maybe shared with the peephole optimizer if necessary, but not exposed as a public and documented C API). So, scratch that line--which makes the proposed change a lot smaller.
I would prefer to continue to play with an external module (hosted on PyPI) to not pay the price of maintenance!
The dis module is already in the stdlib. The byteplay module is 8 years old in its current form, even older in its original form, and it's become abundantly clear that the only problem with maintaining byteplay is syncing up with changes in dis and the rest of CPython. (The dis module, on the other hand, actually _does_ change, far more often and more substantially than byteplay, and it fits into the CPython release schedule pretty nicely.)
I don't think it's a "toy" project any more than the existing dis module is. At any rate, the end goals are, in rapidly-descending order of importance: 1. Make it feasible for people to write bytecode transformers. 2. Encourage experimentation by allowing people to move mostly the same code between decorators and PEP 511 optimizers and even built in to CPython. 3. Make the compiler a little simpler, and the peephole optimizer a lot simpler. 4. Possibly improve the performance of the optimization step, and possibly improve the performance benefits of the peephole optimizer (by having it not punt on complicated functions). 5. Enable experimentation on low-level parts of CPython (e.g., the only hard part of the wordcode patch in 3.5 is the peephole optimizer--if that went away, we could have more experiments in that vein). 6. Remove the maintenance headaches for byteplay and modules that depend on it. As I said, the importance decreases rapidly, and if we decide that bytecode optimizers aren't that important, and leave them out of PEP 511, that pretty much knocks out the top 2, which I think is more than enough to kill my proposal.
I'm talking about making the smallest possible changes to the _existing_ dis API that's _already_ in the stdlib, and basing those changes on a very mature third-party library that's long since defeated all of its competition. Meanwhile, "just put it on PyPI" isn't an option. A third-party module could monkeypatch dis, but it can't change the order in which the builtin compiler does the assembly, optimizer, and fixup steps, or change the interface to PEP 511 processors and the peephole optimizer. As I said, there _is_ the alternative of making a smaller change (like handing complete code objects to processors and then recommending byteplay), which solves not all but some of the problems for less work, and there's also the alternative of just dropping the bytecode processor API, which means most of the problems don't need to be solved, for no work.
Any kind of optimizations that need to work on bytecode. My rationale for that is that PEP 511 has a rationale for such optimizations. If that rationale isn't good enough, PEP 511 should only handle AST processors, in which case most of the problem I'm trying to solve won't exist in the first place.
So, what's wrong with that? The biggest problem is that, after each new Python release, anyone using a bytecode transformer will have to wait until byteplay is updated before they can update Python.
Why not contributing to byteplay to support the next CPython release?
I've done that before. So have other people. The problem is that someone has to notice that something about CPython bytecode and/or the dis module has changed, and figure out how byteplay needs to change to handle that, and then figure out how to make the change backward-compatibly. If it were integrated in the dis module, then every change that anyone makes would automatically be handled, because you already have to update the dis module (which is almost always very simple--and would continue to be so). And it would be the person who best understands the change making the update, instead of someone who just discovered that 3.7 makes his code raise a SystemError if run through a bytecode processor that he's been using for the last two years. Essentially, integrating this into the existing dis module means we track changes for free, and it's hard to beat free.
I don't understand your problem here.
I think that's because you've never tried to write a non-trivial bytecode processor.
If it really is a special case that needs to be handled, don't over generalize it to cover other cases that you're never going to need. I think we might want those other cases. If so, we need to make them writable. If not, we shouldn't enable them half-way.

Hi,

I understand that you have 3 major points:
(1) byteplay lags behind CPython; it's difficult to maintain it
(2) you want to integrate the code features of byteplay into the dis module
(3) you want to use a new API of the dis module in the PEP 511 for bytecode transformers

For the point (1), it may be fixed by the point (2). Otherwise, I'm not interested in adding byteplay into CPython. I prefer not to promote the usage of bytecode transformers too much, since the bytecode is very low-level: it depends on the Python minor version and is not portable across implementations of Python. We want to be free to modify the bytecode deeply in CPython. Yury explained that before me ;-)

IMHO the point (2) is just fine. Go ahead!

I'm opposed to the point (3) because it would couple the PEP 511 API too tightly to the exact implementation of bytecode. I tried to write an API which can be implemented by all Python implementations, not only CPython. See: https://www.python.org/dev/peps/pep-0511/#other-python-implementations IMHO a "black-box" bytecode transformer API is the best we can do to support all Python implementations. By the way, (3) requires reimplementing the dis module in C to bootstrap Python. IMHO it will be boring to write the C code, and even more annoying to maintain it. So, quickly, we would start refusing any enhancement to the API. I don't think that you want that, since you want to experiment with new things, right?

It is already possible to build the API you described *on top of the PEP 511*. Such an API would probably be specific to CPython and to byteplay (or the dis module if you enhance it). Example:
---
import dis
import sys

# PEP 511 code transformer
class ByteplayRegistry:
    name = "bytecode"

    def __init__(self):
        self._transformers = []

    def register(self, name, transformer):
        # FIXME: update self.name?
        self._transformers.append(transformer)

    def disassemble(self, code):
        # FIXME: optional support for byteplay?
        # (dis.dis() only prints; get_instructions() returns data we can pass around)
        return list(dis.get_instructions(code))

    def assemble(self, asm, code):
        # FIXME: implement assembler
        return ...

    def code_transformer(self, code, context):
        # disassemble() and assemble() are only called once for all transformers
        asm = self.disassemble(code)
        for transformer in self._transformers:
            asm = transformer(asm, code, context)
        return self.assemble(asm, code)

def global_to_const(asm, code, context):
    # FIXME: implement optimization
    return asm

byteplay_registry = ByteplayRegistry()
byteplay_registry.register("global_to_const", global_to_const)
sys.set_code_transformers([byteplay_registry])
---
2016-02-12 20:16 GMT+01:00 Andrew Barnert <abarnert@yahoo.com>:
Your API passes only the bytecode string, consts, lnotab, and global names. There is no way to write a bytecode processor that looks up locals with that interface, because you didn't pass the locals.
Oh, it looks like you are referring to an old version of the PEP 511, which passed 5 parameters to code_transformer(). The latest version now passes a whole code object to the transformer, and the transformer must return a new code object: https://www.python.org/dev/peps/pep-0511/#code-transformer-method I just saw that I forgot to update the example of bytecode transformer. It's now fixed. https://www.python.org/dev/peps/pep-0511/#bytecode-transformer I also updated the implementation: https://bugs.python.org/issue26145 Victor

On 2016-02-11 10:58 PM, Andrew Barnert via Python-ideas wrote:
tl;dr: We should turn dis.Bytecode into a builtin mutable structure similar to byteplay.Code, to make PEP 511 bytecode transformers implementable.
Big -1 on the idea, sorry. CPython's bytecode is an implementation detail of CPython. PyPy has some opcodes that CPython doesn't have, for example. Who knows, maybe in CPython 4.0 we won't have code objects at all :) Adding something to the standard library means that it will be supported for years to come. It means that the code is safe to use. Which, in turn, guarantees that there will be plenty of code that depends on this new functionality. At first some of that code will be bytecode optimizers, later someone implements a LINQ-like extension, and in no time we lose our freedom to work with opcodes. If this "new functionality" is something that depends on CPython's internals, it will only fracture the ecosystem. PyPy, or Pyston, or IronPython developers will either have to support byteplay-like stuff (which might be impossible), or explain to their users why some libraries don't work on their platform. Yury

On Feb 12, 2016, at 12:36, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
This sounds like potentially an argument against adding bytecode processors in PEP 511.[^1] But if PEP 511 *does* add bytecode processors, I don't see how my proposal makes things any worse. Having dis (and inspect, and types.CodeType, and so on) be part of the stdlib makes it easier, not harder, to change CPython without breaking code that may need to introspect it for some reason. In the same way, having a mutable dis would make it easier, not harder, to change CPython without breaking bytecode processors. [^1]: Then again, it's just as good an argument against import hooks, and exposing the __code__ member on function objects so decorators can change it, and so on, and years with those features hasn't created a catastrophe...

On 2016-02-12 4:13 PM, Andrew Barnert wrote:
The main (and only?) motivation behind PEP 511 is the optimization of CPython. Maybe the new APIs will only be exposed at C level.
Having dis (and inspect, and types.CodeType, and so on) be part of the stdlib makes it easier, not harder, to change CPython without breaking code that may need to introspect it for some reason.
You don't need mutability for introspection.
In the same way, having a mutable dis would make it easier, not harder, to change CPython without breaking bytecode processors.
PEP 492 added a bunch of new opcodes. Serhiy is exploring the possibility of adding a few more LOAD_CONST_N opcodes. How would a mutable byteplay-code-like object in the dis module help that? Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough. FWIW I have lots of experience working with bytecodes. For example, I have code in production systems in which decorators patch functions' code objects: they guard 'yield' expressions in 'finally' blocks with some function calls. So far that code has only caused me maintenance pain. It's the only code that I have to upgrade each time a new version of Python is released.
[^1]: Then again, it's just as good an argument against import hooks, and exposing the __code__ member on function objects so decorators can change it, and so on, and years with those features hasn't created a catastrophe...
Import hooks (and even AST/parse/compile) is a much more high-level API. I'm not sure we can compare them to byteplay. Yury

On 2016-02-12 4:45 PM, Yury Selivanov wrote:
The key point here is not the API of your mutable code-object abstraction. The problem is that bytecode modifiers rely on:
1. the existence of certain bytecodes;
2. knowing their precise behaviour and side effects;
3. matching/patching/analyzing exact sequences of bytecodes.
High-level abstractions won't help with the above. Say we want to remove a few opcodes in favour of adding a few new ones. If we do that, most code optimizers will break. That's why our low-level peephole optimizer is private--we can update it ourselves when we need to. It's completely in our control. Also, AFAIK, FAT Python analyzes/transforms AST. I'm not sure how byteplay could help FAT Python specifically. Yury

On Feb 12, 2016, at 13:57, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Also, AFAIK, FAT Python analyzes/transforms AST. I'm not sure how byteplay could help FAT Python specifically.
So far, all three people who've responded have acted like I invented the idea that PEP 511 might include bytecode optimizers. Even Victor, who wrote the PEP. Am I going crazy here? I'm looking right at http://www.python.org/dev/peps/pep-0511/. It has a "Usage 4" section that has a rationale for why we should allow writing bytecode optimizers in Python, and an example of something that can't be done by an AST optimizer. It has a "code_transformer() method" section, showing the API designed for writing those optimizers. It has an "API to get/set code transformers" section that explains when and how those transformers get run. It has a "Bytecode transformer" section that gives a toy example. Assuming I'm not imagining all that, why are people demanding that I provide a rationale for why we should add bytecode optimizers, or telling me that adding bytecode optimizers isn't going to help FAT Python, etc.? My proposal is that if we add bytecode optimizers, we should make it possible to write them. I don't need to justify that "if". If nobody thinks we should add bytecode optimizers, not even the author of the PEP that suggests them, then the answer is simple: just remove them from the PEP, and then my proposal becomes void.

On 12.02.2016 23:55, Ethan Furman wrote:
+1 from me. Despite all of Yury's concerns, I find that step logical and not overly problematic. Yury seems to be anxious about bytecode optimizers being used as one of the first-used features. However, without some proper docs this kind of feature is useless anyway. Thus, the docs just need to make it explicit what use-cases are supported (research, experimentation, no production, etc.). Furthermore, after you've gained experience with Python, you already know (aka feel) that you cannot solve most of your (production) problems by tweaking bytecode. I, moreover, don't agree with Yury about the number of potential optimizers. Sure, in the first years there will only be a few, but long-term the Python ecosystem can only benefit from competition among bytecode optimizers. There are a lot of smart people out there who have some time to spare (as can be seen from the current dynamic debate about optimizing CPython). Best, Sven

On 2016-02-12 5:38 PM, Andrew Barnert wrote:
Assuming I'm not imagining all that, why are people demanding that I provide a rationale for why we should add bytecode optimizers, or telling me that adding bytecode optimizers isn't going to help FAT Python, etc.?
My proposal is that if we add bytecode optimizers, we should make it possible to write them. I don't need to justify that "if". If nobody thinks we should add bytecode optimizers, not even the author of the PEP that suggests them, then the answer is simple: just remove them from the PEP, and then my proposal becomes void.
Perhaps I and other people don't understand the "if we add bytecode optimizers, we should make it possible to write them" part. I don't expect to see more than 1 or 2 optimizers out there. Writing a bytecode optimizer that can yield significant speedups is an extremely challenging task. Adding a lot of new functionality to the stdlib *just* for those few optimizers doesn't make a lot of sense. Moreover, having such tools in the stdlib might cause people to start using them for things other than optimization -- something I don't like at all. We already have import hooks, AST, etc. for most of such purposes. Yury

On Feb 12, 2016, at 13:45, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Have you read PEP 511? It exposes an API for adding bytecode processors, explicitly explains that the reason for this API is to allow people to write new bytecode optimizers in Python, and includes a toy example of a bytecode transformer. I'm not imagining some far-fetched idea that someone might suggest in the future, I'm responding to what's actually written in the PEP.
Having dis (and inspect, and types.CodeType, and so on) be part of the stdlib makes it easier, not harder, to change CPython without breaking code that may need to introspect it for some reason.
You don't need mutability for introspection.
Of course. When you split an analogy in half and only reply to the first half of it like this, the half-analogy has no content. So what?
In the same way, having a mutable dis would make it easier, not harder, to change CPython without breaking bytecode processors.
PEP 492 added a bunch of new opcodes. Serhiy is exploring an opportunity of adding few more LOAD_CONST_N opcodes. How would a mutable byteplay-code-like object in the dis module help that?
If someone had written a bytecode processor on top of the dis module, and wanted to update it to take advantage of LOAD_CONST_N, it would be easy to do so--even on a local copy of CPython patched with Serhiy's changes. If they'd instead written it on top of a third-party module, they'd have to wait for that module to be updated (probably after the next major version of Python comes out), or update it locally. Which one of those sounds easiest to you?
Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough.
That's an argument against the PEP 511 API for adding bytecode processors--and, again, also possibly an argument against mutable function.__code__ and so on. But how is it an argument against my proposal?
[^1]: Then again, it's just as good an argument against import hooks, and exposing the __code__ member on function objects so decorators can change it, and so on, and years with those features hasn't created a catastrophe...
Import hooks (and even AST/parse/compile) is a much more high-level API. I'm not sure we can compare them to byteplay.
You're responding selectively here. Your argument is that people shouldn't mess with bytecode. If we don't want people to mess with bytecode, we shouldn't expose bytecode to be messed with. But you can write a decorator that sets f.__code__ = types.CodeType(...) with a replaced bytecode string, and all of the details on how to do that are fully documented in the dis and inspect modules. Making it tedious and error-prone is not a good way to discourage something. Meanwhile, the "low-level" part of this already exists: the dis module lists all the opcodes, disassembles bytecode, represents that disassembled form, etc.
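For example, a minimal sketch of that kind of decorator as you'd write it today--even this no-op rebuild needs the full CodeType constructor, whose argument list changes between Python versions (which is exactly the tedious, error-prone part):
---
import types

def rebuild_code(func):
    # Round-trip the function's code object through types.CodeType without
    # changing anything; a real decorator would patch co_code (and probably
    # co_consts, co_names, co_lnotab, ...) before rebuilding.
    # Constructor argument order shown here is the current (3.5-era) one.
    c = func.__code__
    func.__code__ = types.CodeType(
        c.co_argcount, c.co_kwonlyargcount, c.co_nlocals, c.co_stacksize,
        c.co_flags, c.co_code, c.co_consts, c.co_names, c.co_varnames,
        c.co_filename, c.co_name, c.co_firstlineno, c.co_lnotab,
        c.co_freevars, c.co_cellvars)
    return func

@rebuild_code
def f(x):
    return x + 1
---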

On 2016-02-12 5:27 PM, Andrew Barnert wrote:
I guess I read an earlier version which was focused only on AST transformations. Maybe PEP 511 should be focused on just one thing.
Sorry, somehow I failed to read that paragraph in one piece. My bad.
My point (which is the *key* point) is that if we decide to have only LOAD_CONST_N opcodes and remove plain old LOAD_CONST -- all optimizers will break, no matter what library they use. That's just a sad reality of working on the bytecode level. For instance, PEP 492 split WITH_CLEANUP opcode into WITH_CLEANUP_START and WITH_CLEANUP_FINISH. *Any* bytecode manipulation code that expected to see WITH_CLEANUP after SETUP_WITH *was* broken. In short: I don't want to add more stuff to CPython that can make it harder for us to modify its low-level internals.
Interacting with bytecode in Python is generally considered unsafe, and used mostly for the purposes of experimentation, for which a PyPI module is enough. That's an argument against the PEP 511 API for adding bytecode processors--and, again, also possibly an argument against mutable function.__code__ and so on. But how is it an argument against my proposal?
function.__code__ exists and is mutable regardless of PEP 511 and byteplay :) Let's not add it to the mix. You're right, I guess this is a common argument for both PEP 511's code_transformer and a byteplay in the stdlib.
Although function.__code__ is mutable, almost nobody actually mutates it. We have the dis module primarily for introspection and research purposes; view it as a handy tool to see how CPython really works. I'm OK if PEP 511 adds some AST transformation hooks (because AST is a higher-level abstraction). Adding code-object transformation hooks and a library to mutate (or produce new) code objects seems very wrong to me. Yury