[Python-Dev] proto-pep: How to change Python's bytecode
bac at ocf.berkeley.edu
Fri Dec 24 21:43:19 CET 2004
After implementing over 10 new opcodes for my thesis I figured I should write
down the basic steps in an info PEP so that there are enough guidelines between
this PEP and PEP 306 to cover the bases on changes to the language itself.
To go along with this I also plan to write some benchmarks for individual
opcodes that could possibly lead to a testing suite for the opcodes themselves
(will probably do this piecemeal and put it up on SF initially since there are
a lot of opcodes).
Anyway, let me know if I seem to be missing anything or have something to add.
After a reasonable time of non-response to this I will request a PEP number
(assuming people don't think this PEP is stupid).
Title: How to change Python's bytecode
Version: $Revision: 1.4 $
Last-Modified: $Date: 2003/09/22 04:51:50 $
Author: Brett Cannon <brett at python.org>
Python source code is compiled down to something called bytecode. This
bytecode (which can be viewed as sequences of opcodes) defines what Python is
capable of. As such, knowing how to properly add, remove, or change the
bytecode is important when changing the abilities of the Python language.
While changing Python's bytecode is not a frequent occurrence, it still happens.
Having the required steps documented in a single location should make
experimentation with the bytecode easier since it is not necessarily obvious
what the steps are to change the bytecode.
This PEP, paired with PEP 306 [#PEP-306]_, should provide enough basic
guidelines for handling any changes performed to the Python language itself in
terms of syntactic changes that introduce new semantics.
This is a rough checklist of what files need to change and how they are
involved with the bytecode. All paths are given relative to
``/cvsroot/python/dist/src`` in CVS. This list should not be considered
exhaustive, nor does it cover all possible situations.
This include file lists all known opcodes and associates each opcode
with a unique number. When adding a new opcode it is important to take
note of the ``HAVE_ARGUMENT`` value. This ``#define`` sets the
threshold: every opcode whose value is greater than or equal to
``HAVE_ARGUMENT`` is expected to take an argument.
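As a rough illustration of the ``HAVE_ARGUMENT`` threshold, the sketch below
uses the standard library's ``opcode`` module, which exposes the same constant
to Python code. The helper name ``takes_argument`` is made up for this example,
and the comparison assumes the running interpreter follows the threshold
convention (recent CPython releases add pseudo-instructions for which it does
not hold).

```python
import opcode


def takes_argument(op):
    """Return True if the numeric opcode `op` is at or above HAVE_ARGUMENT."""
    # Hypothetical helper: mirrors the threshold comparison described above.
    return op >= opcode.HAVE_ARGUMENT


# Show the threshold used by the running interpreter.
print("HAVE_ARGUMENT =", opcode.HAVE_ARGUMENT)

# Split every named opcode known to this interpreter by the threshold.
with_arg = sorted(n for n, v in opcode.opmap.items() if takes_argument(v))
without_arg = sorted(n for n, v in opcode.opmap.items() if not takes_argument(v))
print(len(with_arg), "opcodes at or above the threshold")
print(len(without_arg), "opcodes below the threshold")
```

A new opcode that needs an argument would have to be numbered on the correct
side of this threshold.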
Lists all of the opcodes and their associated values. Used by the dis
module [#dis]_ to map bytecode values to their names.
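The mapping between opcode values and names can be inspected from Python. This
is a minimal sketch using the ``opname`` list and ``opmap`` dictionary of the
standard ``opcode`` module; ``name_of`` is a hypothetical helper, and
``LOAD_CONST`` is only an example opcode whose number differs between Python
versions.

```python
import opcode


def name_of(value):
    """Map a numeric opcode value to its name using the opcode module's table."""
    return opcode.opname[value]


# Name -> number, then number -> name again.
load_const = opcode.opmap["LOAD_CONST"]
print(load_const, name_of(load_const))
```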
Contains the main interpreter loop. Code to handle the evaluation of an
opcode goes here.
To make sure an opcode is actually used, this file must be altered.
The emitting of all bytecode occurs here.
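One way to check what the compiler actually emits, sketched here from the
Python side rather than inside the C source, is to compile a snippet with the
built-in ``compile()`` and list its instructions with ``dis.get_instructions()``
(available in Python 3.4 and later). The snippet and the variable names are
illustrative, and the exact opcodes printed depend on the interpreter version.

```python
import dis

# Compile a small module-level statement to a code object.
source = "x = 1 + 2"
code = compile(source, "<example>", "exec")

# Collect the names of the opcodes the compiler emitted for it.
emitted = [ins.opname for ins in dis.get_instructions(code)]
for name in emitted:
    print(name)
```

If a newly added opcode never shows up in such a listing for source that should
trigger it, the compiler is probably not emitting it yet.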
- ``Lib/compiler/pyassem.py``, ``Lib/compiler/pycodegen.py``
The 'compiler' package [#compiler]_ needs to be altered to also reflect
any changes to the bytecode.
The documentation [#dis-docs]_ for the dis module contains a complete
list of all the opcodes.
Defines the magic word (named ``MAGIC``) used in .pyc files to detect if
the bytecode used matches the one used by the version of Python running.
This number needs to be changed to make sure that the running
interpreter does not try to execute bytecode that it does not know about.
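The magic-number check can be approximated from Python. This sketch assumes a
modern Python 3, where the running interpreter's magic number is exposed as
``importlib.util.MAGIC_NUMBER`` and stored at the start of every ``.pyc`` file;
the helper ``pyc_matches_interpreter`` and the file names are hypothetical.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_interpreter(pyc_path):
    """Return True if the file starts with the running interpreter's magic number."""
    with open(pyc_path, "rb") as f:
        header = f.read(len(importlib.util.MAGIC_NUMBER))
    return header == importlib.util.MAGIC_NUMBER


with tempfile.TemporaryDirectory() as tmp:
    # Compile a tiny module with the running interpreter.
    src = os.path.join(tmp, "example.py")
    with open(src, "w") as f:
        f.write("x = 1\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "example.pyc"))
    fresh_ok = pyc_matches_interpreter(pyc)

    # Simulate a file written by an interpreter with a different magic number.
    stale = os.path.join(tmp, "stale.pyc")
    with open(stale, "wb") as f:
        f.write(b"\x00\x00\x00\x00" + b"rest of file")
    stale_ok = pyc_matches_interpreter(stale)

print(fresh_ok, stale_ok)
```

Bumping the magic number after a bytecode change makes previously compiled
files fail this kind of comparison, so they are recompiled instead of executed.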
Suggestions for bytecode development
A few things can be done to make sure that development goes smoothly when
experimenting with Python's bytecode. One is to delete all .py(c|o) files
after each semantic change to Python/compile.c. That way all files will be
recompiled and pick up any bytecode changes.
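Clearing out stale compiled files can be scripted. The following is a minimal
sketch, not part of the Python source tree: ``remove_compiled_files`` is a
hypothetical helper that deletes only ``.pyc`` and ``.pyo`` files under a
directory and leaves source files alone. The demonstration runs against a
temporary directory so that nothing real is removed.

```python
import pathlib
import tempfile


def remove_compiled_files(root):
    """Delete every *.pyc and *.pyo file under `root`; return the removed paths."""
    removed = []
    for pattern in ("*.pyc", "*.pyo"):
        # Materialize the matches first so deletion does not disturb the walk.
        for path in list(pathlib.Path(root).rglob(pattern)):
            path.unlink()
            removed.append(path)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    # Create one source file and two stand-in compiled files.
    for name in ("mod.py", "mod.pyc", "mod.pyo"):
        (base / name).write_text("")
    removed_names = sorted(p.name for p in remove_compiled_files(base))
    remaining_names = sorted(p.name for p in base.iterdir())

print(removed_names, remaining_names)
```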
Make sure to run the entire test suite [#test-suite]_. Since the
``regrtest.py`` driver recompiles all source code before a test is run, it acts
as a good test to make sure that no existing semantics are broken.
Running parrotbench [#parrotbench]_ is also a good way to make sure existing
semantics are not broken; this benchmark is practically a compliance test.
.. [#PEP-306] PEP 306, How to Change Python's Grammar, Hudson
.. [#dis] XXX
.. [#test-suite] XXX
.. [#parrotbench] XXX
.. [#dis-docs] XXX
This document has been placed in the public domain.