PEP 511: API for code transformers

Hi,

This PEP 511 is part of a series of 3 PEPs (509, 510, 511) adding an API to
implement a static Python optimizer specializing functions with guards. If
the PEP is accepted, it will solve a long list of issues, some of them old,
like #1346238 which is 11 years old ;-) I found 12 issues:

* http://bugs.python.org/issue1346238
* http://bugs.python.org/issue2181
* http://bugs.python.org/issue2499
* http://bugs.python.org/issue2506
* http://bugs.python.org/issue4264
* http://bugs.python.org/issue7682
* http://bugs.python.org/issue10399
* http://bugs.python.org/issue11549
* http://bugs.python.org/issue17068
* http://bugs.python.org/issue17430
* http://bugs.python.org/issue17515
* http://bugs.python.org/issue26107

I worked to make the PEP more generic than "this hook is written for FAT
Python". Please read the full PEP to see a long list of existing usages of
code transformers in Python.

You may read again the discussion which occurred 4 years ago about the
same topic:
https://mail.python.org/pipermail/python-dev/2012-August/121286.html
(the thread starts with an idea of an AST optimizer, but it moves quickly
to a generic API to transform the code)

Thanks to Red Hat for giving me time to experiment on this.

Victor

HTML version:
https://www.python.org/dev/peps/pep-0510/#changes


PEP: 511
Title: API for code transformers
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6

Abstract
========

Propose an API to register bytecode and AST transformers. Also add a
``-o OPTIM_TAG`` command line option to change ``.pyc`` filenames;
``-o noopt`` disables the peephole optimizer. Raise an ``ImportError``
exception on import if the ``.pyc`` file is missing and the code
transformers required to transform the code are missing. Code
transformers are not needed to execute code transformed ahead of time
(loaded from ``.pyc`` files).

Rationale
=========

Python does not provide a standard way to transform the code. Projects
transforming the code use various hooks. The MacroPy project uses an
import hook: it adds its own module finder in ``sys.meta_path`` to hook
its AST transformer. Another option is to monkey-patch the builtin
``compile()`` function. There are even more options to hook a code
transformer.

Python 3.4 added a ``compile_source()`` method to
``importlib.abc.SourceLoader``. But code transformation is wider than
just importing modules, see the use cases described below.

Writing an optimizer or a preprocessor is out of the scope of this PEP.

Usage 1: AST optimizer
----------------------

Transforming an Abstract Syntax Tree (AST) is a convenient way to
implement an optimizer. It's easier to work on the AST than on the
bytecode: the AST contains more information and is higher level. Since
the optimization can be done ahead of time, complex but slow
optimizations can be implemented.

Examples of optimizations which can be implemented with an AST
optimizer:

* `Copy propagation <https://en.wikipedia.org/wiki/Copy_propagation>`_:
  replace ``x=1; y=x`` with ``x=1; y=1``
* `Constant folding <https://en.wikipedia.org/wiki/Constant_folding>`_:
  replace ``1+1`` with ``2``
* `Dead code elimination
  <https://en.wikipedia.org/wiki/Dead_code_elimination>`_

Using guards (see `PEP 510
<https://www.python.org/dev/peps/pep-0510/>`_), it is possible to
implement a much wider choice of optimizations. Examples:

* Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used
  as iterable
* `Loop unrolling <https://en.wikipedia.org/wiki/Loop_unrolling>`_
* Call pure builtins: replace ``len("abc")`` with ``3``
* Copy used builtin symbols to constants
* See also `optimizations implemented in fatoptimizer
  <https://fatoptimizer.readthedocs.org/en/latest/optimizations.html>`_,
  a static optimizer for Python 3.6.
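To give a taste of the approach, here is a minimal sketch (for
illustration only, it is not part of this PEP) which only folds the
addition of two number literals, written with the standard ``ast``
module::

    import ast

    class ConstantFolder(ast.NodeTransformer):
        """Sketch: replace '1 + 1' with '2' in an AST."""

        def visit_BinOp(self, node):
            # fold children first, so nested additions reduce fully
            self.generic_visit(node)
            if (isinstance(node.op, ast.Add)
                    and isinstance(node.left, ast.Num)
                    and isinstance(node.right, ast.Num)):
                new = ast.Num(n=node.left.n + node.right.n)
                return ast.copy_location(new, node)
            return node

    tree = ast.parse("x = 1 + 1 + 1")
    tree = ast.fix_missing_locations(ConstantFolder().visit(tree))
    exec(compile(tree, "<example>", "exec"))  # binds x = 3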
The following issues can be implemented with an AST optimizer:

* `Issue #1346238 <https://bugs.python.org/issue1346238>`_: A constant
  folding optimization pass for the AST
* `Issue #2181 <http://bugs.python.org/issue2181>`_: optimize out local
  variables at end of function
* `Issue #2499 <http://bugs.python.org/issue2499>`_: Fold unary + and
  not on constants
* `Issue #4264 <http://bugs.python.org/issue4264>`_: Patch: optimize
  code to use LIST_APPEND instead of calling list.append
* `Issue #7682 <http://bugs.python.org/issue7682>`_: Optimisation of if
  with constant expression
* `Issue #10399 <https://bugs.python.org/issue10399>`_: AST
  Optimization: inlining of function calls
* `Issue #11549 <http://bugs.python.org/issue11549>`_: Build-out an AST
  optimizer, moving some functionality out of the peephole optimizer
* `Issue #17068 <http://bugs.python.org/issue17068>`_: peephole
  optimization for constant strings
* `Issue #17430 <http://bugs.python.org/issue17430>`_: missed peephole
  optimization

Usage 2: Preprocessor
---------------------

A preprocessor can be easily implemented with an AST transformer. A
preprocessor has many different usages. Some examples:

* Remove debug code like assertions and logs to make the code faster
  for production.
* `Tail-call Optimization <https://en.wikipedia.org/wiki/Tail_call>`_
* Add profiling code
* `Lazy evaluation <https://en.wikipedia.org/wiki/Lazy_evaluation>`_:
  see `lazy_python <https://github.com/llllllllll/lazy_python>`_
  (bytecode transformer) and `lazy macro of MacroPy
  <https://github.com/lihaoyi/macropy#lazy>`_ (AST transformer)
* Change dictionary literals into ``collections.OrderedDict`` instances
* Declare constants: see `@asconstants of codetransformer
  <https://pypi.python.org/pypi/codetransformer>`_
* Domain Specific Languages (DSL) like SQL queries. The Python language
  itself doesn't need to be modified. Previous attempts to implement a
  DSL for SQL, like `PEP 335 - Overloadable Boolean Operators
  <https://www.python.org/dev/peps/pep-0335/>`_, were rejected.
* Pattern Matching of functional languages
* String Interpolation, but `PEP 498 -- Literal String Interpolation
  <https://www.python.org/dev/peps/pep-0498/>`_ was merged into Python
  3.6.

`MacroPy <https://github.com/lihaoyi/macropy>`_ has a long list of
examples and use cases.

This PEP does not add any new code transformer. Using a code
transformer will require an external module and registering it
manually.

See also `PyXfuscator <https://bitbucket.org/namn/pyxfuscator>`_:
Python obfuscator, deobfuscator, and user-assisted decompiler.
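For illustration (again, this sketch is not part of the PEP), the first
preprocessor example above, removing assertions, fits in a one-method
AST transformer; a real preprocessor would also have to handle blocks
left empty by the removal::

    import ast

    class StripAsserts(ast.NodeTransformer):
        """Sketch: drop ``assert`` statements for a production build."""

        def visit_Assert(self, node):
            return None  # returning None removes the statement

    tree = ast.parse("assert x > 0\nprint('ok')")
    tree = StripAsserts().visit(tree)
    code = compile(tree, "<example>", "exec")  # only print('ok') remains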
Usage 3: Disable all optimization
---------------------------------

Ned Batchelder asked to add an option to disable the peephole optimizer
because it makes code coverage more difficult to implement. See the
discussion on the python-ideas mailing list: `Disable all peephole
optimizations
<https://mail.python.org/pipermail/python-ideas/2014-May/027893.html>`_.

This PEP adds a new ``-o noopt`` command line option to disable the
peephole optimizer. In Python, it's as easy as::

    sys.set_code_transformers([])

It will fix `Issue #2506 <https://bugs.python.org/issue2506>`_: Add
mechanism to disable optimizations.

Usage 4: Write new bytecode optimizers in Python
------------------------------------------------

Python 3.6 optimizes the code using a peephole optimizer. By
definition, a peephole optimizer has a narrow view of the code and so
can only implement basic optimizations. The optimizer rewrites the
bytecode. It is difficult to enhance it, because it is written in C.

With this PEP, it becomes possible to implement a new bytecode
optimizer in pure Python and experiment with new optimizations.

Some optimizations are easier to implement on the AST, like constant
folding, but optimizations on the bytecode are still useful. For
example, when the AST is compiled to bytecode, useless jumps can be
emitted because the compiler is naive and does not try to optimize
anything.

Use Cases
=========

This section gives examples of use cases explaining when and how code
transformers will be used.

Interactive interpreter
-----------------------

It will be possible to use code transformers with the interactive
interpreter, which is popular in Python and commonly used to
demonstrate Python.

The code is transformed at runtime and so the interpreter can be slower
when expensive code transformers are used.

Build a transformed package
---------------------------

It will be possible to build a package of the transformed code.

A transformer can have a configuration. The configuration is not stored
in the package.

All ``.pyc`` files of the package must be transformed with the same
code transformers and the same transformer configuration. It is
possible to build different ``.pyc`` files using different optimizer
tags. Example: ``fat`` for the default configuration and ``fat_inline``
for a different configuration with function inlining enabled.

A package can contain ``.pyc`` files with different optimizer tags.

Install a package containing transformed .pyc files
---------------------------------------------------

It will be possible to install a package which contains transformed
``.pyc`` files.

All ``.pyc`` files with any optimizer tag contained in the package are
installed, not only for the current optimizer tag.

Build .pyc files when installing a package
------------------------------------------

If a package does not contain any ``.pyc`` files of the current
optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` files
are created during the installation.

Code transformers of the optimizer tag are required. Otherwise, the
installation fails with an error.

Execute transformed code
------------------------

It will be possible to execute transformed code.

Raise an ``ImportError`` exception on import if the ``.pyc`` file of
the current optimizer tag is missing and the code transformers required
to transform the code are missing.

The interesting point here is that code transformers are not needed to
execute the transformed code if all required ``.pyc`` files are already
available.

Code transformer API
====================

A code transformer is a class with ``ast_transformer()`` and/or
``code_transformer()`` methods (API described below) and a ``name``
attribute. For efficiency, do not define a ``code_transformer()`` or
``ast_transformer()`` method if it does nothing.

The ``name`` attribute (``str``) must be a short string used to
identify an optimizer. It is used to build a ``.pyc`` filename. The
name must not contain dots (``'.'``), dashes (``'-'``) or directory
separators: dots are used to separate fields in a ``.pyc`` filename and
dashes are used to join code transformer names to build the optimizer
tag.
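For illustration (not normative; complete examples are given later in
this PEP), the smallest useful transformer is a named class with one of
the two methods::

    import sys

    class MyTransformer:
        # used to build .pyc filenames and the optimizer tag
        name = "mytransformer"

        def ast_transformer(self, tree, context):
            # modify tree in place, or build and return a new tree
            return tree

    sys.set_code_transformers([MyTransformer()])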
.. note::
   It would be nice to pass the fully qualified name of a module in the
   *context* when an AST transformer is used to transform a module on
   import, but it looks like the information is not available in
   ``PyParser_ASTFromStringObject()``.

code_transformer()
------------------

Prototype::

    def code_transformer(code, consts, names, lnotab, context):
        ...
        return (code, consts, names, lnotab)

Parameters:

* *code*: the bytecode (``bytes``)
* *consts*: a sequence of constants
* *names*: tuple of variable names
* *lnotab*: table mapping instruction offsets to line numbers
  (``bytes``)
* *context*: an object with a ``filename`` attribute (``str``)

The code transformer is run after the compilation to bytecode.

ast_transformer()
-----------------

Prototype::

    def ast_transformer(tree, context):
        ...
        return tree

Parameters:

* *tree*: an AST tree
* *context*: an object with a ``filename`` attribute (``str``)

It must return an AST tree. It can modify the AST tree in place, or
create a new AST tree.

The AST transformer is called after the creation of the AST by the
parser and before the compilation to bytecode. New attributes may be
added to *context* in the future.

Changes
=======

In short, add:

* ``-o OPTIM_TAG`` command line option
* ``ast.Constant``
* ``ast.PyCF_TRANSFORMED_AST``
* ``sys.get_code_transformers()``
* ``sys.implementation.optim_tag``
* ``sys.set_code_transformers(transformers)``

API to get/set code transformers
--------------------------------

Add new functions to register code transformers:

* ``sys.set_code_transformers(transformers)``: set the list of code
  transformers and update ``sys.implementation.optim_tag``
* ``sys.get_code_transformers()``: get the list of code transformers.

The order of code transformers matters. Running transformer A and then
transformer B can give a different output than running transformer B
and then transformer A.

Example to prepend a new code transformer::

    transformers = sys.get_code_transformers()
    transformers.insert(0, new_cool_transformer)
    sys.set_code_transformers(transformers)

All AST transformers are run sequentially (e.g. the second transformer
gets the output of the first transformer), and then all bytecode
transformers are run sequentially.

Optimizer tag
-------------

Changes:

* Add ``sys.implementation.optim_tag`` (``str``): optimization tag. The
  default optimization tag is ``'opt'``.
* Add a new ``-o OPTIM_TAG`` command line option to set
  ``sys.implementation.optim_tag``.

Changes on ``importlib``:

* ``importlib`` uses ``sys.implementation.optim_tag`` to build the
  ``.pyc`` filename when importing modules, instead of always using
  ``opt``. Also remove the special case for the optimizer level ``0``
  with the default optimizer tag ``'opt'`` to simplify the code.
* When loading a module, if the ``.pyc`` file is missing but the
  ``.py`` is available, the ``.py`` is only used if the registered code
  transformers produce the same optimizer tag as the current tag,
  otherwise an ``ImportError`` exception is raised.

Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
be compiled to import a module::

    def transformers_tag():
        transformers = sys.get_code_transformers()
        if not transformers:
            return 'noopt'
        return '-'.join(transformer.name
                        for transformer in transformers)

    def use_py():
        return (transformers_tag() == sys.implementation.optim_tag)

The order of ``sys.get_code_transformers()`` matters.
For example, the ``fat`` transformer followed by the ``pythran``
transformer gives the optimizer tag ``fat-pythran``.

The behaviour of the ``importlib`` module is unchanged with the default
optimizer tag (``'opt'``).

Peephole optimizer
------------------

By default, ``sys.implementation.optim_tag`` is ``opt`` and
``sys.get_code_transformers()`` returns a list of one code transformer:
the peephole optimizer (optimize the bytecode).

Use ``-o noopt`` to disable the peephole optimizer. In this case, the
optimizer tag is ``noopt`` and no code transformer is registered. Using
the ``-o opt`` option has no effect.

AST enhancements
----------------

Enhancements to simplify the implementation of AST transformers:

* Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
  transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
  transformers.
* Add ``ast.Constant``: this type is not emitted by the compiler, but
  can be used in an AST transformer to simplify the code. It does not
  contain line number and column offset information on tuple or
  frozenset items.
* ``PyCodeObject.co_lnotab``: the line number delta becomes signed to
  support moving instructions (note: this requires modifying
  MAGIC_NUMBER in importlib). Implemented in `issue #26107
  <https://bugs.python.org/issue26107>`_.
* Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
  constants. Currently, ``tuple`` and ``frozenset`` constants are
  created by the peephole transformer, after the bytecode compilation.
* ``marshal`` module: fix the serialization of the empty frozenset
  singleton.
* Update ``Tools/parser/unparse.py`` to support the new
  ``ast.Constant`` node type.

Examples
========

.pyc filenames
--------------

Example of ``.pyc`` filenames of the ``os`` module.

With the default optimizer tag ``'opt'``:

============================  ==================
.pyc filename                 Optimization level
============================  ==================
``os.cpython-36.opt-0.pyc``   0
``os.cpython-36.opt-1.pyc``   1
``os.cpython-36.opt-2.pyc``   2
============================  ==================

With the ``'fat'`` optimizer tag:

============================  ==================
.pyc filename                 Optimization level
============================  ==================
``os.cpython-36.fat-0.pyc``   0
``os.cpython-36.fat-1.pyc``   1
``os.cpython-36.fat-2.pyc``   2
============================  ==================

Bytecode transformer
--------------------

Scary bytecode transformer replacing all strings with
``"Ni! Ni! Ni!"``::

    import sys

    class BytecodeTransformer:
        name = "knights_who_say_ni"

        def code_transformer(self, code, consts, names, lnotab,
                             context):
            consts = ['Ni! Ni! Ni!' if isinstance(const, str) else const
                      for const in consts]
            return (code, consts, names, lnotab)

    # replace existing code transformers with the new bytecode
    # transformer
    sys.set_code_transformers([BytecodeTransformer()])

    # execute code which will be transformed by code_transformer()
    exec("print('Hello World!')")

Output::

    Ni! Ni! Ni!

AST transformer
---------------

Similarly to the bytecode transformer example, the AST transformer also
replaces all strings with ``"Ni! Ni! Ni!"``::

    import ast
    import sys

    class KnightsWhoSayNi(ast.NodeTransformer):
        def visit_Str(self, node):
            node.s = 'Ni! Ni! Ni!'
            return node

    class ASTTransformer:
        name = "knights_who_say_ni"

        def __init__(self):
            self.transformer = KnightsWhoSayNi()

        def ast_transformer(self, tree, context):
            self.transformer.visit(tree)
            return tree

    # replace existing code transformers with the new AST transformer
    sys.set_code_transformers([ASTTransformer()])

    # execute code which will be transformed by ast_transformer()
    exec("print('Hello World!')")

Output::

    Ni! Ni! Ni!

Other Python implementations
============================

PEP 511 should be implemented by all Python implementations, but the
bytecode and the AST are not standardized. Even between minor versions
of CPython, there are changes on the AST API. There are differences,
but only minor differences. It is quite easy to write an AST
transformer which works on Python 2.7 and Python 3.5, for example.

Discussion
==========

* `[Python-Dev] AST optimizer implemented in Python
  <https://mail.python.org/pipermail/python-dev/2012-August/121286.html>`_
  (August 2012)

Prior Art
=========

AST optimizers
--------------

In 2011, Eugene Toder proposed to rewrite some peephole optimizations
in a new AST optimizer: issue #11549, `Build-out an AST optimizer,
moving some functionality out of the peephole optimizer
<https://bugs.python.org/issue11549>`_. The patch adds ``ast.Lit`` (it
was proposed to rename it to ``ast.Literal``).

In 2012, Victor Stinner wrote the `astoptimizer
<https://bitbucket.org/haypo/astoptimizer/>`_ project, an AST optimizer
implementing various optimizations. The most interesting optimizations
break the Python semantics, since no guard is used to disable an
optimization if something changes.

In 2015, Victor Stinner wrote the `fatoptimizer
<http://fatoptimizer.readthedocs.org/>`_ project, an AST optimizer
specializing functions using guards.

Issue #17515, `"Add sys.setasthook() to allow to use a custom AST"
optimizer <https://bugs.python.org/issue17515>`_, was a first attempt
at an API for code transformers, but specific to the AST.

Python Preprocessors
--------------------

* `MacroPy <https://github.com/lihaoyi/macropy>`_: MacroPy is an
  implementation of Syntactic Macros in the Python Programming
  Language. MacroPy provides a mechanism for user-defined functions
  (macros) to perform transformations on the abstract syntax tree (AST)
  of a Python program at import time.
* `pypreprocessor <https://code.google.com/p/pypreprocessor/>`_:
  C-style preprocessor directives in Python, like ``#define`` and
  ``#ifdef``

Bytecode transformers
---------------------

* `codetransformer <https://pypi.python.org/pypi/codetransformer>`_:
  Bytecode transformers for CPython inspired by the ``ast`` module's
  ``NodeTransformer``.
* `byteplay <http://code.google.com/p/byteplay/>`_: Byteplay lets you
  convert Python code objects into equivalent objects which are easy to
  play with, and lets you convert those objects back into living Python
  code objects. It's useful for applying crazy transformations on
  Python functions, and is also useful in learning Python bytecode
  intricacies. See `byteplay documentation
  <http://wiki.python.org/moin/ByteplayDoc>`_.

See also:

* `BytecodeAssembler <http://pypi.python.org/pypi/BytecodeAssembler>`_

Copyright
=========

This document has been placed in the public domain.

You linked to PEP 510 #changes. I think you wanted
https://www.python.org/dev/peps/pep-0511/

Sent from my iPhone

I have a fully working implementation of the PEP 509, 510 and 511 (all
together). You can install it to play with it if you want ;-)

Get and compile patched (FAT) Python with:
--------------
hg clone http://hg.python.org/sandbox/fatpython
cd fatpython
./configure && make
--------------

Enjoy slow and non-optimized bytecode :-)
-------------
$ ./python -o noopt -c 'import dis; dis.dis(compile("1+1", "test", "exec"))'
  1           0 LOAD_CONST               0 (1)
              3 LOAD_CONST               0 (1)
              6 BINARY_ADD
              7 POP_TOP
              8 LOAD_CONST               1 (None)
             11 RETURN_VALUE
-------------

Ok, now if you want to play with fat & fatoptimizer modules (FAT
Python):
--------------
./python -m venv ENV
cd ENV
git clone https://github.com/haypo/fat.git
git clone https://github.com/haypo/fatoptimizer.git
(cd fat; ../bin/python setup.py install)
(cd fatoptimizer; ../bin/python setup.py install)
cd ..
--------------

I'm not using a virtual environment for my development, I prefer to
manually copy the fatoptimizer/fatoptimizer/ directory and the built
.so file of the fat module into the Lib/ directory of the standard
library. If you installed the patched Python into /opt/fatpython
(./configure --prefix=/opt/fatpython && make && sudo make install),
you can also use "python setup.py install" in fat/ and fatoptimizer/
to install them easily.

The drawback of the virtualenv is that it's easy to use the wrong
python (./python vs ENV/bin/python) and not have FAT Python enabled
because of http://bugs.python.org/issue26099 which silently ignores
import errors in sitecustomize...

Ensure that FAT Python is enabled with:
--------
$ ./python -X fat -c 'import sys; print(sys.implementation.optim_tag)'
fat-opt
--------

You must get "fat-opt" (and not "opt").

Note: The optimizer tag is "fat-opt" and not "fat" because fatoptimizer
keeps the peephole optimizer.

Enable FAT Python using the "-X fat" command line option:
--------------
$ ENV/bin/python -X fat
>>> def func(): return len("abc")
...
--------------
Play with microbenchmarks:
---------------
$ ENV/bin/python -m timeit -s 'def f(): return len("abc")' 'f()'
10000000 loops, best of 3: 0.122 usec per loop

$ ENV/bin/python -X fat -m timeit -s 'def f(): return len("abc")' 'f()'
10000000 loops, best of 3: 0.0932 usec per loop
---------------

Oh look! It's faster without having to touch the code ;-)

I'm using Lib/sitecustomize.py to register the optimizer if -X fat is
used:
-------
import sys
if sys._xoptions.get('fat'):
    import fatoptimizer; fatoptimizer._register()
-------

If you want to run optimized code without registering the optimizer, it
doesn't work because .pyc files are missing:
---
$ ENV/bin/python -o fat-opt
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: missing AST transformers for '.../Lib/encodings/__init__.py':
optim_tag='fat', transformers tag='noopt'
---

You have to compile optimized .pyc files:
---
# the optimizer is slow, so add -v to enable fatoptimizer logs for more fun
ENV/bin/python -X fat -v -m compileall

# why does compileall not compile encodings/*.py?
ENV/bin/python -X fat -m py_compile
/home/haypo/prog/python/fatpython/Lib/encodings/{__init__,aliases,latin_1,utf_8}.py
---

Finally, enjoy optimized code with no registered optimizer:
---
# hum, maybe use ENV/bin/activate instead of my magic tricks
$ export PYTHONPATH=ENV/lib/python3.6/site-packages/
$ ENV/bin/python -o fat-opt -c 'import sys; print(sys.implementation.optim_tag, sys.get_code_transformers())'
fat-opt []
---

Remember that you cannot import .py files in this case, only .pyc:
---
$ touch x.py
$ ENV/bin/python -o fat-opt -c 'import x'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: missing AST transformers for '.../x.py': optim_tag='fat-opt',
transformers tag='noopt'
---

Victor

On Fri, 15 Jan 2016 at 08:11 Victor Stinner <victor.stinner@gmail.com> wrote:
[SNIP]
I just wanted to point out to people that the key part of this PEP is
the change in semantics of `-O` accepting an argument. Without this
change there is no way to cause import to pick up on optimized .pyc
files that you want it to use without abusing pre-existing .pyc
filenames.

This also means that everything else is optional. That doesn't mean it
shouldn't be considered, mind you, as it makes using AST and bytecode
transformers more practical. But some `-O` change that allows
user-defined optimization tags is needed for any of this to work
reasonably. From there it's theoretically possible for someone to write
their own compileall that pre-compiles all Python code to .pyc files
with a specific optimization tag which they specify with `-O`, using
their own AST and bytecode transformers, and hence not need the
transformation features built into sys/import.

I should also point out that this does get tricky in terms of how to
handle the stdlib if you have not pre-compiled it, e.g., if the first
module imported by Python is the encodings module, then how do you make
sure the AST optimizers are ready to go by the time that import
happens?

And lastly, Victor proposes that all .pyc files get an optimization
tag. While there is nothing technically wrong with that, PEP 488
<https://www.python.org/dev/peps/pep-0488/> purposefully didn't do that
in the default case for backwards-compatibility, so that will need to
be at least mentioned in the PEP.

2016-01-15 18:22 GMT+01:00 Brett Cannon <brett@python.org>:
I just wanted to point out to people that the key part of this PEP is the change in semantics of `-O` accepting an argument.
To be exact, it's a new "-o arg" option; it's different from -O and -OO
(uppercase). Since I don't know what to do with -O and -OO, I simply
kept them :-D
Since importlib reads sys.implementation.optim_tag at each import, it
works fine. For example, you start with the "opt" optimizer tag. You
import everything needed for fatoptimizer. Then calling
sys.set_code_transformers() will set a new optimizer tag (e.g.
"fat-opt"). But it works since the required code transformers are now
available.

The tricky part is more when you want to deploy an application without
the code transformer: you have to ensure that all .py files are
compiled to .pyc. But there are no technical issues to compile them,
it's more a practical issue. See my second email with a lot of
commands; I showed how .pyc files are created with different .pyc
filenames. Or follow my commands to try my "fatpython" fork and play
with the code yourself ;-)
The PEP already contains:
https://www.python.org/dev/peps/pep-0511/#optimizer-tag

"Also remove the special case for the optimizer level 0 with the
default optimizer tag 'opt' to simplify the code."

Code relying on the exact .pyc filename (like unit tests) already has
to be modified to use the optimizer tag. It's just an opportunity to
simplify the code. I don't really care about this specific change ;-)

Victor

On Fri, 15 Jan 2016 at 09:40 Victor Stinner <victor.stinner@gmail.com> wrote:
I understand all of that; my point is: what if you don't compile the
stdlib for your optimization? You have to import over 20 modules before
user code gets imported. My question is how you expect the situation to
be handled where you didn't optimize the stdlib, since the 'encodings'
module is imported before anything else. If you set your `-o` flag and
you want to fail imports if the .pyc isn't there, then wouldn't that
mean you are going to fail immediately when you try and import
'encodings' in Py_Initialize()?
Right, it's just that the backwards-compatibility issue should be
mentioned there. -Brett

On 17 January 2016 at 04:28, Brett Cannon <brett@python.org> wrote:
I don't think that's a major problem - it seems to me that it's the
same as going for "pyc only" deployment with an embedded Python
interpreter, and then forgetting to include a precompiled standard
library in addition to your own components. Yes, it's going to fail,
but the bug is in the build process for your deployment artifacts
rather than in the runtime behaviour of CPython.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sat, 16 Jan 2016 at 19:38 Nick Coghlan <ncoghlan@gmail.com> wrote:
It is the same, and that's my point. If we are going to enforce this
import requirement of having a matching .pyc file in order to do a
proper import, then we are already requiring an offline compilation,
which makes the dynamic registering of optimizers a lot less necessary.

Now if we tweak the proposed semantics of `-o` to say "import this kind
of optimized .pyc file *if you can*, otherwise don't worry about it",
then having registered optimizers makes more sense, as that gets around
the bootstrap problem with the stdlib. This would require optimizations
to be module-level and not application-level, though. This also makes
the difference between `-o` and `-O` even more pronounced, as the
latter is not only required, but then restricted to only optimizations
which affect what syntax is executed instead of what AST
transformations were applied. This also means that the file name of the
.pyc files should keep `opt-1`, etc. and the AST transformation names
get appended on, as it would stack `-O` and `-o`.

It really depends on what kinds of optimizations we expect people to
do. If we expect application-level optimizations then we need to
enforce universal importing of bytecode because it may make assumptions
about other modules. But if we limit it to module-level optimizations
then it isn't quite so critical that the .pyc files pre-exist, making
it such that `-o` can be more of a request than a requirement for
importing modules of a certain optimization. That also means if the
same AST optimizers are not installed it's no big deal since you just
work with what you have (although you could set it to raise an
ImportWarning when an import didn't find a .pyc file of the requested
optimization *and* the needed AST optimizers weren't available either).

On Jan 15, 2016, at 10:10, Victor Stinner <victor.stinner@gmail.com> wrote:
Some thoughts (and I realize that for many of these the answer will
just be "that's out of scope for this PEP"):

* You can register transformers in any order, and they're run in the
order specified, first all the AST transformers, then all the code
transformers. That's very weird; it seems like it would be conceptually
simpler to have a list of AST transformers, then a separate list of
code transformers.

* Why are transformers objects with ast_transformer and
code_transformer methods, but those methods don't take self? (Are they
automatically static methods, like __new__?) It seems like the only
advantage to require attaching them to a class is to associate each one
with a name; surely there's a simpler way to do that. And is there ever
a good use case for putting both in the same class, given that the code
transformer isn't going to run on the output of the AST transformer but
rather on the output of all subsequent AST transformers and all
preceding code transformers? Why not just let them be functions, and
use the function name (or maybe have a separate attribute to override
that, which a simple decorator can apply)?

* Why does the code transformer only take consts and names? Surely you
need varnames, and many of the other properties of code objects. And
what's the use of lnotab if you can't set the base file and line? In
fact, why not just pass a code object?

* It seems like 99% of all ast_transformer methods are just going to
construct and apply an ast.NodeTransformer subclass. Why not just
register the NodeTransformer subclass?

* The way it's written, it sounds like the main advantage of your
proposal is that it makes it easier to write optimizations that need
guards. But it also makes it easier to write the same kinds of
optimizations that are already possible but a bit painful. It might be
worth rewording a bit to make that clearer.

* There are other reasons to write AST and bytecode transformations
besides optimization. MacroPy, which you mentioned, is an obvious
example. But also, playing with new ideas for Python is a lot easier if
you can do most of it with a simple hook that only makes you deal with
the level you care about, rather than hacking up everything from the
grammar to the interpreter. So, that's an additional benefit you might
want to mention in your proposal.

* In fact, I think this PEP could be useful even if the other two were
rejected, if rewritten a bit.

* It might be useful to have an API that handled bytes and text (and
tokens, but that requires refactoring the token stream API, which is a
separate project) as well as AST and bytecode. For example, some
language extensions add things that can't be parsed as a valid Python
AST. This is particularly an issue when playing with new feature ideas.
In some cases, a simple text preprocessor can convert it into code
which can be compiled into AST nodes that you can then transform the
way you want. At present, with import hooks being the best way to do
any of these, there's no disparity that makes text transforms harder
than AST transforms. But if we're going to have transformer objects
with code_transformer and ast_transformer methods, but a text
preprocessor still requires an import hook, that seems unfortunate. Is
there a reason you can't add text_transformer as well? (And maybe
bytes_transformer. And this would open the door to later add
token_transformer in the same place--and for now, you can call
tokenize, untokenize, and tokenize again inside a text_transformer.)
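For example, a text transformer could already do that token round-trip
today; a rough sketch (the 'foo'-to-'bar' rename is just a placeholder,
and the context parameter mirrors the PEP's transformer API):

    import io
    import tokenize

    def text_transformer(source, context=None):
        result = []
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type == tokenize.NAME and tok.string == 'foo':
                tok = tok._replace(string='bar')
            result.append(tok)
        # untokenize() returns text that compile() accepts, though the
        # exact whitespace is not always preserved
        return tokenize.untokenize(result)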
* I like that I can now compile to PyCF_ONLY_AST or to
PyCF_TRANSFORMED_AST. But can I call compile with an untransformed AST
and the PyCF_TRANSFORMED_AST flag? This would be useful if I had some
things that still worked via import hook--I could choose whether to
hook in before or after the standard/registered set--e.g., if I'm using
a text transformer, or a CPython compiled with a hacked-up grammar that
generates dummy AST nodes for new language productions, I may want to
then transform those to real nodes before the optimizers get to them.
(This would be less necessary if we had text_transformer.)

* It seems like doing any non-trivial bytecode transforms will still
require a third-party library like byteplay (which has trailed 2.6,
2.7, 3.x in general, and each new 3.x version by anywhere from 3 months
to 4 years). Have you considered integrating some of that functionality
into Python itself? Even if that's out of scope, a paragraph explaining
how to use byteplay with a code_transformer, and why it isn't
integrated into the proposal, might be helpful.

* One thing I've always wanted is a way to write decorators that
transform at the AST level. But code objects only have bytecode and
source; you have to manually recompile the source--making sure to use
the same flags, globals, etc.--to get back to the AST. I think that
will become even more of a problem now that you need separate ways to
get the "basic" parse and the "post-all-installed-transformations"
parse. Maybe this would be out of scope for your project, but having
some way to access these rather than rebuild them could be very cool.

Wow, giant emails (as mine, ok). 2016-01-15 20:41 GMT+01:00 Andrew Barnert <abarnert@yahoo.com>:
* You can register transformers in any order, and they're run in the order specified, first all the AST transformers, then all the code transformers. That's very weird; it seems like it would be conceptually simpler to have a list of AST transformers, then a separate list of code transformers.
The goal is to have a short optimizer tag. I'm not sure yet that it makes sense, but I would like to be able to transform AST and bytecode in a single code transformer. I prefer to add a single get/set function to sys, instead of two (4 new functions).
* Why are transformers objects with ast_transformer and code_transformer methods, but those methods don't take self?
They take self, it's just a formatting issue (a mistake in the PEP)
:-) They do take a self parameter, see the examples:
https://www.python.org/dev/peps/pep-0511/#bytecode-transformer

It's just hard to format a PEP correctly when you are used to Sphinx
:-) I started to use ".. method:: ..." but it doesn't work; PEPs use
the simpler reST format ;-)
It seems like the only advantage to require attaching them to a class is to associate each one with a name
I started with a function, but it's a little bit weird to set a name
attribute on a function (func.name = "fat"). Moreover, it's convenient
to store some data in the object. In fatoptimizer, I store the
configuration. Even in the simplest AST transformer example of the PEP,
the constructor creates an object:
https://www.python.org/dev/peps/pep-0511/#id1

It may be possible to use functions, but classes are just more
"natural" in Python.
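A hypothetical sketch of what I mean (the names are made up; the
configuration would be awkward to attach to a plain function):
-------
import sys

class MyOptimizer:
    name = "myopt"

    def __init__(self, config):
        # the object conveniently stores the configuration
        self.config = config

    def ast_transformer(self, tree, context):
        if self.config.get('constant_folding'):
            pass  # ... fold constants in tree ...
        return tree

sys.set_code_transformers([MyOptimizer({'constant_folding': True})])
-------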
And is there ever a good use case for putting both in the same class, given that the code transformer isn't going to run on the output of the AST transformer but rather on the output of all subsequent AST transformers and all preceding code transformers?
The two methods are disconnected, but they are linked by the optimizer
tag. IMHO it makes sense to implement all optimizations (crazy stuff on
the AST, a simple optimizer like the peephole on the bytecode) in a
single code transformer. It avoids using a long optimizer tag like
"fat_ast-fat_bytecode". I also like short filenames.
* Why does the code transformer only take consts and names? Surely you need varnames, and many of the other properties of code objects. And what's the use of lnotab if you can't set the base file and line? In fact, why not just pass a code object?
To be honest, I don't feel comfortable with a function taking 5
parameters which has to return a tuple of 4 items :-/ Especially since
it's only the first version, we may have to add more items.

The code_transformer() API comes from PyCode_Optimize(): the CPython
peephole optimizer.

    PyAPI_FUNC(PyObject*) PyCode_Optimize(PyObject *code,
                                          PyObject* consts,
                                          PyObject *names,
                                          PyObject *lnotab);

The function modifies lnotab in-place and returns the modified code.

Passing a whole code object makes the API much simpler, and code
objects contain all information. I take your suggestion, thanks.
* It seems like 99% of all ast_transformer methods are just going to construct and apply an ast.NodeTransformer subclass. Why not just register the NodeTransformer subclass?
fatoptimizer doesn't use ast.NodeTransformer ;-)

ast.NodeTransformer has a naive and inefficient design. For example,
fatoptimizer uses a metaclass to only create the mapping of visitors
once (the visit_xxx methods). My transformer copies modified nodes to
leave the input tree unchanged. I need this to be able to duplicate a
tree later (to specialize functions).

(Maybe I can propose to enhance ast.NodeTransformer, but that's a
different topic.)
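Roughly the idea, as a simplified sketch (this is not fatoptimizer's
actual code):
-------
class VisitorMeta(type):
    # Build the node-name -> method mapping once, at class creation,
    # instead of one getattr() lookup per visited node.
    # (A real version would also merge visitors from base classes.)
    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, ns)
        cls._visitors = {key[len('visit_'):]: func
                         for key, func in ns.items()
                         if key.startswith('visit_')}
        return cls

class FastVisitor(metaclass=VisitorMeta):
    def visit(self, node):
        visitor = self._visitors.get(type(node).__name__)
        if visitor is not None:
            return visitor(self, node)
        return node
-------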
* There are other reasons to write AST and bytecode transformations besides optimization. MacroPy, which you mentioned, is an obvious example. But also, playing with new ideas for Python is a lot easier if you can do most of it with a simple hook that only makes you deal with the level you care about, rather than hacking up everything from the grammar to the interpreter. So, that's an additional benefit you might want to mention in your proposal.
I wrote "A preprocessor has various and different usages." Maybe I can elaborate :-) It looks like it is possible to "implement" f-string (PEP 498) using macros. I think that it's a good example of experimenting evolutions of the language (without having to modify the C code which is much more complex, Yury Selivanov may want to share his experience here for this async/await PEP ;-)).
* In fact, I think this PEP could be useful even if the other two were rejected, if rewritten a bit.
Yeah, I tried to split the changes to make them independent. Only PEP
509 (dict version) is linked to PEP 510 (func specialize).

Even alone, PEP 509 can be used to implement the "copy globals to
locals/constants" optimization mentioned in the PEP (at least two
developers proposed changes to implement it! it was also in Unladen
Swallow's plans).
I don't know this part of the compiler. Does Python already have an API
to manipulate tokens, etc.? What about other Python implementations?

I proposed AST transformers because they are already commonly used in
the wild. I also proposed bytecode transformers to replace the peephole
optimizer: make it optional and maybe implement a new one (in Python,
to be more easily maintainable?).

The Hy language uses its own parser and emits Python AST. Why not use
this design?
(...) e.g., if I'm using a text transformer, (...)
IMHO you are going too far and it becomes out of the scope of the PEP.

You should also read the previous discussion:
https://mail.python.org/pipermail/python-dev/2012-August/121309.html
* It seems like doing any non-trivial bytecode transforms will still require a third-party library like byteplay (which has trailed 2.6, 2.7, 3.x in general, and each new 3.x version by anywhere from 3 months to 4 years). Have you considered integrating some of that functionality into Python itself?
To be honest, right now, I'm focused on fatoptimizer. I don't want to
integrate it in the stdlib because:

* it's incomplete: see the giant
  https://fatoptimizer.readthedocs.org/en/latest/todo.html list if you
  are bored
* the stdlib is moving... is not really moving... well, the development
  process is way too slow for such a very young project
* fatoptimizer still changes the Python semantics in subtle ways which
  should be tested in large applications and discussed point by point
* etc.

It's way too early to discuss that (at least for fatoptimizer). Since
pip is becoming standard, I don't think that it's a real issue in
practice.
Even if that's out of scope, a paragraph explaining how to use byteplay with a code_transformer, and why it isn't integrated into the proposal, might be helpful.
byteplay doesn't seem to be maintained anymore. Last commit in 2010...

IMHO you can do the same as byteplay on the AST with much simpler code.
I only mentioned some projects modifying bytecode to pick up ideas of
what can be done with a code transformer. I don't think that it's worth
adding more examples than the two "Ni! Ni! Ni!" examples.
* One thing I've always wanted is a way to write decorators that transform at the AST level. But code objects only have bytecode and source;
You should take a look at MacroPy, it looks like it has some crazy
stuff to modify the AST and compile at runtime. I'm not sure, I never
used MacroPy, I only read its documentation to generalize my PEP ;-)

Modifying and recompiling the code at runtime (using the AST, something
higher level than bytecode) sounds like a Lisp feature and like a JIT
compiler, two cool things ;)

Victor

Sent from my iPhone
On Jan 15, 2016, at 15:14, Victor Stinner <victor.stinner@gmail.com> wrote:
Wow, giant emails (as mine, ok).
Well, this is a big idea, so it needs a big breakfast. I mean a big email. :) But fortunately, you had great answers to most of my points, which means I can snip them out of this reply and make it not quite as giant.
But that doesn't work as soon as there are even two of them: the bytecode #0 no longer runs after ast #0, but after ast #1; similarly, bytecode #1 no longer runs after ast #1, but after bytecode #0. So, it seems like whatever benefits you get by keeping them coupled will be illusory.
I prefer to add a single get/set function to sys, instead of two (4 new functions).
That's a good point. (I suppose you could have a pair of get/set functions that each set multiple lists instead of one, but that isn't really any simpler than multiple get/set functions...)
It looks a lot less weird with a decorator `@transform('fat')` that sets it for you.
In general, sure. But for data that isn't accessible from outside, and
only needs to be used in a single call, a simple function (with the
option of wrapping data in a closure) can be simpler. That's why so
many decorators are functions that return a closure, not classes that
build an object with a __call__ method. But more specifically to this
case, after looking over your examples, maybe the class makes sense
here.
Sure. It's just a matter of emphasis, and whether more of it would help sell your idea or not. From the other big reply you got, maybe it would even hurt selling it... So, your call.
I did an experiment last year where I tried to add the same feature two
ways (Haskell-style operator partials, so you can write `(* 2)` instead
of `lambda x: x * 2` or `rpartial(mul, 2)` or whatever). First, I did
all the steps to add it "for real", from the grammar through to the
code generator. Second, I added a quick grammar hack to create a noop
AST node, then did everything else in Python with an import
hook--preprocess the text to get the noop nodes, then preprocess the
AST to turn those into nodes that do the intended semantics. As you
might expect, the second version took a lot less time, required
debugging a lot fewer segfaults, etc., and if your proposal removed the
need for the import hook, it would be even simpler (and cleaner, too).
Well, Python does have an API to manipulate tokens, but it involves
manually tokenizing the text, modifying the token stream, untokenizing
it back to text, and then parsing and compiling the result, which is
far from ideal. (In fact, in some cases you even need to encode back to
bytes.) There's an open enhancement issue to make it easier to write
token processors. But don't worry about that part for now. A text
preprocessor step should be very easy to add, and useful on its own
(and it opens the door for adding a token preprocessor between text and
AST in the future when that becomes feasible).

I also mentioned a bytes preprocessor, which could munge the bytes
before the decoding to text. But that seems a lot less useful. (Maybe
if you needed an alternative to the coding-declaration syntax for some
reason?) I only included it because it's another layer you can hook in
an import hook today, so it seems like if it is left out, that should
be an intentional decision, not just something nobody thought about.
I proposed AST transformers because it's already commonly used in the wild.
Text preprocessors are also used in the wild. IIRC, Guido mentioned
having written one that turns Python 3-style annotations into something
that compiles as legal Python 2.7 (although he later abandoned it,
because it turned out to be too hard to integrate with their other
Python 2 tools). (Token preprocessors are not used much in the wild,
because it's painful to write them, nor are bytes preprocessors,
because they're not that useful.)
The Hy language uses its own parser and emits Python AST. Why not using this design?
By the same token, why not use your own code generator and emit Python bytecode, instead of just preprocessing ASTs? If you're making a radical change, that makes sense. But for most uses, where you only want to make a small change on top of the normal processing, it makes a lot more sense to just hook the normal processing than to completely reproduce everything it does.
Even if that's out of scope, a paragraph explaining how to use byteplay with a code_transformer, and why it isn't integrated into the proposal, might be helpful.
byteplay doesn't seem to be maintained anymore. Last commit in 2010...
There's a byteplay3 fork, which is maintained. But it doesn't support 3.5 yet. (As I mentioned, it's usually a few months to a few years behind each new Python release. Which is one reason integrating parts of it into the core might be nice. The dis module changes in 3.4 were basically integrating part of byteplay, and that part has paid off--the code in dis is automatically up to date with the compiler. There may be more you could do here. But probably it's out of scope for your project.)
IMHO you can do the same as byteplay on the AST with much simpler code.
If that's really true, then you shouldn't include code_transformers in the PEP at all. You're just making things more complicated, in multiple ways, to enable a feature you don't think anyone will ever need. However, based on my own experience, I think code transformers _are_ sometimes useful, but they usually require something like byteplay. Even just something as simple as removing an unnecessary jump instruction requires reordering the arguments of every other jump; something like merging two finally blocks would be a nightmare to do manually.
* One thing I've always wanted is a way to write decorators that transform at the AST level. But code objects only have bytecode and source;
You should take a look at MacroPy,
Yes, I love MacroPy. But it doesn't provide the functionality I'm
asking about here. (It _might_ be possible to write a macro that stores
the AST on each function object; I haven't tried.)

Anyway, the reason I bring it up is that it's trivial to write a
decorator that byteplay-hacks a function after compilation, and not
much harder to write one that text-hacks the source and recompiles it,
but taking the AST and recompiling it is more painful. Since your
proposal is about making similar things easier in other cases, it could
be nice to do that here as well. But, as I said at the top, I realize
some of these ideas are out of scope; some of them are more about
getting a definite "yeah, that might be cool but it's out of scope" as
opposed to not knowing whether it had even been considered.
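(Roughly what I have in mind, as a fragile sketch that re-parses the
source; it needs the source on disk, and stacked decorators are
stripped along with this one:)

    import ast, functools, inspect, textwrap

    def ast_rewrite(transformer):
        def decorator(func):
            source = textwrap.dedent(inspect.getsource(func))
            tree = ast.parse(source)
            # drop the decorator list so it isn't re-applied on exec
            tree.body[0].decorator_list = []
            tree = ast.fix_missing_locations(transformer.visit(tree))
            ns = {}
            exec(compile(tree, inspect.getfile(func), "exec"),
                 func.__globals__, ns)
            return functools.wraps(func)(ns[func.__name__])
        return decorator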
Well, part of the point of Lisp is that there is only one step--effectively, your source bytes are your AST. Python has to decode, tokenize, and parse to get to the AST. But being able to start there instead of repeating that work would give us the best of both worlds (as easy to do stuff as Lisp, but as readable as Python).

Hi Victor, On 2016-01-15 11:10 AM, Victor Stinner wrote:
All your PEPs are very interesting, thanks for your hard work! I'm very happy to see that we're trying to make CPython faster. There are some comments below:
It's important to say that all of those issues (except 2506) are not
bugs, but proposals to implement some nano- and micro-optimizations.
Issue 2506 is about having an option to disable the peephole optimizer,
which is a very narrow subset of what PEP 511 proposes to add.

[..]
I think that most of those examples are rather weak. Things like
tail-call optimization, constants declarations, pattern matching, case
classes (from MacroPy) are nice concepts, but they should be either
directly implemented in the Python language or not used at all (IMHO).
Things like auto-changing dictionary literals to OrderedDict objects or
in-Python DSLs will only help in creating hard-to-maintain code bases.

I say this because I have first-hand experience with decorators that
patch opcodes, and import hooks that rewrite the AST. When you get back
to your code years after it was written, you usually regret doing those
things.

All in all, I think that adding a blessed API for preprocessors
shouldn't be a focus of this PEP. MacroPy works right now with
importlib, and I think it's a good solution for it. I propose to only
expose new APIs on the C level, and explicitly mark them as provisional
and experimental. It should be clear that those APIs are only for
*writing optimizers*, and nothing else.

[off-topic] I do think that having a macro system similar to Rust's
might be a good idea. However, macros in Rust have explicit and
distinct syntax, and they have the necessary level of documentation and
tooling. But this is a separate matter deserving its own PEP ;)

[..]
Would it be possible to (or does it make any sense):

1. Add new APIs for AST transformers (only exposed on the C level!)
2. Remove the peephole optimizer.
3. Re-implement the peephole optimizer using the new APIs in CPython
   (peephole does some very basic optimizations).
4. Implement other basic optimizations (like limited constant folding)
   in CPython.
5. Leave the door open for you and other people to add more AST
   optimizers (so that FAT isn't locked to CPython's slow release
   cycle)?

I also want to say this: I'm -1 on implementing all three PEPs until we
see that FAT is able to give us at least 10% performance improvement on
micro-benchmarks. We still have several months before 3.6beta to see if
that's possible.

Thanks,
Yury

2016-01-15 21:39 GMT+01:00 Yury Selivanov <yselivanov.ml@gmail.com>:
All your PEPs are very interesting, thanks for your hard work! I'm very happy to see that we're trying to make CPython faster.
Thanks.
Hum, let me see.
* http://bugs.python.org/issue1346238
* https://bugs.python.org/issue11549
"A constant folding optimization pass for the AST" & "Build-out an AST optimizer, moving some functionality out of the peephole optimizer" Well, that's a way to start working on larger optimizations. Anyway, the peephole optimizer has many limits. Raymond Hettinger keeps repeating that it was designed to be simple and limited. And each time, suggested to reimplement the peephole optimize in pure Python (as I'm proposing). On AST, we can do much better than just 1+1, even without changing the Python semantics. But I'm ok that speedup are minor on such changes. Without specialization and guards, you are limited.
"optimize out local variables at end of function" Alone, this optimization is not really interesting. But other optimizations can produce inefficient code. Example with loop unrolling: for i in range(2): print(i) is replaced with: i = 0 print(i) i = 1 print(i) with constant propagation, it becomes: i = 0 print(0) i = 1 print(1) at the point, i variable becomes useless and can removed the optimization mentioned in http://bugs.python.org/issue2181 print(0) print(1)
"AST Optimization: inlining of function calls" IMHO this one is really interesting. But again, not alone, but when combined with other optimizations.
At least, it allows experimenting with new things. If a transformer
becomes popular, we can start to discuss integrating it into Python.

About tail recursion, I recall that Guido wrote something about it:
http://neopythonic.blogspot.fr/2009/04/tail-recursion-elimination.html

I found a lot of code transformer projects. I understand that there is
a real need. In a previous job, we used a text preprocessor to remove
all calls to log.debug() to release the code to production. It was in
the embedded world (set top boxes), where performance matters. The
preprocessor was based on long and unreliable regular expressions. I
would prefer to use the AST for that. That's my first item in the list:
"Remove debug code like assertions and logs to make the code faster for
production."
To be honest, I don't plan to use such macros, they look too magic and
change the Python semantics too much. But I don't want to prevent users
from doing cool things in their sandbox. In my experience, Python
developers are good enough to make their own decisions.

When the f-string PEP was discussed, I was strongly opposed to allowing
*any* Python expression in f-strings. But Guido said that the language
designers must not restrict users. Well, something like that; I'm
probably misusing his quote ;-)
Do you mean that we should add the feature but add a warning in the doc
like "don't use it for evil things"? I don't think that we can forbid
users from specific usages of an API. The only strong solution to
ensure that users will not misuse an API is to not add the API (reject
the PEP) :-)

So I chose instead to document different kinds of usage of code
transformers, just to know how they can be used.
Currently, the PEP adds:

* -o OPTIM_TAG command line option
* sys.implementation.optim_tag
* sys.get_code_transformers()
* sys.set_code_transformers(transformers)
* ast.Constant
* ast.PyCF_TRANSFORMED_AST

importlib uses sys.implementation.optim_tag and
sys.get_code_transformers(). *If* we want to remove them, we should
find a way to expose this information to importlib.

I really like ast.Constant, I would like to add it, but it's really a
minor part of the PEP. I don't think that it's controversial.

PyCF_TRANSFORMED_AST can only be exposed at the C level.

The "-o OPTIM_TAG" command line option is a shortcut to set
sys.implementation.optim_tag. optim_tag can be set manually. But the
problem is to be able to set the optim_tag before the first Python
module is imported. It doesn't seem easy to avoid this change.

According to Brett, the whole PEP can be simplified to this single
command line option :-)
I agree that extending the Python syntax is out of the scope of the PEP 511.
FYI my fatoptimizer is quite slow. But it implements a lot of
optimizations, much more than the Python peephole optimizer.

I fear that the conversions are expensive:

* AST (light) internal objects => Python (heavy) AST objects
* (run AST optimizers implemented in Python)
* Python (heavy) AST objects => AST (light) internal objects

So in the near future, I prefer to keep the peephole optimizer
implemented in C.

The performance of the optimizer itself matters when you run a short
script using "python script.py" (without compilation ahead of time).
I prefer not to start benchmarking fatoptimizer yet, because I spent 3 months just designing the API, fixing bugs, etc. I only spent a small fraction of that time on writing optimizations. I expect significant speedups with more optimizations like function inlining. If you are curious, take a look at the todo list: https://fatoptimizer.readthedocs.org/en/latest/todo.html

I understand that an optimizer which does not produce faster code is not really interesting. My PEPs propose many changes which would become part of the public API and have to be maintained later. I already changed PEPs 509 and 510 to make the changes private (only visible in the C API). Victor

On 01/15/2016 05:10 PM, Victor Stinner wrote:
Victor,

Thanks for your efforts on making Python faster!

This PEP addresses two things that would benefit from different approaches: let's call them optimizers and extensions.

Optimizers, such as your FAT, don't change Python semantics. They're designed to run on *all* code, including the standard library. It makes sense to register them as early in interpreter startup as possible, but if they're not registered, nothing breaks (things will just be slower). Experiments with future syntax (like when async/await was being developed) have the same needs.

Syntax extensions, such as MacroPy or Hy, tend to target specific modules, with which they're closely coupled: the modules won't run without the transformer. And with other modules, the transformer either does nothing (as with MacroPy, hopefully), or would fail altogether (as with Hy). So, they would benefit from specific packages opting in. The effects of enabling them globally range from inefficiency (MacroPy) to failures or needing workarounds (Hy).

The PEP is designed for optimizers. It would be good to stick to that use case, at least as far as the registration is concerned. I suggest noting in the documentation that Python semantics *must* be preserved, and renaming the API, e.g.::

    sys.set_global_optimizers([])

The "transformer" API can be used for syntax extensions as well, but the registration needs to be different, so the effects are localized. For example, it could be something like::

    importlib.util.import_with_transformer(
        'mypackage.specialmodule', MyTransformer())

or a special flag in packages::

    __transformers_for_submodules__ = [MyTransformer()]

or extending exec (which you actually might want to add to the PEP, to make giving examples easier)::

    exec("print('Hello World!')", transformers=[MyTransformer()])

or making it easier to write an import hook with them, etc. ... but all that would probably be out of scope for your PEP.

Another thing: this snippet from the PEP sounds too verbose::

    transformers = sys.get_code_transformers()
    transformers.insert(0, new_cool_transformer)
    sys.set_code_transformers(transformers)

Can this just be a list, as with sys.path? Using the "optimizers" term::

    sys.global_optimizers.insert(0, new_cool_transformer)

This::

    def code_transformer(code, consts, names, lnotab, context):

It's a function, so it would be better to name it::

    def transform_code(code):

And this::

    def ast_transformer(tree, context):

might work better with keyword arguments::

    def transform_ast(tree, *, filename, **kwargs):

otherwise people might use context objects with other attributes than "filename", breaking when a future PEP assigns a specific meaning to them. It actually might be good to make the code transformer API extensible as well, and synchronize it with the AST transformer::

    def transform_code(code, *, filename, **kwargs):

On Sat, Jan 16, 2016 at 12:06:58PM +0100, Petr Viktorin wrote:
So, you'd have to supply the transformer used before importing? That seems like a troublesome solution to me. A better approach (to me) would be being able to document what transformers need to be run inside the module itself. Something like:

    #:Transformers modname.TransformerClassName, modname.OtherTransformerClassName

The reason why I would prefer this is that it makes sense to document the transformers needed in the module itself, instead of in the code importing the module. As you suggest (and rightly so), to localize the effects of the registration it makes sense to do the registration in the affected module.

Of course there might be some cases where you want to import a module using a transformer it does not need to know about, but I think that would be less likely than the case where a module knows which transformers should be applied to it. As an added bonus, it would let you apply transformers to the entry point:

    #!/usr/bin/env python
    #:Transformers foo.BarTransformerMyCodeCanNotRunWithout

But as you said, this support is probably outside the scope of the PEP anyway.

Kind regards, Sjoerd Job
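To make the idea concrete, a loader could recover such markers with something like this sketch (the names are invented and there is no error handling):

    import importlib
    import re

    MARKER = re.compile(r'^#:Transformers\s+(.*)$')

    def transformers_from_source(source):
        """Instantiate the transformer classes named in #:Transformers comments."""
        found = []
        for line in source.splitlines():
            match = MARKER.match(line)
            if not match:
                continue
            for dotted in match.group(1).split(','):
                modname, _, clsname = dotted.strip().rpartition('.')
                cls = getattr(importlib.import_module(modname), clsname)
                found.append(cls())
        return found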

I'm a big fan of your motivation to build an optimizer for CPython code. What I'm struggling with is understanding why this requires a PEP and language modification. There are already several projects that manipulate the AST for performance gains, such as [1], or even my own ham-fisted attempt [2]. Would you please elaborate on why these external approaches fail and how language modifications would make your approach successful? [1] https://pypi.python.org/pypi/astoptimizer [2] http://pycc.readthedocs.org/en/latest/ On Sat, Jan 16, 2016, 10:30 Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:

On 17 January 2016 at 02:56, Kevin Conway <kevinjacobconway@gmail.com> wrote:
Existing external optimizers (including Victor's own astoptimizer, the venerable psyco, static compilers like Cython, and dynamic compilers like Numba) make simplifying assumptions that technically break some of Python's expected runtime semantics. They get away with that by relying on the assumption that people will only apply them in situations where the semantic differences don't matter. That's not good enough for optimization passes that are enabled globally: those need to be semantics *preserving*, so they can be applied blindly to any piece of Python code, with the worst possible outcome being "the optimization was automatically bypassed or disabled at runtime due to its prerequisites no longer being met". The PyPy JIT actually works in much the same way; it just does it dynamically at runtime by tracing frequently run execution paths. This is both a strength (it allows even more optimal code generation based on the actual running application) and a weakness (it requires time for the JIT to warm up by identifying critical execution paths, tracing them, and substituting the optimised code). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Hi, 2016-01-16 17:56 GMT+01:00 Kevin Conway <kevinjacobconway@gmail.com>:
Oh cool, I didn't know PyCC [2]! I added it to the "AST optimizers" section of Prior Art in PEP 511. I wrote astoptimizer [1]; that project uses monkey-patching of the compile() function, and I mentioned this monkey-patching hack in the rationale of the PEP: https://www.python.org/dev/peps/pep-0511/#rationale I would like to avoid monkey-patching because it causes various issues. PEP 511 also makes transformations more visible: transformers are explicitly registered with sys.set_code_transformers(), and the .pyc filename is modified when the code is transformed. It also adds a new feature: it becomes possible to run transformed code without having to register the transformer at runtime. This is made possible by the addition of the -o command line option. Victor

I'm willing to take this conversation offline as it seems this thread has cooled down quite a bit. I would still like to hear more, though, about how adding this as a facility in the language improves over the current, external implementations of Python code optimizers. Python already has tools for reading in source files, parsing them into an AST, modifying that AST, and writing the final bytecode to files as part of the standard library. I don't see anything in PEP 511 that improves upon that.

Out of curiosity, do you consider this PEP as adding something to Python that didn't previously exist, or do you consider this PEP to be more aligned with PEP 249 (DB-API 2.0) and PEP 484 (Type Hints), which are primarily designed to marshal the community in a common direction? I understand that you have other PEPs in flight that are designed to make certain optimizations easier (or possible). Looking at this PEP in isolation, however, leaves me wanting more explanation as to its value.

You mention the need for monkey-patching or hooking into the import process as a part of the rationale. The PyCC project, while it may not be the best example for optimizer design, does not need to patch or hook into anything to function. Instead, it acts as an alternative bytecode compiler that drops .pyc files just like the standard compiler would. Other than the trade-off of using a 3rd-party library versus adding a -o flag, what significant advantage does a sys.add_optimizer() call provide?

Again, I'm very much behind your motivation and hope you are incredibly successful in making Python a faster place to live. I'm only trying to get in your head and see what you see. On Wed, Jan 27, 2016 at 10:45 AM Victor Stinner <victor.stinner@gmail.com> wrote:

On Wed, 27 Jan 2016 at 20:57 Kevin Conway <kevinjacobconway@gmail.com> wrote:
The PEP is about empowering people to write AST transformers without having to use third-party tools to integrate it into their workflow. As you pointed out, there is very little here that isn't possible today with some toolchain that reads Python source code, translates it into an AST, optimizes it, and then writes out the .pyc file. But that all does require going to PyPI or writing your own solution. But if Victor's PEP gets in, then there will be a standard hook point that all Python code will go through which will make adding AST transformers much easier. Whether this ease of use is beneficial is part of the discussion around this PEP.
The -o addition is probably the biggest thing the PEP is proposing. Overwriting .pyc files with optimizations that are not necessarily expected is not great, so -o would allow us to stop the abuse of .pyc file naming. The AST registration part is all just to make this stuff easier. -Brett

On Thursday, January 28, 2016 12:44 PM, Brett Cannon <brett@python.org> wrote:
On Wed, 27 Jan 2016 at 20:57 Kevin Conway <kevinjacobconway@gmail.com> wrote:
Out of curiosity, do you consider this PEP as adding something to Python that didn't previously exist, or do you consider this PEP to be more aligned with PEP 249 (DB-API 2.0) and PEP 484 (Type Hints), which are primarily designed to marshal the community in a common direction? I understand that you have other PEPs in flight that are designed to make certain optimizations easier (or possible). Looking at this PEP in isolation, however, leaves me wanting more explanation as to its value.
The PEP is about empowering people to write AST transformers without having to use third-party tools to integrate it into their workflow. As you pointed out, there is very little here that isn't possible today with some toolchain that reads Python source code, translates it into an AST, optimizes it, and then writes out the .pyc file. But that all does require going to PyPI or writing your own solution.
This kind of talk worries me. It's _already_ very easy to write AST transformers. There's no need for any third-party code from PyPI, and that "your own solution" that you have to write is a few lines of trivial code. I think a lot of people don't realize this. Maybe because they tried it in 2.6 or 3.2, where it was a lot harder, or because they read the source to MacroPy (which is compatible with 2.6 and 3.2, or at least originally was), where it looks very hard, or maybe just because they didn't realize how much work has already been put in to make it easy. But whatever the reason, they're wrong. And so they're expecting this PEP to solve a problem that doesn't need to be solved.
But if Victor's PEP gets in, then there will be a standard hook point that all Python code will go through which will make adding AST transformers much easier. Whether this ease of use is beneficial is part of the discussion around this PEP.
There already is a standard hook point that all Python code goes through. Writing an AST transformer is as simple as replacing the code that compiles source to bytecode with a 3-line function that compiles source to AST, calls your transformer, and compiles AST to bytecode. Processing source or bytecode instead of AST is just as easy (actually, one line shorter). Where it gets tricky is all the different variations on what you hook and how. Do you want to intercept all .py files? Or add a new extension, like .hy, instead? Or all source files, but only if they start with a magic marker line? How do you want to integrate with naming, finding, obsoleting, reading, and writing .pyc files? What about -O? And so on. And how do you want to work together with other libraries trying to do the same thing, which may have made slightly different decisions? Once you decide what you want, it's another few lines to write and install the hook that does that--the hard part is deciding what you want. If this PEP can solve the hard part in a general way, so that the right thing to do for different kinds of transformers will be obvious and easy, that would be great. If it can't do so, then it just shouldn't bother with anything that doesn't fit into its model of global semantic-free transformations. And that would also be great--making global semantic-free transformers easy is already a huge boon even if it doesn't do anything else, and keeping the design for that as simple as possible is better than making it more complex to partially solve other things in a way that only helps with the easiest parts.

On Thu, Jan 28, 2016 at 09:13:08PM +0000, Andrew Barnert via Python-ideas wrote:
I don't realise this. Not that I don't believe you, but I'd like to see a tutorial that goes through this step by step and actually explains what this is all about. Or, if it really is just a matter of a few lines, even just a simple example might help. For instance, the PEP includes a transformer that changes all string literals to "Ni! Ni! Ni!". Obviously it doesn't work as sys.set_code_transformers doesn't exist yet, but if I'm understanding you, we don't need that because it's already easy to apply that transformer. Can you show how? Something that works today? -- Steve

On Thursday, January 28, 2016 5:07 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I agree, but someone (Brett?) on one of these threads explained that they don't include such a tutorial in the docs because they don't want to encourage people to screw around with import hooks too much, so... Anyway, I wrote a blog post about this last year ( http://stupidpythonideas.blogspot.com/2015/06/hacking-python-without-hacking...), but I'll summarize it here. I'll show the simplest code for hooking in a source, AST, or bytecode transformer, not the most production-ready.
For instance, the PEP includes a transformer that changes all string
Sure. Here's an AST transformer:

    class NiTransformer(ast.NodeTransformer):
        def visit_Str(self, node):
            node.s = 'Ni! Ni! Ni!'
            return node

Here's a complete loader implementation that uses the hook:

    class NiLoader(importlib.machinery.SourceFileLoader):
        def source_to_code(self, data, path, *, _optimize=-1):
            source = importlib._bootstrap.decode_source(data)
            tree = NiTransformer().visit(ast.parse(source, path, 'exec'))
            return compile(tree, path, 'exec')

Now, how do you install the hook? That depends on what exactly you want to do. Let's say you want to make it globally hook all .py files, be transparent to .pyc generation, and ignore -O, and you'd prefer a monkeypatch hack that works on all versions 3.3+, rather than a clean spec-based finder that requires 3.5. Here goes:

    finder = sys.meta_path[-1]
    loader = finder.find_module(__file__)
    loader.source_to_code = NiLoader.source_to_code

Just put all this code in your top-level script, or just put it in a module and import that in your top-level script, either way before importing anything else. (And yes, "before importing anything else" means some bits of the stdlib end up processed and some don't, just as with PEP 511.) You can see it in action at https://github.com/abarnert/nihack

PEP 511 writes the NiLoader part for you, but, as you can see, that's the easiest part of the whole thing. If you want all the exact same choices that the PEP makes (global, .py files only, insert name into .pyc files, integrate with -O and -o, promise to be semantically neutral, etc.), it also makes the last part trivial, which is a much bigger deal. If you want any different choices, it doesn't help with the last part at all. (And I think that's fine, as long as that's the intention. Right now, someone has to have some idea of what they're doing to use my hack, and that's probably a good thing, right? And if I want to clean it up and make it distributable, like MacroPy, I'd better know how to write a spec finder or I have no business distributing any such thing. But if people want to experiment with optimizers that don't actually change the behavior of their code, that's a lot safer, so it seems reasonable that we should focus on making that easier.)

On Thursday, January 28, 2016 7:10 PM, Andrew Barnert <abarnert@yahoo.com> wrote:

<snip>

Immediately after sending that, I realized that Victor's PEP uses a bytecode transform rather than an AST transform. That isn't much harder to do today. Here's a quick, untested version:

    def ni_transform(c):
        consts = []
        for const in c.co_consts:
            if isinstance(const, str):
                consts.append('Ni! Ni! Ni!')
            elif isinstance(const, types.CodeType):
                # recurse into nested code objects (functions, classes, ...)
                consts.append(ni_transform(const))
            else:
                consts.append(const)
        return types.CodeType(
            c.co_argcount, c.co_kwonlyargcount, c.co_nlocals,
            c.co_stacksize, c.co_flags, c.co_code, tuple(consts),
            c.co_names, c.co_varnames, c.co_filename, c.co_name,
            c.co_firstlineno, c.co_lnotab, c.co_freevars, c.co_cellvars)

    class NiLoader(importlib.machinery.SourceFileLoader):
        def source_to_code(self, data, path, *, _optimize=-1):
            return ni_transform(compile(data, path, 'exec'))

You may still need the decode_source bit, at least on some Python versions; I can't remember. If so, add that one line from the AST version. Installing the hook is the same as for the AST version.

You may notice that I have that horrible 15-argument constructor, and the PEP doesn't. But that's because the PEP is basically cheating with this example. For some reason, it passes 3 of those arguments separately--consts, names, and lnotab. If you modify anything else, you'll need the same horrible constructor. And, in any realistic bytecode transformer, you will need to modify something else. For example, you may want to transform the bytecode.

And meanwhile, once you start actually transforming bytecode, that becomes the hard part, and PEP 511 won't help you there. If you just want to replace every LOAD_GLOBAL with a LOAD_CONST, you can do that in a pretty simple loop with a bit of help from the dis module. But if you want to insert and delete bytecodes like the existing peephole optimizer in C does, then you're also dealing with renumbering jump targets and rebuilding the lnotab and other fun things. And if you start dealing with opcodes that change the stack effect nonlocally, like with and finally handlers, you'd have to be an idiot or a masochist to not reach for a third-party library like byteplay. (I know this because I'm enough of an idiot to have done it once, but not enough of an idiot or a masochist to do it again...)

So, again, PEP 511 isn't helping with the hard part. But, again, I think that may be fine. (Someone who knows how to use byteplay well enough to build a semantically-neutral optimizer function decorator, I'll trust him to be able to turn that into a global optimizer with one line of code. But if he wants to hook things in transparently to .pyc files, or to provide actual language extensions, or something like that, I think it's OK to make him do a bit more work before he can give it to me as production-ready code.)

On 29 January 2016 at 13:30, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
So, again, PEP 511 isn't helping with the hard part. But, again, I think that may be fine. (Someone who knows how to use byteplay well enough to build a semantically-neutral optimizer function decorator, I'll trust him to be able to turn that into a global optimizer with one line of code. But if he wants to hook things in transparently to .pyc files, or to provide actual language extensions, or something like that, I think it's OK to make him do a bit more work before he can give it to me as production-ready code.)
Rather than trying to categorise things as "hard" or "easy", I find it more helpful to categorise them as "inherent complexity" or "incidental complexity". With inherent complexity, you can never eliminate it, only move it around, and perhaps make it easier to hide from people who don't care about the topic (cf. the helper classes in importlib, which hide a lot of the inherent complexity of the import system). With incidental complexity, though, you may be able to find ways to eliminate it entirely.

For a lot of code transformations, determining a suitable scope of application is *inherent* complexity: you need to care about where the transformation is applied, as it actually matters for that particular use case. For semantically significant transforms, scope of application is inherent complexity, as it affects code readability, and may even be an error if applied inappropriately. This is why:

- the finer-grained control offered by decorators is often preferred to metaclasses or import hooks
- custom file extensions or in-file markers are typically used to opt in to import hook processing

In these cases, whether or not the standard library is processed doesn't matter, since it will never use the relevant decorator, file extension or in-file marker. You also don't need to worry about subtle start-up bugs, since if the decorator isn't imported, or the relevant import hook isn't installed appropriately, then the code that depends on that happening simply won't run.

This means the only code transformation cases where determining scope of applicability turns out to be *incidental* complexity are those that are intended to be semantically neutral operations. Maybe you're collecting statistics on opcode frequency, maybe you're actually applying safe optimisations, maybe you're doing something else, but the one thing you're promising is that if the transformation breaks code that works without the transformation applied, then it's a *bug in the transformer*, not in the code being transformed. In these cases, you *do* care about whether or not the standard library is processed, so you want an easy way to say "I want to process *all* the code, wherever it comes from". At the moment, that easy way doesn't exist, so you either give up, or you mess about with the encodings.py hack. PEP 511 erases that piece of incidental complexity and says, "If you want to apply a genuinely global transformation, this is how you do it".

The fact we already have decorators and import hooks is why I think PEP 511 can safely ignore the use cases that those handle. However, I think it *would* make sense to make the creation of a "Code Transformation" HOWTO guide part of the PEP - having a guide means we can clearly present the hierarchy in terms of:

- decorators are strongly encouraged, since the maintainability harm they can do is limited
- for import hooks, the use of custom file extensions and in-file markers is strongly encouraged to limit unintended side effects
- global transformations are incredibly powerful, but also very hard to do well; transform responsibly, or future maintainers will not think well of you :)

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Jan 29, 2016, at 06:10, Nick Coghlan <ncoghlan@gmail.com> wrote:
I think this is the conclusion I was hoping to reach, but wasn't sure how to get there. I'm happy with PEP 511 not trying to serve cases like MacroPy and Hy and the example from the byteplay docs, especially so if ignoring them makes PEP 511 simpler, as long as it can explain why it's ignoring them. And a shorter version of your argument should serve as such an explanation. But the other half of my point was that too many people (even very experienced developers like most of the people on this list) think there's more incidental complexity than there is, and that's also a problem. For example, "I want to write a global processor for local experimentation purposes so I can play with my idea before posting it to Python-ideas" is not a bad desire. And, if people think it's way too hard to do with a quick&dirty import hook, they're naturally going to ask why PEP 511 doesn't help them out by adding a bunch of options to install/run the processors conditionally, handle non-.py files, skip the stdlib, etc. And I think the PEP is better without those options.
I like this idea. Earlier I suggested that the import system documentation should have some simple examples of how to actually use the import system to write transforming hooks. Someone (Brett?) pointed out that it's a dangerous technique, and making it too easy for people to play with it without understanding it may be a bad idea. And they're probably right. A HOWTO is a bit more "out-of-the-way" than library or reference docs, and, more importantly, it also has room to explain when you shouldn't do this or that, and why. I'm not sure it has to be part of the PEP, but I can see the connection. The PEP helps by separating out the most important safe case (semantically-neutral, reflected in .pyc, globally consistent, etc.), but it also makes the question "how do I do something similar to PEP 511 transformers except ___" more likely to come up in the first place, making the HOWTO more important.

Some feedback on: https://www.python.org/dev/peps/pep-0511/#usage-3-disable-all-optimization Where do I put this specific piece of code (sys.set_code_transformers([]))? Best, Sven

2016-01-28 17:50 GMT+01:00 Sven R. Kunze <srkunze@mail.de>:
It's better to use the -o noopt command line option, but if you want to call sys.set_code_transformers() directly, you have to call it before the first import. Example of app.py:
--
import sys
sys.set_code_transformers([])

import module
module.main()
--
Victor

2016-01-28 17:57 GMT+01:00 Sven R. Kunze <srkunze@mail.de>:
I suspected that. So, where is this place of "before the first" import?
I don't understand your question. I guess that your real question is: are stdlib modules loaded with the peephole optimizer enabled or not? If you use -o noopt, you are safe: the peephole optimizer is disabled before the first Python import. If you call sys.set_code_transformers([]) in your code, it's likely that Python already imported 20 or 40 modules during its initialization (especially in the site module). It's up to you to pick the best option; there are different usages for each option. Maybe you just don't care about the stdlib, you only want to debug your application code, so it doesn't matter how the stdlib is optimized? -- Or are you asking me to remove sys.set_code_transformers([]) from the section "Usage 3: Disable all optimization"? I don't understand. Victor

On 28.01.2016 18:03, Victor Stinner wrote:

[It was] proposed to make a difference between:

- local transformation
- global transformation

I can understand the motivation to have the same API for both, but they are inherently different, and it makes talking about it hard (as we can see now). I would like to have this clarified in the PEP (use consistent wording) or even split it up into two different parts of the PEP.

You said I would need to call the function before all imports. Why is that? Can I not call it twice in the same file? Or in a loop? What will happen? Will the file get recompiled each time?

Some people proposed a "from __extensions__ import my_extension", inspired by __future__ imports, i.e. it is forced to be at the top. Why? Because it somehow makes sense to perform all transformations the first time a file is loaded. I don't see that addressed in the PEP. I have to admit I would prefer this kind of usage over a function call.

Furthermore:

- we already have import hooks. They can be used for local transformation. I don't see that addressed in the PEP.
- after re-reading the PEP, I have some difficulty seeing how to activate, say, 2 custom transformers **globally** (via -o). Maybe adding an example would help here.

2016-01-29 1:27 GMT+01:00 Greg Ewing <greg.ewing@canterbury.ac.nz>:

A local transformation requires registering a global code transformer, but that doesn't mean that all files will be modified. The code transformer can use various kinds of checks to decide whether a file must be transformed, and then which parts of the code should be transformed. Decorators were suggested as a good granularity.

Victor
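A hedged sketch of what such a decorator-based check could look like; the marker name is invented for illustration:

    import ast

    class OptInTransformer(ast.NodeTransformer):
        """Only transform functions marked with an opt-in decorator (sketch)."""
        marker = 'optimize_me'  # hypothetical decorator name

        def visit_FunctionDef(self, node):
            self.generic_visit(node)  # recurse into nested functions first
            if any(isinstance(d, ast.Name) and d.id == self.marker
                   for d in node.decorator_list):
                return self._transform(node)
            return node  # not marked: leave untouched

        def _transform(self, node):
            # ... the actual optimization would go here ...
            return node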

On 29.01.2016 01:57, Victor Stinner wrote:
A local transformation requires to register a global code transformer, but it doesn't mean that all files will be modified.
I think you should differentiate between "register" and "use". "Register" basically means "provide but don't use"; "use" basically means "apply the transformation". (The same is already true for codecs.) The PEP's set_code_transformers() seems not to make that distinction.
As others pointed out, implicit transformations are not desirable. So, why would a transformer need to check if a file must be transformed? Either the author of a file explicitly wants the transformer or not. Same goes for the global option. Either it is there or it isn't. Btw. I would really appreciate a reply to my prior post. ;) Best, Sven

Sven R. Kunze wrote:
To elaborate on that a bit, something like an __extensions__ magic import could first be prototyped as a global transformer. If the idea caught on, that transformer could be made "official", meaning it was incorporated into the stdlib and applied by default. -- Greg

On Sat, Jan 16, 2016 at 12:22 PM, Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:
+1 for this (but see below). This is the approach I used when playing with import hooks as shown in http://aroberge.blogspot.ca/2015/10/from-experimental-import-somethingnew_14... and a few other posts I wrote about similar transformations.
While I would like to see some standard way to apply code transformations, I agree that this is likely (and unfortunately) outside the scope of this PEP. André Roberge

On 17 January 2016 at 02:22, Sjoerd Job Postmus <sjoerdjob@sjec.nl> wrote:
I think Sjoerd's confusion here is a strong argument in favour of clearly and permanently distinguishing semantics-preserving code optimizers (which can be sensibly applied externally and/or globally) from semantically significant code transformers (which also need to be taken into account when *reading* the code, and hence should be visible locally, at least at the module level, and often at the function level).

Making that distinction means we can be clear that the transformation case is already well served by import hooks that process alternate filename extensions rather than standard Python source or bytecode files, encoding cookie tricks (which are visible as a comment in the module header), and function decorators that alter the semantics of the functions they're applied to.

The case which *isn't* currently well served is transparently applying a semantics-preserving code optimiser like FAT Python - that's a decision for the person *running* the code, rather than the person writing it, so this PEP is about providing the hooks at the interpreter level to let them do that.

While we can't *prevent* people from using these new hooks with semantically significant transformers, we *can* make it clear that we think actually doing so is a bad idea, as it is likely to result in a tightly coupled, hard to maintain code base that can't even be read reliably without understanding the transforms that are being implicitly applied.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Jan 16, 2016, at 19:49, Nick Coghlan <ncoghlan@gmail.com> wrote:
I think something that isn't made clear in the rationale is why an import hook is good enough for most semantic extensions, but isn't good enough for global optimizers. After all, it's not that hard to write a module that installs an import hook for normal .py files instead of .hy or .pyq or whatever files. Then, to optimize your own code, or a third-party library, you just import the optimizer module first; to optimize an application, you write a 2- or 3-line wrapper (which can be trivially automated a la setuptools entry point scripts) to import the optimizer and then start the app.

There are good reasons that isn't sufficient. For example, parts of the stdlib have already been imported before the top of the main module. While there are ways around that (I believe FAT comes with a script to recompile the stdlib into a venv or something?), they're clumsy and ad hoc, and it's unlikely two different optimizers would play nicely together. Also, making it work in a sensible way with .pyc files takes a decent amount of code, and will again be an ad-hoc solution that won't play well with other projects doing similar things. And there are people who write and execute long-running, optimization-ripe bits of code in the REPL (or at least in an IPython notebook), and that can't be handled with an import hook. Nor can code that extensively uses exec. And probably other reasons I haven't thought of. Maybe the PEP should explain those reasons, so it's clear why this feature will help projects like FAT.

Then again, some of those same reasons seem to apply equally well to semantic extensions. Two extensions are no more likely to play together as import hooks than two optimizers, and yet in many cases there's no syntactic or semantic reason they couldn't. Extensions are probably even more useful than optimizations at the REPL. And so on. And this is all even more true for extensions that people write to explore a new feature idea than for things people want to publish as deployable code.

So, I'm still not convinced that the distinction really is critical here. I definitely don't see why it's a negative that the PEP can serve both purposes, even if people only want one of them served (normally, Python doesn't go out of its way to prevent writing certain kinds of code, it just becomes accepted that such code is not idiomatic; only when there's a real danger of attractive nuisance is the language modified to ban it), and I think it's potentially a positive.

On 17 January 2016 at 14:28, Andrew Barnert <abarnert@yahoo.com> wrote:
So, I'm still not convinced that the distinction really is critical here. I definitely don't see why it's a negative that the PEP can serve both purposes,
The main problem with globally enabled transformations of any kind is that action at a distance in software design is generally a *bad thing*. Python's tolerant of it because sometimes it's a *necessary* thing that actually makes code more maintainable - using monkeypatching for use cases like testing and monitoring means those cases can be ignored when reading and writing the code, using metaclasses lets you enlist the interpreter in defining "class-like" objects that differ in some specific way from normal ones (e.g. ORMs, ABCs, enums), using codecs lets you more easily provide configurable encoding and decoding behaviour, etc. While relying too heavily on those kinds of features can significantly harm debuggability, the pay-off in readability is worth it often enough for them to be officially supported language and runtime features. The kind of code transformation hooks that Victor is talking about here are the ultimate in action at a distance - if it wants to, an "optimizer" can completely throw away your code and substitute its own. Import hooks do indeed give you a comparable level of power (at least if you go so far as to write your own meta_path hook), but also still miss the code that Python runs without importing it (__main__, exec, eval, runpy, etc).
even if people only want one of them served (normally, Python doesn't go out of its way to prevent writing certain kinds of code, it just becomes accepted that such code is not idiomatic; only when there's a real danger of attractive nuisance is the language modified to ban it), and I think it's potentially a positive.
That's all I'm suggesting - I think the proposed hooks should be designed for globally enabled optimizations (and named accordingly), but I don't think we should erect any specific barriers against using them for other things. Designing them that way will provide a healthy nudge towards the primary intended use case (transparently enabling semantically compatible code optimizations), while still providing a new transformation technique to projects like MacroPy. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Jan 16, 2016, at 22:06, Nick Coghlan <ncoghlan@gmail.com> wrote:
...
OK, then I agree 100% on this part. But on the main point, I still think it's important for the PEP to explain why import hooks aren't good enough for semantically-neutral global optimizations. As I said, I can think of multiple answers (top-level code, interaction with .pyc files, etc.), but as long as the PEP doesn't give those answers, people are going to keep asking (even years from now, when people want to know why TOOWTDI didn't apply here).

Concerning ways to allow a module to opt in to transformations that change semantics, my first thought was to use an import from a magic module:

    from __extensions__ import modulename

This would have to appear before any other statements or non-magic imports, like __future__ does. The named module would be imported at compile time and some suitable convention used to extract transformers from it. The problem is that if your extension is in a package, you want to be able to write:

    from __extensions__ import packagename.modulename

which is not valid syntax. So instead of a magic module, maybe a magic namespace package:

    import __extensions__.packagename.modulename

-- Greg
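Detecting that magic import ahead of compilation would be simple on the AST; a hypothetical sketch:

    import ast

    def extension_names(tree):
        """Return the dotted names imported via ``__extensions__`` (sketch)."""
        prefix = '__extensions__.'
        names = []
        for stmt in tree.body:
            if isinstance(stmt, ast.Import):
                for alias in stmt.names:
                    if alias.name.startswith(prefix):
                        names.append(alias.name[len(prefix):])
        return names

    tree = ast.parse("import __extensions__.packagename.modulename\nx = 1\n")
    print(extension_names(tree))  # ['packagename.modulename']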

2016-01-16 12:06 GMT+01:00 Petr Viktorin <encukou@gmail.com>:
To be clear, Hylang will not benefit from my PEP; that's why it is not mentioned in the PEP. "Syntax extensions" only look like a special case of optimizers. I'm not sure that it's worth making them really different.
I would prefer to not restrict the PEP to a specific usage.
Brett may help on this part. I don't think that it's the best way to use importlib. importlib is already pluggable. As I wrote in the PEP, MacroPy uses an import hook. (Maybe it should continue to use an import hook?)
or a special flag in packages::
__transformers_for_submodules__ = [MyTransformer()]
Does it mean that you have to parse a .py file just to decide how to transform it? That would slow down the compilation of code not using transformers. I would prefer to do it differently: always register transformers very early, but configure each transformer to only apply itself to some files. The transformer can use the filename (the file extension? importlib is currently restricted to .py files by default, no?), a special variable in the file (e.g. fatoptimizer searches for a __fatoptimizer__ variable which is used to configure the optimizer), a configuration loaded when the transformer is created, etc.
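For example, a per-file check like fatoptimizer's could be approximated by scanning the module AST for a top-level __fatoptimizer__ assignment before doing any real work; a sketch, the real project's logic is more involved:

    import ast

    def find_config(tree, name='__fatoptimizer__'):
        """Return the literal config assigned to *name*, if any (sketch)."""
        for stmt in tree.body:
            if (isinstance(stmt, ast.Assign)
                    and len(stmt.targets) == 1
                    and isinstance(stmt.targets[0], ast.Name)
                    and stmt.targets[0].id == name):
                try:
                    return ast.literal_eval(stmt.value)
                except ValueError:
                    return None  # not a literal: ignore it
        return None

    tree = ast.parse("__fatoptimizer__ = {'enabled': False}\n")
    print(find_config(tree))  # {'enabled': False}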
There are a lot of ways to load, compile and execute code. Starting to add optional parameters would end up like my old PEP 410 (https://www.python.org/dev/peps/pep-0410/), which was rejected because it added an optional parameter to a lot of functions (at least 16 functions!). (It was not the only reason to reject the PEP.) Brett Cannon proposed to add hooks to importlib, but that would restrict the feature to imports. See the use cases in the PEP: I would like to use the same code transformers everywhere.
set_code_transformers() checks the transformer name and ensures that the transformer has at least an AST transformer or a bytecode transformer. That's why it's a function and not a simple list. set_code_transformers() also gets the AST and bytecode transformer methods only once, to provide a simple C structure to PyAST_CompileObject() (bytecode transformers) and PyParser_ASTFromStringObject() (AST transformers). Note: sys.implementation.cache_tag is modifiable without any check; if you mess it up, importlib will probably fail badly. The newly added sys.implementation.optim_tag can also be modified without any check.
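In pure Python, that validation would amount to roughly the following; a sketch of the *proposed* behaviour, not existing code:

    def check_transformer(transformer):
        """Validate a transformer as set_code_transformers() would (sketch)."""
        name = getattr(transformer, 'name', None)
        if not isinstance(name, str) or not name.isidentifier():
            raise ValueError("transformer name must be a valid identifier")
        if (not hasattr(transformer, 'ast_transformer')
                and not hasattr(transformer, 'code_transformer')):
            raise ValueError("transformer must define ast_transformer() "
                             "and/or code_transformer()")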
Fair enough :-) But I want the context parameter to pass additional information. Note: if we pass a code object, the filename is already in the code object, but there is other information (see below).
The idea of a context object is to be "future-proof": future versions of Python can add new attributes without having to modify all code transformers (or, even worse, having to use a kind of "#ifdef" in the code depending on the Python version).
**kwargs and a context object are basically the same, but I prefer a single parameter rather than an ugly **kwargs. IMHO "**kwargs" cannot be called an API. By the way, I lately added the bytecode transformers to the PEP. In fact, we can already add more information to the context:

* compiler_flags: flags like ...
* optimization_level (int): 0, 1 or 2 depending on the -O and -OO command line options
* interactive (boolean): True if interactive mode
* etc.

=> see the compiler structure in Python/compile.c.

We will have to check that these attributes make sense for other Python implementations, or make it clear in the PEP that, as with sys.implementation, each Python implementation can add specific attributes, and only a few of them are always available. Victor

On 17 January 2016 at 21:48, Victor Stinner <victor.stinner@gmail.com> wrote:
The problem I see with making the documentation and naming too generic is that people won't know what the feature is useful for - a generic term like "transformer" accurately describes these units of code, but provides no hint as to why a developer might care about their existence. However, if the reason we're adding the capability is to make global static optimizers feasible, then we can describe it accordingly (so the answer to "Why does this feature exist?" becomes relatively self-evident), and have the fact that the feature can actually be used for arbitrary transforms be an added bonus rather than the core intent.

Alternatively, we could follow the example of the atexit module, and provide these hook registration capabilities through a new "atcompile" module rather than through the sys module. Doing that would also provide a namespace for doing things like allowing runtime caching of compiled code objects - if there's no caching mechanism, then optimising code compiled at runtime (rather than loading pre-optimised code from bytecode files) could easily turn into a pessimisation if the optimiser takes more time to run than is gained back in a single execution of the optimised code relative to the unoptimised code.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 17 January 2016 at 21:48, Victor Stinner <victor.stinner@gmail.com> wrote:
I would prefer to not restrict the PEP to a specific usage.
On Mon, Jan 18, 2016, at 11:45, Yury Selivanov wrote:
+1.
I think that it depends on how it's implemented. Having a _requirement_ that semantics _must_ be preserved suggests that they may not always be applied, or may not be applied in a deterministic order.

[..]
It just won't be possible to enforce that "requirement". What Nick suggests (and I suggested in my email earlier in this thread) is that we should name the APIs clearly to avoid any confusion. `sys.set_code_transformers` is less clear about what it should be used for than `sys.set_code_optimizers`. Yury

On Mon, Jan 18, 2016, at 12:04, Yury Selivanov wrote:
I'm not talking about mechanically enforcing it. I'm talking about it being a documented requirement to write such code, and that people *should not* use this feature for things that need to be applied 100% of the time for their applications to work. Either we have to nail down exactly when and how these things are invoked so that people can rely on them, or they are only _useful_ for optimizations (and other semantic-preserving things like instrumentation) rather than arbitrary transformations.

On Jan 17, 2016, at 03:48, Victor Stinner <victor.stinner@gmail.com> wrote:
At that point, you're exactly duplicating what can be done with import hooks. I think this is part of the reason Nick suggested the PEP should largely ignore the issue of syntax extensions and experiments: because then you don't have to solve Petr's problem. Globally-applicable optimizers are either on or off globally, so the only API you need to control them is a simple global list. The fact that this same API works for some uses of extensions doesn't matter; the fact that it doesn't work for some other uses of extensions also doesn't matter; just design it for the intended use.
The transformer can use the filename (file extension? importlib is currently restricted to .py files by default no?),
Everything goes through the same import machinery. The usual importer gets registered for .py files. Something like hylang can register for a different extension. Something like MacroPy can wrap the usual importer, then register to take over for .py files. (This isn't quite what MacroPy does, because it's designed to work with older versions of Python, with less powerful/simple customization opportunities, but it's what a new MacroPy-like project would do.) A global optimizer could also be written that way today. And doing this is a couple dozen lines of code (or about 5 lines to do it as a quick&dirty hack without worrying about portability or backward/forward compatibility).

The reason your PEP is necessary, I believe, is to overcome the limitations of such an import hook: to work at the REPL/notebook/etc. level, to allow multiple optimizers to play nicely without them having to agree on some wrapping protocol, to work with exec, etc. By keeping things simple and only serving the global case, you can (or, rather, you already have) come up with easier solutions to those issues--no need for enabling/disabling files by type or other information, no need for extra optional parameters to exec, etc. (Or, if you aren't trying to overcome those limitations, then I'm not sure why your PEP is necessary. Import hooks already work, after all.)
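For reference, registering a loader for a custom extension is only a couple of lines with today's machinery; a sketch, where MyLoader is a hypothetical SourceFileLoader subclass like the NiLoader shown earlier in this thread:

    import sys
    import importlib.machinery

    loader_details = (MyLoader, ['.hy'])
    sys.path_hooks.insert(
        0, importlib.machinery.FileFinder.path_hook(loader_details))
    sys.path_importer_cache.clear()  # drop finders cached for sys.path entries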
That doesn't seem necessary. After all, sys.path doesn't check that you aren't assigning non-strings, or strings that don't make valid paths, and nobody has ever complained that it's too hard to debug the case where you write sys.path.insert(0, {1, 2, 3}), because the error comes at import time instead of locally.

On 01/17/2016 12:48 PM, Victor Stinner wrote:
There is an important difference: optimizers should be installed globally. But modules that don't opt in to a specific syntax extension should not get compiled with it.
My API examples seem to have led the conversation astray. The point I wanted to make is that "syntax extensions" need a registration API that only enables them for specific modules. I admit the particular examples weren't very well thought out. I'm not proposing adding *any* of them to the PEP: I'd be happy if the PEP stuck to the "optimizers" use case and do that well. The "extensions" case is worth another PEP, which can reuse the transformers API (probably integrating it with importlib), but not the registration API.
Why very early? If a syntax extension is used in some package, it should only be activated right before that package is imported. And ideally it shouldn't get a chance to be activated on other packages.

importlib is not restricted to .py (it can do .zip, .pyc, .so, etc. out of the box). Actually, with import hooks, the *source* file that uses the DSL can use a different extension (as opposed to the *.pyc getting a different tag, as for optimizers). For example, a library using a SQL DSL could look like::

    __init__.py    (imports a package to set up the transformer)
    queries.sqlpy
    __pycache__/
        __init__.cpython-36.opt-0.pyc
        queries.cpython-36.opt-0.pyc

That is probably what you want for syntax extensions. You can't really look at special variables in the file, because the transformer needs to be enabled before the code is compiled -- especially if text/tokenstream transforms are added, so the file might not be valid "vanilla Python". What's left is making it easy to register an import hook with a specific PEP 511 transformer -- but again, that can be a different PEP.
participants (15): Andre Roberge, Andrew Barnert, Brett Cannon, Chris Angelico, Ethan Furman, Greg Ewing, Kevin Conway, Nick Coghlan, Petr Viktorin, Random832, Sjoerd Job Postmus, Steven D'Aprano, Sven R. Kunze, Victor Stinner, Yury Selivanov