[Python-checkins] peps: PEP 488: make the base case not change the bytecode file name

brett.cannon python-checkins at python.org
Fri Mar 20 19:28:19 CET 2015


https://hg.python.org/peps/rev/062ca9bdf9da
changeset:   5736:062ca9bdf9da
parent:      5733:d6d28b20939f
user:        Brett Cannon <brett at python.org>
date:        Fri Mar 20 14:27:56 2015 -0400
summary:
  PEP 488: make the base case not change the bytecode file name

files:
  pep-0488.txt |  120 +++++++++++++++++---------------------
  1 files changed, 54 insertions(+), 66 deletions(-)


diff --git a/pep-0488.txt b/pep-0488.txt
--- a/pep-0488.txt
+++ b/pep-0488.txt
@@ -17,8 +17,8 @@
 This PEP proposes eliminating the concept of PYO files from Python.
 To continue the support of the separation of bytecode files based on
 their optimization level, this PEP proposes extending the PYC file
-name to include the optimization level in bytecode repository
-directory (i.e., the ``__pycache__`` directory).
+name to include the optimization level in the bytecode repository
+directory when it's called for (i.e., the ``__pycache__`` directory).
 
 
 Rationale
@@ -29,11 +29,11 @@
 optimization level is specified at interpreter startup (i.e., ``-O``
 is not specified). A PYO file represents the bytecode file that is
 read/written when **any** optimization level is specified (i.e., when
-``-O`` is specified, including ``-OO``). This means that while PYC
+``-O`` **or** ``-OO`` is specified). This means that while PYC
 files clearly delineate the optimization level used when they were
 generated -- namely no optimizations beyond the peepholer -- the same
-is not true for PYO files. Put in terms of optimization levels and
-the file extension:
+is not true for PYO files. To put this in terms of optimization
+levels and the file extension:
 
   - 0: ``.pyc``
   - 1 (``-O``): ``.pyo``
@@ -62,7 +62,9 @@
 
 As for distributing bytecode-only modules, having to distribute both
 ``.pyc`` and ``.pyo`` files is unnecessary for the common use-case
-of code obfuscation and smaller file deployments.
+of code obfuscation and smaller file deployments. This means that
+bytecode-only modules will only load from their non-optimized
+``.pyc`` file name.
 
 
 Proposal
@@ -73,15 +75,22 @@
 file extension. To allow for the optimization level to be unambiguous
 as well as to avoid having to regenerate optimized bytecode files
 needlessly in the `__pycache__` directory, the optimization level
-used to generate a PYC file will be incorporated into the bytecode
-file name. Currently bytecode file names are created by
+used to generate the bytecode file will be incorporated into the
+bytecode file name. When no optimization level is specified, the
+pre-PEP ``.pyc`` file name will be used (i.e., no change in file name
+semantics). This increases backwards-compatibility while also being
+more understanding of Python implementations which have no use for
+optimization levels (e.g., PyPy[10]_).
+
+Currently bytecode file names are created by
 ``importlib.util.cache_from_source()``, approximately using the
 following expression defined by PEP 3147 [3]_, [4]_, [5]_::
 
     '{name}.{cache_tag}.pyc'.format(name=module_name,
                                     cache_tag=sys.implementation.cache_tag)
 
-This PEP proposes to change the expression to::
+This PEP proposes to change the expression when an optimization
+level is specified to::
 
     '{name}.{cache_tag}.opt-{optimization}.pyc'.format(
             name=module_name,
@@ -94,8 +103,8 @@
 bytecode file names based on module name and cache tag which will
 not vary for a single interpreter. The "opt-" prefix was chosen over
 "o" so as to be somewhat self-documenting. The "opt-" prefix was
-chosen over "O" so as to not have any confusion with "0" while being
-so close to the interpreter version number.
+chosen over "O" so as to not have any confusion in case "0" was the
+leading prefix of the optimization level.
 
 A period was chosen over a hyphen as a separator so as to distinguish
 clearly that the optimization level is not part of the interpreter
@@ -103,10 +112,8 @@
 the period in the file name to delineate semantically different
 concepts.
 
-For example, the bytecode file name of ``importlib.cpython-35.pyc``
-would become ``importlib.cpython-35.opt-0.pyc``. If ``-OO`` had been
-passed to the interpreter then instead of
-``importlib.cpython-35.pyo`` the file name would be
+For example, if ``-OO`` had been passed to the interpreter then instead
+of ``importlib.cpython-35.pyo`` the file name would be
 ``importlib.cpython-35.opt-2.pyc``.
 
 It should be noted that this change in no way affects the performance
@@ -114,9 +121,15 @@
 based on the optimization level of the interpreter already and
 generates a new bytecode file if it doesn't exist, the introduction
 of potentially more bytecode files in the ``__pycache__`` directory
-has no effect. The interpreter will continue to look for only a
-single bytecode file based on the optimization level and thus no
-increase in stat calls will occur.
+has no effect in terms of stat calls. The interpreter will continue
+to look for only a single bytecode file based on the optimization
+level and thus no increase in stat calls will occur.
+
+The only potentially negative result of this PEP is the probable
+increase in the number of ``.pyc`` files and thus increase in storage
+use. But for platforms where this is an issue,
+``sys.dont_write_bytecode`` exists to turn off bytecode generation so
+that it can be controlled offline.
 
 
 Implementation
@@ -139,18 +152,18 @@
 The introduced ``optimization`` keyword-only parameter will control
 what optimization level is specified in the file name. If the
 argument is ``None`` then the current optimization level of the
-interpreter will be assumed. Any argument given for ``optimization``
-will be passed to ``str()`` and must have ``str.isalnum()`` be true,
-else ``ValueError`` will be raised (this prevents invalid characters
-being used in the file name). If the empty string is passed in for
-``optimization`` then the addition of the optimization will be
-suppressed, reverting to the file name format which predates this
-PEP.
+interpreter will be assumed (including no optimization). Any argument
+given for ``optimization`` will be passed to ``str()`` and must have
+``str.isalnum()`` be true, else ``ValueError`` will be raised (this
+prevents invalid characters being used in the file name). If the
+empty string is passed in for ``optimization`` then the addition of
+the optimization will be suppressed, reverting to the file name
+format which predates this PEP.
 
-It is expected that beyond Python's own
-0-2 optimization levels, third-party code will use a hash of
-optimization names to specify the optimization level, e.g.
-``hashlib.sha256(','.join(['dead code elimination', 'constant folding'])).hexdigest()``.
+It is expected that beyond Python's own two optimization levels,
+third-party code will use a hash of optimization names to specify the
+optimization level, e.g.
+``hashlib.sha256(','.join(['no dead code', 'const folding'])).hexdigest()``.
 While this might lead to long file names, it is assumed that most
 users never look at the contents of the __pycache__ directory and so
 this won't be an issue.
@@ -238,15 +251,15 @@
 the cache tag and file extension is not critical. All options which
 have been considered are:
 
-* ``importlib.cpython-35.opt-0.pyc``
-* ``importlib.cpython-35.opt0.pyc``
-* ``importlib.cpython-35.o0.pyc``
-* ``importlib.cpython-35.O0.pyc``
-* ``importlib.cpython-35.0.pyc``
-* ``importlib.cpython-35-O0.pyc``
-* ``importlib.O0.cpython-35.pyc``
-* ``importlib.o0.cpython-35.pyc``
-* ``importlib.0.cpython-35.pyc``
+* ``importlib.cpython-35.opt-1.pyc``
+* ``importlib.cpython-35.opt1.pyc``
+* ``importlib.cpython-35.o1.pyc``
+* ``importlib.cpython-35.O1.pyc``
+* ``importlib.cpython-35.1.pyc``
+* ``importlib.cpython-35-O1.pyc``
+* ``importlib.O1.cpython-35.pyc``
+* ``importlib.o1.cpython-35.pyc``
+* ``importlib.1.cpython-35.pyc``
 
 These were initially rejected either because they would change the
 sort order of bytecode files, possible ambiguity with the cache tag,
@@ -276,34 +289,6 @@
 users to utilize the optimization level they want.
 
 
-Open Issues
-===========
-
-Not specifying the optimization level when it is at 0
------------------------------------------------------
-
-It has been suggested that for the common case of when the
-optimizations are at level 0 that the entire part of the file name
-relating to the optimization level be left out. This would allow for
-file names of ``.pyc`` files to go unchanged, potentially leading to
-less backwards-compatibility issues (although Python 3.5 introduces a
-new magic number for bytecode so all bytecode files will have to be
-regenerated regardless of the outcome of this PEP).
-
-It would also allow a potentially redundant bit of information to be
-left out of the file name if an implementation of Python did not
-allow for optimizing bytecode. This would only occur, though, if the
-interpreter didn't support ``-O`` **and** didn't implement the ast
-module, else users could implement their own optimizations.
-
-Arguments against allowing this special case is "explicit is better
-than implicit" and "special cases aren't special enough to break the
-rules".
-
-At this people have weakly supporting this idea while no one has
-explicitly come out against it.
-
-
 References
 ==========
 
@@ -334,6 +319,9 @@
 .. [9] Informal poll of file name format options on Google+
    (https://plus.google.com/u/0/+BrettCannon/posts/fZynLNwHWGm)
 
+.. [10] The PyPy Project
+   (http://pypy.org/)
+
 
 Copyright
 =========
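
Note on the changeset above: the naming rule it describes can be sketched
roughly as follows. This is a minimal illustration and not the
``importlib`` implementation. The helper name ``bytecode_file_name`` and
its signature are hypothetical, and the empty string stands in for "no
optimization level specified". The ``.encode('utf-8')`` call is an
assumption added so that the hash example from the diff runs on Python 3,
where ``hashlib.sha256()`` requires bytes.

```python
import hashlib


def bytecode_file_name(module_name, cache_tag, optimization=''):
    """Hypothetical helper: build a bytecode file name from its parts.

    An empty ``optimization`` keeps the pre-PEP ``.pyc`` name. Any other
    value is appended as ``opt-<level>`` and must be alphanumeric.
    """
    optimization = str(optimization)
    if optimization == '':
        # Base case: file name is unchanged from PEP 3147.
        return '{}.{}.pyc'.format(module_name, cache_tag)
    if not optimization.isalnum():
        # Reject characters that should not appear in the file name.
        raise ValueError('{!r} is not alphanumeric'.format(optimization))
    return '{}.{}.opt-{}.pyc'.format(module_name, cache_tag, optimization)


print(bytecode_file_name('importlib', 'cpython-35'))
# importlib.cpython-35.pyc
print(bytecode_file_name('importlib', 'cpython-35', 2))
# importlib.cpython-35.opt-2.pyc

# A third-party level derived from a hash of optimization names.
# The hex digest is alphanumeric, so it passes the isalnum() check.
custom_level = hashlib.sha256(
    ','.join(['no dead code', 'const folding']).encode('utf-8')).hexdigest()
print(bytecode_file_name('importlib', 'cpython-35', custom_level))
```

Keeping the base case free of any ``opt-`` component is what lets existing
``.pyc`` names survive unchanged, which is the point of this commit.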

-- 
Repository URL: https://hg.python.org/peps

