[Python-ideas] Changes to the existing optimization levels

Sept. 28, 2017

Hi folks:

I was recently looking for an entry-level CPython task to work on in
my spare time and plucked this off of someone's TODO list.

    "Make optimizations more fine-grained than just -O and -OO"

There are currently three supported optimization levels (0, 1, and 2).
Briefly summarized, they do the following.

    0: no optimizations
    1: remove assert statements and __debug__ blocks
    2: remove docstrings, assert statements, and __debug__ blocks

From what I gather, their use-case is assert statements in production
code. More specifically, they want to be able to optimize away
docstrings, but keep the assert statements, which currently isn't
possible with the existing optimization levels.

As a first baby-step, I considered just adding a new optimization
level 3 that keeps asserts but continues to remove docstrings and
__debug__ blocks.

    3: remove docstrings and __debug__ blocks

From a command-line perspective, there is already support for
additional optimization levels. That is, without making any changes,
the optimization level will increase with the number of Os provided.

    $ python -c "import sys; print(sys.flags.optimize)"
    0

    $ python -OO -c "import sys; print(sys.flags.optimize)"
    2

    $ python -OOOOOOO -c "import sys; print(sys.flags.optimize)"
    7

And the PYTHONOPTIMIZE environment variable will happily assign
something like 42 to sys.flags.optimize.

    $ unset PYTHONOPTIMIZE
    $ python -c "import sys; print(sys.flags.optimize)"
    0

    $ export PYTHONOPTIMIZE=2
    $ python -c "import sys; print(sys.flags.optimize)"
    2

    $ export PYTHONOPTIMIZE=42
    $ python -c "import sys; print(sys.flags.optimize)"
    42
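
The same checks can be scripted. This is only a sketch, assuming a
CPython interpreter is reachable as sys.executable; the helper name
optimize_level is made up for illustration and is not part of any diff.

```python
import os
import subprocess
import sys


def optimize_level(args=(), env_value=None):
    """Return sys.flags.optimize as reported by a fresh child interpreter."""
    env = dict(os.environ)
    # Drop any inherited setting so it does not skew the result.
    env.pop("PYTHONOPTIMIZE", None)
    if env_value is not None:
        env["PYTHONOPTIMIZE"] = str(env_value)
    result = subprocess.run(
        [sys.executable, *args, "-c", "import sys; print(sys.flags.optimize)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print(optimize_level())              # no flags
    print(optimize_level(["-OO"]))       # two -O flags
    print(optimize_level(env_value=42))  # PYTHONOPTIMIZE=42
```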

Finally, the resulting __pycache__ folder also already contains the
expected bytecode files for the new optimization levels (
__init__.cpython-37.opt-42.pyc was created for optimization level 42,
for example).

    $ tree
    .
    └── test
        ├── __init__.py
        └── __pycache__
            ├── __init__.cpython-37.opt-1.pyc
            ├── __init__.cpython-37.opt-2.pyc
            ├── __init__.cpython-37.opt-42.pyc
            ├── __init__.cpython-37.opt-7.pyc
            └── __init__.cpython-37.pyc
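
For what it's worth, those file names can be predicted without
running the import machinery. A small sketch using
importlib.util.cache_from_source(); the helper name pyc_basename and
the path are just for illustration, and the "cpython-37" part will
differ with the interpreter version.

```python
import importlib.util
import os


def pyc_basename(source, optimization):
    """File name (without directories) of the cached bytecode for source."""
    path = importlib.util.cache_from_source(source, optimization=optimization)
    return os.path.basename(path)


if __name__ == "__main__":
    source = os.path.join("test", "__init__.py")
    # An empty string asks for the unoptimized name (no "opt-" part).
    print(pyc_basename(source, ""))
    for level in (1, 2, 7, 42):
        print(pyc_basename(source, level))
```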

Adding optimization level 3 is an easy change to make. Here's that
quick proof of concept (minus changes to the docs, etc). I've also
attached that diff as 3.diff.

    https://github.com/dianaclarke/cpython/commit/4bd7278d87bd762b2989178e5bfed3...

I was initially looking for a more elegant solution that allowed you
to specify exactly which optimizations you wanted, and when I floated
this naive ("level 3") approach off-list to a few core developers,
their feedback confirmed my hunch (too hacky).

So for my second pass at this task, I started with the following
two-pronged approach.

    1) Changed the various compile signatures to accept a set of
       string optimization flags rather than an int value.

    2) Added a new command line option, -N, that allows you to
       specify any number of individual optimization flags.

    For example:

        python -N nodebug -N noassert -N nodocstring

The existing optimization options (-O and -OO) still exist in this
approach, but they are mapped to the new optimization flags
("nodebug", "noassert", "nodocstring").
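
Roughly, that mapping could look like the following. This is a
hypothetical sketch and not the code in the diff: the flag names come
from this proposal, while LEVEL_TO_FLAGS and flags_from_level are
names invented here, and only the three documented levels are
handled.

```python
# Hypothetical mapping from the existing integer levels to the
# proposed flag names ("nodebug", "noassert", "nodocstring").
LEVEL_TO_FLAGS = {
    0: frozenset(),
    1: frozenset({"nodebug", "noassert"}),
    2: frozenset({"nodebug", "noassert", "nodocstring"}),
}


def flags_from_level(level):
    """Translate a documented -O level (0, 1, or 2) into a set of flags."""
    try:
        return LEVEL_TO_FLAGS[level]
    except KeyError:
        raise ValueError(f"unsupported optimization level: {level!r}") from None


if __name__ == "__main__":
    for level in (0, 1, 2):
        print(level, sorted(flags_from_level(level)))
```

With flags, the use-case above (keep asserts, drop docstrings) would
simply be a set without "noassert", which no integer level expresses
today.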

With the exception of the builtin compile() function, all underlying
compile functions would only accept optimization flags going forward,
and the builtin compile() function would accept either an integer
optimize value or a set of optimization flags for backwards
compatibility.
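
For reference, here is how the integer form that the builtin
compile() already accepts behaves today. The sample function and the
helper names (build, asserts_enabled) are made up for illustration.

```python
SOURCE = '''
def f(x):
    """Docstring for f."""
    assert x > 0, "x must be positive"
    return x
'''


def build(optimize):
    """Compile SOURCE at the given integer level and return the function f."""
    namespace = {}
    code = compile(SOURCE, "<example>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["f"]


def asserts_enabled(func):
    """True if the assert statement inside func survived compilation."""
    try:
        func(-1)
    except AssertionError:
        return True
    return False


if __name__ == "__main__":
    for level in (0, 1, 2):
        f = build(level)
        print(level, asserts_enabled(f), f.__doc__)
    # 0 True Docstring for f.
    # 1 False Docstring for f.
    # 2 False None
```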

You can find that work-in-progress approach here on GitHub (also
attached as N.diff).

    https://github.com/dianaclarke/cpython/commit/3e36cea1fc8ee6f4cdc584851e4c1e...

All in all, that approach is going fairly well, but there's a lot of
work remaining, and that diff is already getting quite large (for my
new-contributor status).

Note, for example, that I haven't yet tackled adding bytecode files to
__pycache__ that reflect these new optimization flags. Something like:

    $ tree
    .
    └── test
        ├── __init__.py
        └── __pycache__
            ├── __init__.cpython-37.opt-nodebug-noassert.pyc
            ├── __init__.cpython-37.opt-nodebug-nodocstring.pyc
            ├── __init__.cpython-37.opt-nodebug-noassert-nodocstring.pyc
            └── __init__.cpython-37.pyc
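
A hypothetical sketch of how such names might be derived from a set
of flags. None of this exists yet: FLAG_ORDER, opt_tag, and pyc_name
are invented for illustration, and the fixed ordering is an
assumption chosen so that the same set of flags always yields the
same file name.

```python
# Assumed canonical order, matching the file names in the tree above.
FLAG_ORDER = ("nodebug", "noassert", "nodocstring")


def opt_tag(flags):
    """Build the "opt-..." part of the file name, or "" when no flags are set."""
    unknown = set(flags) - set(FLAG_ORDER)
    if unknown:
        raise ValueError(f"unknown optimization flags: {sorted(unknown)}")
    ordered = [name for name in FLAG_ORDER if name in flags]
    return "opt-" + "-".join(ordered) if ordered else ""


def pyc_name(module, cache_tag, flags):
    """Join module name, cache tag, and optional opt tag into a .pyc name."""
    parts = [module, cache_tag]
    tag = opt_tag(flags)
    if tag:
        parts.append(tag)
    return ".".join(parts) + ".pyc"


if __name__ == "__main__":
    print(pyc_name("__init__", "cpython-37", {"noassert", "nodebug"}))
    # __init__.cpython-37.opt-nodebug-noassert.pyc
    print(pyc_name("__init__", "cpython-37", set()))
    # __init__.cpython-37.pyc
```

One wrinkle: importlib.util.cache_from_source() currently requires
the optimization string to be alphanumeric, so hyphenated tags like
these would need that check relaxed (or a different separator).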

I'm also not certain if the various compile signatures are even open
for change (int optimize => PyObject *optimizations), or if that's a
no-no.

And there are still a ton of references to "-O", "-OO",
"sys.flags.optimize", "Py_OptimizeFlag", "PYTHONOPTIMIZE", "optimize",
etc. that all need to be audited and their implications considered.

I've really enjoyed this task and I'm learning a lot about the C API,
but I think this is a good place to stop and solicit feedback and
direction.

My gut says that the amount of churn and resulting risk is too high to
continue down this path, but I would love to hear thoughts from others
(alternate approaches, ways to limit scope, confirmation that the
existing approach is too entrenched for change, etc).

Regardless, I think the following subset change could merge without
any bigger picture changes, as it just adds test coverage for a case
not yet covered. I can reopen that pull request once I clean up the
commit message a bit (I closed it in the meantime).

    https://github.com/python/cpython/pull/3450/commits/bfdab955a94a7fef431548f3...

Thanks for your time!

Cheers,

--diana

Diana Clarke