I was recently looking for an entry-level CPython task to work on in my spare time and plucked this off of someone's TODO list.
"Make optimizations more fine-grained than just -O and -OO"
There are currently three supported optimization levels (0, 1, and 2). Briefly summarized, they do the following.
0: no optimizations
1: remove assert statements and __debug__ blocks
2: remove docstrings, assert statements, and __debug__ blocks
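To see the three levels side by side without restarting the interpreter, the builtin compile() function takes an optimize argument. Here is a minimal sketch; SOURCE and probe() are names I made up for illustration, not anything in CPython.

```python
# Illustrative probe; SOURCE and probe() are made-up names, not CPython APIs.
SOURCE = '''
def documented():
    """I am a docstring."""

def asserts():
    assert False, "assert statement executed"

def debug_block():
    if __debug__:
        return True
    return False
'''


def probe(level):
    """Compile SOURCE at the given optimization level and report what survived."""
    namespace = {}
    code = compile(SOURCE, "<probe>", "exec", optimize=level)
    exec(code, namespace)

    # If the assert statement was compiled away, calling asserts() is a no-op.
    try:
        namespace["asserts"]()
        assert_kept = False
    except AssertionError:
        assert_kept = True

    return {
        "docstring": namespace["documented"].__doc__ is not None,
        "assert": assert_kept,
        "debug": namespace["debug_block"](),
    }


for level in (0, 1, 2):
    print(level, probe(level))
```

Each key in the result is True when that feature was kept, so the docstring only disappears at level 2 while asserts and __debug__ blocks disappear together at level 1.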
From what I gather, their use-case is assert statements in production code. More specifically, they want to be able to optimize away docstrings, but keep the assert statements, which currently isn't possible with the existing optimization levels.
As a first baby-step, I considered just adding a new optimization level 3 that keeps asserts but continues to remove docstrings and __debug__ blocks.
3: remove docstrings and __debug__ blocks
From a command-line perspective, there is already support for additional optimization levels. That is, without making any changes, the optimization level will increase with the number of Os provided.
$ python -c "import sys; print(sys.flags.optimize)"
0
$ python -OO -c "import sys; print(sys.flags.optimize)"
2
$ python -OOOOOOO -c "import sys; print(sys.flags.optimize)"
7
And the PYTHONOPTIMIZE environment variable will happily assign something like 42 to sys.flags.optimize.
$ unset PYTHONOPTIMIZE
$ python -c "import sys; print(sys.flags.optimize)"
0
$ export PYTHONOPTIMIZE=2
$ python -c "import sys; print(sys.flags.optimize)"
2
$ export PYTHONOPTIMIZE=42
$ python -c "import sys; print(sys.flags.optimize)"
42
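The same checks can be scripted so they are easy to rerun against a patched build. This is only a sketch: optimize_level() is a helper name I invented, and it assumes the interpreter under test is the one running the script (sys.executable).

```python
import os
import subprocess
import sys


def optimize_level(*args, env_value=None):
    """Return sys.flags.optimize as reported by a fresh interpreter.

    optimize_level() is an illustrative helper, not part of CPython.
    """
    env = dict(os.environ)
    # Start from a clean slate so an inherited PYTHONOPTIMIZE does not leak in.
    env.pop("PYTHONOPTIMIZE", None)
    if env_value is not None:
        env["PYTHONOPTIMIZE"] = env_value
    result = subprocess.run(
        [sys.executable, *args, "-c", "import sys; print(sys.flags.optimize)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


print(optimize_level())                # no flags, no environment variable
print(optimize_level("-OO"))           # two Os
print(optimize_level("-OOOOOOO"))      # seven Os
print(optimize_level(env_value="42"))  # PYTHONOPTIMIZE=42
```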
Finally, the resulting __pycache__ folder also already contains the expected bytecode files for the new optimization levels (__init__.cpython-37.opt-42.pyc was created for optimization level 42, for example).
$ tree
.
└── test
    ├── __init__.py
    └── __pycache__
        ├── __init__.cpython-37.opt-1.pyc
        ├── __init__.cpython-37.opt-2.pyc
        ├── __init__.cpython-37.opt-42.pyc
        ├── __init__.cpython-37.opt-7.pyc
        └── __init__.cpython-37.pyc
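Those file names come from importlib.util.cache_from_source(), which accepts an optimization argument. A short sketch of how the names above are derived (pyc_name() is just a wrapper I added for illustration, and the cpython-37 part will differ with the running interpreter):

```python
import importlib.util
import os


def pyc_name(level):
    """Return the bytecode file name importlib would use for test/__init__.py.

    pyc_name() is an illustrative wrapper; an empty string means "no opt tag".
    """
    source = os.path.join("test", "__init__.py")
    cached = importlib.util.cache_from_source(source, optimization=level)
    return os.path.basename(cached)


for level in ("", 1, 2, 7, 42):
    print(repr(level), "->", pyc_name(level))
```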
Adding optimization level 3 is an easy change to make. Here's that quick proof of concept (minus changes to the docs, etc). I've also attached that diff as 3.diff.
I was initially looking for a more elegant solution that allowed you to specify exactly which optimizations you wanted, and when I floated this naive ("level 3") approach off-list to a few core developers, their feedback confirmed my hunch (too hacky).
So for my second pass at this task, I started with the following two-pronged approach.
1) Changed the various compile signatures to accept a set of string optimization flags rather than an int value.
2) Added a new command-line option, -N, that allows you to specify any number of individual optimization flags.
For example: python -N nodebug -N noassert -N nodocstring
The existing optimization options (-O and -OO) still exist in this approach, but they are mapped to the new optimization flags ("nodebug", "noassert", "nodocstring").
With the exception of the builtin compile() function, all underlying compile functions would only accept optimization flags going forward. For backwards compatibility, the builtin compile() function would accept either an integer optimize value or a set of optimization flags.
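To make the backwards-compatible mapping concrete, here is a rough Python sketch of the idea. It is not the code in the diff: LEVEL_TO_FLAGS, KNOWN_FLAGS, and normalize_optimizations() are hypothetical names, and I am assuming that integer levels above 2 behave like level 2. The real compile() also accepts optimize=-1 to mean "use the interpreter's level", which this sketch leaves out.

```python
# Hypothetical sketch only: none of these names exist in CPython.
LEVEL_TO_FLAGS = {
    0: frozenset(),
    1: frozenset({"nodebug", "noassert"}),
    2: frozenset({"nodebug", "noassert", "nodocstring"}),
}
KNOWN_FLAGS = LEVEL_TO_FLAGS[2]


def normalize_optimizations(optimize):
    """Turn a legacy integer level or an iterable of flag names into a frozenset."""
    if isinstance(optimize, int):
        if optimize < 0:
            raise ValueError("negative levels are not handled in this sketch")
        # Assumption: levels above 2 are treated the same as level 2.
        return LEVEL_TO_FLAGS[min(optimize, 2)]
    if isinstance(optimize, str):
        # A bare string is treated as a single flag, not as an iterable of characters.
        optimize = [optimize]
    flags = frozenset(optimize)
    unknown = flags - KNOWN_FLAGS
    if unknown:
        raise ValueError(f"unknown optimization flags: {sorted(unknown)}")
    return flags


print(sorted(normalize_optimizations(1)))
print(sorted(normalize_optimizations({"nodebug", "nodocstring"})))
```

The second call is the case that motivated all of this: docstrings and __debug__ blocks go away, but asserts stay.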
You can find that work-in-progress approach here on GitHub (also attached as N.diff).
All in all, that approach is going fairly well, but there's a lot of work remaining, and that diff is already getting quite large (for my new-contributor status).
Note for example, that I haven't yet tackled adding bytecode files to __pycache__ that reflect these new optimization flags. Something like:
$ tree
.
└── test
    ├── __init__.py
    └── __pycache__
        ├── __init__.cpython-37.opt-nodebug-noassert.pyc
        ├── __init__.cpython-37.opt-nodebug-nodocstring.pyc
        ├── __init__.cpython-37.opt-nodebug-noassert-nodocstring.pyc
        └── __init__.cpython-37.pyc
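One wrinkle with names like these: as far as I can tell, importlib.util.cache_from_source() currently requires the optimization tag to be alphanumeric, so a hyphen-separated tag would be rejected unless that check is relaxed or a different separator is chosen. A quick way to see the current behavior (try_tag() is a throwaway helper for illustration):

```python
import importlib.util


def try_tag(tag):
    """Return the cache path for the given opt tag, or the error text if rejected.

    try_tag() is an illustrative helper, not part of CPython.
    """
    try:
        return importlib.util.cache_from_source("test/__init__.py", optimization=tag)
    except ValueError as exc:
        return f"ValueError: {exc}"


print(try_tag("nodebug-noassert"))  # hyphenated tag, as in the listing above
print(try_tag("nodebugnoassert"))   # purely alphanumeric tag
```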
I'm also not certain if the various compile signatures are even open for change (int optimize => PyObject *optimizations), or if that's a no-no.
And there are still a ton of references to "-O", "-OO", "sys.flags.optimize", "Py_OptimizeFlag", "PYTHONOPTIMIZE", "optimize", etc. that all need to be audited and their implications considered.
I've really enjoyed this task and I'm learning a lot about the C API, but I think this is a good place to stop and solicit feedback and direction.
My gut says that the amount of churn and resulting risk is too high to continue down this path, but I would love to hear thoughts from others (alternate approaches, ways to limit scope, confirmation that the existing approach is too entrenched for change, etc.).
Regardless, I think the following subset change could merge without any bigger picture changes, as it just adds test coverage for a case not yet covered. I can reopen that pull request once I clean up the commit message a bit (I closed it in the meantime).
Thanks for your time!