Changes to the existing optimization levels

Hi folks:

I was recently looking for an entry-level cpython task to work on in my spare time and plucked this off of someone's TODO list.

    "Make optimizations more fine-grained than just -O and -OO"

There are currently three supported optimization levels (0, 1, and 2). Briefly summarized, they do the following.

    0: no optimizations
    1: remove assert statements and __debug__ blocks
    2: remove docstrings, assert statements, and __debug__ blocks

As a first baby-step, I considered just adding a new optimization level 3 that keeps asserts but continues to remove docstrings and __debug__ blocks.

    3: remove docstrings and __debug__ blocks

From a command-line perspective, there is already support for additional optimization levels. That is, without making any changes, the optimization level will increase with the number of Os provided.

    $ python -c "import sys; print(sys.flags.optimize)"
    0

    $ python -OO -c "import sys; print(sys.flags.optimize)"
    2

    $ python -OOOOOOO -c "import sys; print(sys.flags.optimize)"
    7

And the PYTHONOPTIMIZE environment variable will happily assign something like 42 to sys.flags.optimize.

    $ unset PYTHONOPTIMIZE
    $ python -c "import sys; print(sys.flags.optimize)"
    0

    $ export PYTHONOPTIMIZE=2
    $ python -c "import sys; print(sys.flags.optimize)"
    2

    $ export PYTHONOPTIMIZE=42
    $ python -c "import sys; print(sys.flags.optimize)"
    42

Finally, the resulting __pycache__ folder also already contains the expected bytecode files for the new optimization levels (__init__.cpython-37.opt-42.pyc was created for optimization level 42, for example).

    $ tree
    .
    └── test
        ├── __init__.py
        └── __pycache__
            ├── __init__.cpython-37.opt-1.pyc
            ├── __init__.cpython-37.opt-2.pyc
            ├── __init__.cpython-37.opt-42.pyc
            ├── __init__.cpython-37.opt-7.pyc
            └── __init__.cpython-37.pyc

Adding optimization level 3 is an easy change to make. Here's that quick proof of concept (minus changes to the docs, etc). I've also attached that diff as 3.diff.

    https://github.com/dianaclarke/cpython/commit/4bd7278d87bd762b2989178e5bfed3...

I was initially looking for a more elegant solution that allowed you to specify exactly which optimizations you wanted, and when I floated this naive ("level 3") approach off-list to a few core developers, their feedback confirmed my hunch (too hacky). So for my second pass at this task, I started with the following two-pronged approach.

    1) Changed the various compile signatures to accept a set of string
       optimization flags rather than an int value.

    2) Added a new command line option N that allows you to specify any
       number of individual optimization flags. For example:

           python -N nodebug -N noassert -N nodocstring

The existing optimization options (-O and -OO) still exist in this approach, but they are mapped to the new optimization flags ("nodebug", "noassert", "nodocstring"). With the exception of the builtin compile() function, all underlying compile functions would only accept optimization flags going forward, and the builtin compile() function would accept either an integer optimize value or a set of optimization flags for backwards compatibility.

You can find that work-in-progress approach here on github (also attached as N.diff).

    https://github.com/dianaclarke/cpython/commit/3e36cea1fc8ee6f4cdc584851e4c1e...

All in all, that approach is going fairly well, but there's a lot of work remaining, and that diff is already getting quite large (for my new-contributor status). Note, for example, that I haven't yet tackled adding bytecode files to __pycache__ that reflect these new optimization flags. Something like:

    $ tree
    .
    └── test
        ├── __init__.py
        └── __pycache__
            ├── __init__.cpython-37.opt-nodebug-noassert.pyc
            ├── __init__.cpython-37.opt-nodebug-nodocstring.pyc
            ├── __init__.cpython-37.opt-nodebug-noassert-nodocstring.pyc
            └── __init__.cpython-37.pyc

I'm also not certain if the various compile signatures are even open for change (int optimize => PyObject *optimizations), or if that's a no-no. And there are still a ton of references to "-O", "-OO", "sys.flags.optimize", "Py_OptimizeFlag", "PYTHONOPTIMIZE", "optimize", etc. that all need to be audited and their implications considered.

I've really enjoyed this task and I'm learning a lot about the C API, but I think this is a good place to stop and solicit feedback and direction. My gut says that the amount of churn and resulting risk is too high to continue down this path, but I would love to hear thoughts from others (alternate approaches, ways to limit scope, confirmation that the existing approach is too entrenched for change, etc).

Regardless, I think the following subset change could merge without any bigger-picture changes, as it just adds test coverage for a case not yet covered. I can reopen that pull request once I clean up the commit message a bit (I closed it in the meantime).

    https://github.com/python/cpython/pull/3450/commits/bfdab955a94a7fef431548f3...

Thanks for your time!

Cheers,

--diana
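For a sense of what the backwards-compatibility shim described above could look like at the Python level, here is a minimal, hypothetical helper that accepts either the legacy integer levels or a set of flag strings. The flag spellings follow the -N example; nothing here is taken from the actual patches.

    # Hypothetical: map the legacy integer levels to sets of flag names.
    _LEVELS = {
        0: set(),
        1: {"nodebug", "noassert"},
        2: {"nodebug", "noassert", "nodocstring"},
    }

    def normalize_optimizations(optimize):
        """Accept the legacy int level or a set of flag names; return a set."""
        if isinstance(optimize, int):
            if optimize < 0:
                return None               # -1: defer to the interpreter's global settings
            if optimize >= 2:
                return set(_LEVELS[2])    # levels above 2 currently behave like 2
            return set(_LEVELS[optimize])
        return set(optimize)              # assume an iterable of flag names

    print(normalize_optimizations(2))                # {'nodebug', 'noassert', 'nodocstring'}
    print(normalize_optimizations({"nodocstring"}))  # {'nodocstring'}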

On Thu, 28 Sep 2017 12:48:15 -0600 Diana Clarke <diana.joan.clarke@gmail.com> wrote:
We could instead reuse the existing -X option, which allows for free-form implementation-specific flags.
You probably want to keep the existing signatures for compatibility:

- in C, add new APIs with the new convention
- in Python, add a new (optional) function argument for the new convention

Regards

Antoine.
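For reference, -X options already surface at the Python level via sys._xoptions, so hypothetical opt-* flags (the opt- prefix is purely illustrative) would show up like this:

    $ python -X opt-nodebug -X opt-noassert -c "import sys; print(sys._xoptions)"
    {'opt-nodebug': True, 'opt-noassert': True}

Those entries could then be collected into a set of optimization names with a small sketch such as:

    import sys

    # Collect any hypothetical "opt-*" -X flags into a set of optimization names.
    requested = {name[len("opt-"):] for name in sys._xoptions if name.startswith("opt-")}
    # e.g. {'nodebug', 'noassert'} for the invocation shown above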

On 29 September 2017 at 05:02, Antoine Pitrou <solipsis@pitrou.net> wrote:
And declaring named optimisation flags to be implementation-dependent is likely a good way to go. The one downside is that it would mean there was no formally interpreter-independent way of requesting the "nodebug,nodocstring" configuration, but informal conventions around particular uses of "-X" may be sufficient for that purpose.
This approach should also reduce the overall amount of code churn, since any CPython (or external) code currently passing "optimize=-1" won't need to change at all: that already says "get the optimization settings from the interpreter state", so it will pick up any changes to how that configuration works "for free".

That said, we may also want to consider a couple of other options related to changing the meaning of *existing* parameters to these APIs:

1. We have the PyCompilerFlags struct that's currently only used to pass around feature flags for the __future__ module. It could gain a second bitfield for optimisation options

2. We could reinterpret "optimize" as a bitfield instead of a regular integer, special casing the already defined values:

   - all zero: no optimizations
   - sign bit set: negative -> use global settings
   - 0x0001: nodebug+noassert
   - 0x0002: nodebug+noassert+nodocstrings
   - 0x0004: nodebug
   - 0x0008: noassert
   - 0x0010: nodocstrings

The "redefine optimizations as a bitfield" approach seems particularly promising to me - it's a full integer, so even with all negative numbers disallowed and the two low order bits reserved for the legacy combinations, that's still 29 different optimisation flags given 32-bit integers. We currently have 3, so that's room for an 866% increase in the number of defined flags :)

The opt-N values in pyc files would be somewhat cryptic-to-humans, but still relatively easy to translate back to readable strings given the bitfield values, and common patterns (like 0x14 -> 20 for nodebug+nodocstrings) would likely become familiar pretty quickly.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
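A rough sketch of how such a bitfield could be decoded back into readable flag names, using the bit values listed above (the constant and function names are illustrative only):

    # Bit values taken from the proposal above; names are illustrative.
    NODEBUG_NOASSERT     = 0x0001  # legacy -O
    NODEBUG_NOASSERT_DOC = 0x0002  # legacy -OO
    NODEBUG              = 0x0004
    NOASSERT             = 0x0008
    NODOCSTRINGS         = 0x0010

    _NAMES = {
        NODEBUG: "nodebug",
        NOASSERT: "noassert",
        NODOCSTRINGS: "nodocstrings",
    }

    def describe(optimize):
        """Translate an opt-N bitfield back into human-readable flag names."""
        if optimize < 0:
            return "use global settings"
        if optimize == 0:
            return "no optimizations"
        if optimize & NODEBUG_NOASSERT:
            return "nodebug+noassert (legacy -O)"
        if optimize & NODEBUG_NOASSERT_DOC:
            return "nodebug+noassert+nodocstrings (legacy -OO)"
        return "+".join(name for bit, name in _NAMES.items() if optimize & bit)

    print(describe(0x14))  # nodebug+nodocstrings (the "opt-20" example above)
    print(describe(0x01))  # nodebug+noassert (legacy -O)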

Oh, I like this idea! I had very briefly considered treating the existing flag as a bitfield, but then promptly forgot to explore that line of thought further. I'll play with that approach next week, see where it takes me, and then report back.

Thanks so much for taking the time to think this through with me – much appreciated.

Cheers,

--diana

On Fri, Sep 29, 2017 at 1:33 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

In the meantime, I've re-opened the following pull request that can be merged independently of these changes (it's just additional test coverage).

    trivial: add test coverage for the __debug__ case (optimization levels)
    https://github.com/python/cpython/pull/3450

Please let me know if I should create a bpo for it, if the commit message is too long, or if you think I should otherwise change the patch in any way.

As always, thanks for your time folks!

--diana

On Fri, Sep 29, 2017 at 2:24 PM, Diana Clarke <diana.joan.clarke@gmail.com> wrote:

On Fri, 29 Sep 2017 17:33:11 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
Not sure about that. PyCompilerFlags describes options that should be common to all implementations (since __future__ is part of the language spec).
Well, this is not really a bitfield, but a bitfield plus some irregular hardcoded values. Therefore I don't think it brings much in the way of discoverability / understandability. That said, perhaps it makes implementation easier on the C side...

Regards

Antoine.

On 1 October 2017 at 22:19, Antoine Pitrou <solipsis@pitrou.net> wrote:
That's why the 2-field struct for compiler flags was my first idea.
That said, perhaps it makes implementation easier on the C side...
Yep, the fact it would avoid requiring any ABI changes for the C API is the main reason I think redefining the semantics of the existing int parameter is worth considering.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Oct 1, 2017 at 6:19 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I think I'm coming to the same conclusion: using bitwise operations for the optimization levels seems to just boil down to a more cryptic version of the simple "level 3" solution, with public-facing impacts to the pycache and existing interfaces etc that I don't think are worth it in this case.

My only other thought at the moment would be to use the existing -X option to achieve something similar to what I did with the new -N option, but then just quickly map that back to an integer under the hood. That is, "-X opt-nodebug -X opt-noassert" would just become "level 3" internally so that the various interfaces wouldn't have to change.

But there are lots of downsides to that solution too:

- having to hardcode the various possible combinations of string options to an integer value
- inelegant lookups like: if flag is greater than 2 but not 10 or 15, etc
- un-zen: yet even more ways to set that integer flag (PYTHONOPTIMIZE, -OOO, "-X opt-nodebug -X opt-noassert")
- mixed-bag -X options are less discoverable than just adding a new command line option (like -N or -OOO)
- other downsides, I'm sure

Hmmm.... stalled again, I think.

--diana
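To make the first downside concrete, a hypothetical version of the hardcoded mapping that the "-X back to an integer" idea would need (none of this is from an actual patch):

    # Hypothetical mapping from sets of -X opt-* flags back to the existing
    # integer levels -- every meaningful combination has to be spelled out.
    _FLAGS_TO_LEVEL = {
        frozenset():                                        0,
        frozenset({"nodebug", "noassert"}):                 1,
        frozenset({"nodebug", "noassert", "nodocstring"}):  2,
        frozenset({"nodebug", "nodocstring"}):              3,  # the proposed "level 3"
    }

    def level_from_flags(flags):
        """Collapse a set of flag names to one of the hardcoded integer levels."""
        try:
            return _FLAGS_TO_LEVEL[frozenset(flags)]
        except KeyError:
            raise ValueError("unsupported combination of optimization flags: %r" % (flags,))

    print(level_from_flags({"nodebug", "nodocstring"}))  # 3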

On Tue, 3 Oct 2017 09:42:40 -0600 Diana Clarke <diana.joan.clarke@gmail.com> wrote:
- mixed-bag -X options are less discoverable than just adding a new command line option (like -N or -OOO)
For such a feature, I think being less discoverable is not really a problem. I don't think many people use the -O flags currently, and among those that do I'm curious how many really benefit (as opposed to seeing that Python has an "optimization" flag and thinking "great, I'm gonna use that to make my code faster" without ever measuring the difference).

Regards

Antoine.

On 4 October 2017 at 01:42, Diana Clarke <diana.joan.clarke@gmail.com> wrote:
Sorry, I don't think I was entirely clear as to what my suggestion actually was:

* Switch to your suggested "set-of-strings" API at the Python level, with the Python level integer interface retained only for backwards compatibility
* Keep the current integer-based *C* optimization API, but redefine the way that value is interpreted, rather than passing Python sets around

The Python APIs would then convert the Python level sets to the bitfield representation almost immediately for internal use, but you wouldn't need to mess about with the bitfield yourself when calling the Python APIs.

The difference I see relates to the fact that in Python:

* sets of strings are easier to work with than integer bitfields
* adding a new keyword-only argument to existing APIs is straightforward

While in C:

* integer bitfields are easier to work with than Python sets of Python strings
* supporting a new argument would mean defining a whole new parallel set of APIs

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
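A minimal sketch of that split as seen from the Python side: a keyword-only set-of-strings argument that is flattened to the existing int before it crosses into C. The function names and bit values here are hypothetical, matching the bitfield proposal earlier in the thread.

    # Hypothetical bit values for the individual optimizations.
    _BITS = {"nodebug": 0x0004, "noassert": 0x0008, "nodocstrings": 0x0010}

    def py_compile_api(source, filename, mode, *, optimizations=None):
        """Hypothetical Python-level entry point taking a set of flag names."""
        if optimizations is None:
            optimize = -1                  # unchanged default: use the global settings
        else:
            optimize = 0
            for name in optimizations:
                optimize |= _BITS[name]    # KeyError for unknown flag names
        return _c_level_compile(source, filename, mode, optimize)

    def _c_level_compile(source, filename, mode, optimize):
        # Stand-in for the existing C API, which would keep its int parameter.
        print("optimize bitfield passed to C:", hex(optimize))

    py_compile_api("x = 1", "<example>", "exec", optimizations={"nodebug", "nodocstrings"})
    # optimize bitfield passed to C: 0x14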

You may want to look at my PEP 511, which proposes to add a new "-o" option to specify a list of optimizations: https://www.python.org/dev/peps/pep-0511/#changes

The PEP proposes to add a new sys.implementation.optim_tag which is used to generate the .pyc filename.

Victor
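As an aside on the .pyc naming side of this, the way an optimization marker ends up in the cache filename is already exposed through importlib today (the output below assumes a CPython 3.7 build; the cpython-37 tag varies with the interpreter):

    >>> import importlib.util
    >>> importlib.util.cache_from_source("os.py")
    '__pycache__/os.cpython-37.pyc'
    >>> importlib.util.cache_from_source("os.py", optimization=2)
    '__pycache__/os.cpython-37.opt-2.pyc'
    >>> importlib.util.cache_from_source("os.py", optimization=42)
    '__pycache__/os.cpython-37.opt-42.pyc'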

Yup. I referenced your PEP a few times in a previous off-list email, but I omitted that paragraph from this pass because I was using it to bolster the previous "level 3" idea (which didn't fly).

"""
This simple approach to new optimization levels also appears to be in line with the direction Victor Stinner is going in for PEP 511 - "API for code transformers" [1]. More specifically, in the "Optimizer tag" section [2] of that PEP, Victor proposes adding a new -o OPTIM_TAG command line option that defaults to "opt" for the existing optimizations, but would also let you swap in custom bytecode transformers (like "fat" in his examples [3]). Assuming I understood that correctly ;)

    os.cpython-36.fat-0.pyc
    os.cpython-36.fat-1.pyc
    os.cpython-36.fat-2.pyc

[1] https://www.python.org/dev/peps/pep-0511/
[2] https://www.python.org/dev/peps/pep-0511/#optimizer-tag
[3] https://www.python.org/dev/peps/pep-0511/#examples
"""

Thanks for taking the time to respond (you too, Antoine).

Cheers,

--diana

On Thu, Sep 28, 2017 at 4:09 PM, Victor Stinner <victor.stinner@gmail.com> wrote:

Perhaps I should be a bit clearer. When I said the "level 3" approach "appears to be in line with the direction Victor Stinner is going in for PEP 511", it was mostly at a superficial level. Meaning:

- PEP 511 still appears to use integer (unnamed) optimization levels for alternate transformers (fat 0, 1, and 2). I assumed (perhaps incorrectly) that you could provide a list of transformers ("opt,fat,bar") but that each transformer would still contain a number of different off/on toggles, arbitrarily identified as integer flags like 0, 1, and 2. I should go back and read that PEP again. I don't recall seeing where the 0, 1, and 2 came from in the fat examples.

      os.cpython-36.fat-0.pyc
      os.cpython-36.fat-1.pyc
      os.cpython-36.fat-2.pyc

- Secondly, I reviewed PEP 511 when I initially started working on the naive "level 3" approach to make sure what I proposed didn't impede the progress of PEP 511 (or, more realistically, make my attempt obsolete). Since PEP 511 didn't seem to deviate much from the current integer flags (aside from allowing multiple different named sets of integer flags), I figured that whatever approach PEP 511 took with the existing optimization levels (0, 1, and 2) would presumably also work for a new level 3.

I hope that makes sense... If not, let me know & I'll try again tomorrow to be clearer.

PS. I think it sounds like I'm now re-advocating for the simple "level 3" approach. I'm not – just trying to explain my earlier thought process. I'm open to all kinds of feedback & suggestions.

Thanks again folks!

Cheers,

--diana

On 9/28/17 2:48 PM, Diana Clarke wrote:
Don't forget that the current "no optimizations" setting actually does peephole optimizations. Are we considering addressing https://bugs.python.org/issue2506 to make a really truly "no optimizations" option?

--Ned.
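To see what Ned means, even an explicit optimize=0 still constant-folds via the peephole optimizer. A quick check (output from a 3.x build; exact bytecode varies by version):

    >>> import dis
    >>> dis.dis(compile("x = 1 + 2", "<example>", "exec", optimize=0))
      1           0 LOAD_CONST               0 (3)
                  2 STORE_NAME               0 (x)
                  4 LOAD_CONST               1 (None)
                  6 RETURN_VALUE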

I suppose anything is possible ;) Perhaps I'll try my hand at that next. But no, I'm limiting the scope to the existing toggles only (docstrings, __debug__, assert) for this pass.

I am aware of that thread though. I read it a few weeks back when I was initially researching the existing implementation and history.

Happy Friday, folks!

--diana

On Fri, Sep 29, 2017 at 6:47 AM, Ned Batchelder <ned@nedbatchelder.com> wrote:
