Hi folks:
I was recently looking for an entry-level cpython task to work on in my spare time and plucked this off of someone's TODO list.
"Make optimizations more fine-grained than just -O and -OO"
There are currently three supported optimization levels (0, 1, and 2). Briefly summarized, they do the following.
0: no optimizations
1: remove assert statements and __debug__ blocks
2: remove docstrings, assert statements, and __debug__ blocks
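Those semantics are easy to check directly with the builtin compile(), which already accepts an optimize argument (just a sanity check, not part of the proposal):

```python
src = 'def f():\n    """doc"""\n    assert False\n'

ns0, ns2 = {}, {}
exec(compile(src, "<ex>", "exec", optimize=0), ns0)  # keep everything
exec(compile(src, "<ex>", "exec", optimize=2), ns2)  # strip docstrings + asserts

assert ns0["f"].__doc__ == "doc"
assert ns2["f"].__doc__ is None   # docstring removed at level 2

ns2["f"]()                        # the assert was compiled away, so no error
try:
    ns0["f"]()                    # the assert is still present at level 0
except AssertionError:
    pass
```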
From what I gather, their use-case is assert statements in production
code. More specifically, they want to be able to optimize away docstrings, but keep the assert statements, which currently isn't possible with the existing optimization levels.
As a first baby-step, I considered just adding a new optimization level 3 that keeps asserts but continues to remove docstrings and __debug__ blocks.
3: remove docstrings and __debug__ blocks
From a command-line perspective, there is already support for
additional optimization levels. That is, without making any changes, the optimization level will increase with the number of Os provided.
$ python -c "import sys; print(sys.flags.optimize)"
0
$ python -OO -c "import sys; print(sys.flags.optimize)"
2
$ python -OOOOOOO -c "import sys; print(sys.flags.optimize)"
7
And the PYTHONOPTIMIZE environment variable will happily assign something like 42 to sys.flags.optimize.
$ unset PYTHONOPTIMIZE
$ python -c "import sys; print(sys.flags.optimize)"
0
$ export PYTHONOPTIMIZE=2
$ python -c "import sys; print(sys.flags.optimize)"
2
$ export PYTHONOPTIMIZE=42
$ python -c "import sys; print(sys.flags.optimize)"
42
Finally, the resulting __pycache__ folder also already contains the expected bytecode files for the new optimization levels ( __init__.cpython-37.opt-42.pyc was created for optimization level 42, for example).
$ tree
.
└── test
    ├── __init__.py
    └── __pycache__
        ├── __init__.cpython-37.opt-1.pyc
        ├── __init__.cpython-37.opt-2.pyc
        ├── __init__.cpython-37.opt-42.pyc
        ├── __init__.cpython-37.opt-7.pyc
        └── __init__.cpython-37.pyc
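For reference, the opt-tag portion of those filenames can be computed with importlib (the exact cpython-XY part of the name depends on the interpreter running the snippet):

```python
from importlib.util import cache_from_source

# The optimization parameter (added in 3.5) becomes the "opt-" suffix.
path = cache_from_source("test/__init__.py", optimization=42)
print(path)  # e.g. test/__pycache__/__init__.cpython-37.opt-42.pyc
```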
Adding optimization level 3 is an easy change to make. Here's that quick proof of concept (minus changes to the docs, etc). I've also attached that diff as 3.diff.
https://github.com/dianaclarke/cpython/commit/4bd7278d87bd762b2989178e5bfed3...
I was initially looking for a more elegant solution that allowed you to specify exactly which optimizations you wanted, and when I floated this naive ("level 3") approach off-list to a few core developers, their feedback confirmed my hunch (too hacky).
So for my second pass at this task, I started with the following two-pronged approach.
1) Changed the various compile signatures to accept a set of string optimization flags rather than an int value.
2) Added a new command line option -N that allows you to specify any number of individual optimization flags.
For example:
python -N nodebug -N noassert -N nodocstring
The existing optimization options (-O and -OO) still exist in this approach, but they are mapped to the new optimization flags ("nodebug", "noassert", "nodocstring").
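The mapping itself is small; a minimal sketch (flag names as proposed above, function name hypothetical):

```python
def flags_from_optimize(optimize):
    """Map a legacy -O/-OO level to the proposed set of string flags."""
    flags = set()
    if optimize >= 1:
        flags |= {"nodebug", "noassert"}
    if optimize >= 2:
        flags.add("nodocstring")
    return flags

# -O  -> {"nodebug", "noassert"}
# -OO -> {"nodebug", "noassert", "nodocstring"}
```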
With the exception of the builtin compile() function, all underlying compile functions would only accept optimization flags going forward; the builtin compile() function would accept either an integer optimize value or a set of optimization flags for backwards compatibility.
You can find that work-in-progress approach here on github (also attached as N.diff).
https://github.com/dianaclarke/cpython/commit/3e36cea1fc8ee6f4cdc584851e4c1e...
All in all, that approach is going fairly well, but there's a lot of work remaining, and that diff is already getting quite large (for my new-contributor status).
Note for example, that I haven't yet tackled adding bytecode files to __pycache__ that reflect these new optimization flags. Something like:
$ tree
.
└── test
    ├── __init__.py
    └── __pycache__
        ├── __init__.cpython-37.opt-nodebug-noassert.pyc
        ├── __init__.cpython-37.opt-nodebug-nodocstring.pyc
        ├── __init__.cpython-37.opt-nodebug-noassert-nodocstring.pyc
        └── __init__.cpython-37.pyc
I'm also not certain if the various compile signatures are even open for change (int optimize => PyObject *optimizations), or if that's a no-no.
And there are still a ton of references to "-O", "-OO", "sys.flags.optimize", "Py_OptimizeFlag", "PYTHONOPTIMIZE", "optimize", etc that all need to be audited and their implications considered.
I've really enjoyed this task and I'm learning a lot about the C API, but I think this is a good place to stop and solicit feedback and direction.
My gut says that the amount of churn and resulting risk is too high to continue down this path, but I would love to hear thoughts from others (alternate approaches, ways to limit scope, confirmation that the existing approach is too entrenched for change, etc).
Regardless, I think the following subset change could merge without any bigger-picture changes, as it just adds test coverage for a case not yet covered. I can reopen that pull request once I clean up the commit message a bit (I closed it in the meantime).
https://github.com/python/cpython/pull/3450/commits/bfdab955a94a7fef431548f3...
Thanks for your time!
Cheers,
--diana
On Thu, 28 Sep 2017 12:48:15 -0600 Diana Clarke diana.joan.clarke@gmail.com wrote:
2) Added a new command line option N that allows you to specify
any number of individual optimization flags.
For example: python -N nodebug -N noassert -N nodocstring
We could instead reuse the existing -X option, which allows for free-form implementation-specific flags.
I'm also not certain if the various compile signatures are even open for change (int optimize => PyObject *optimizations), or if that's a no-no.
You probably want to keep the existing signatures for compatibility: - in C, add new APIs with the new convention - in Python, add a new (optional) function argument for the new convention
Regards
Antoine.
On 29 September 2017 at 05:02, Antoine Pitrou solipsis@pitrou.net wrote:
On Thu, 28 Sep 2017 12:48:15 -0600 Diana Clarke diana.joan.clarke@gmail.com wrote:
2) Added a new command line option N that allows you to specify
any number of individual optimization flags.
For example: python -N nodebug -N noassert -N nodocstring
We could instead reuse the existing -X option, which allows for free-form implementation-specific flags.
And declaring named optimisation flags to be implementation dependent is likely a good way to go.
The one downside is that it would mean there was no formally interpreter independent way of requesting the "nodebug,nodocstring" configuration, but informal conventions around particular uses of "-X" may be sufficient for that purpose.
I'm also not certain if the various compile signatures are even open for change (int optimize => PyObject *optimizations), or if that's a no-no.
You probably want to keep the existing signatures for compatibility:
- in C, add new APIs with the new convention
- in Python, add a new (optional) function argument for the new convention
This approach should also reduce the overall amount of code churn, since any CPython (or external) code currently passing "optimize=-1" won't need to change at all: that already says "get the optimization settings from the interpreter state", so it will pick up any changes to how that configuration works "for free".
That said, we may also want to consider a couple of other options related to changing the meaning of *existing* parameters to these APIs:
1. We have the PyCompilerFlags struct that's currently only used to pass around feature flags for the __future__ module. It could gain a second bitfield for optimisation options.
2. We could reinterpret "optimize" as a bitfield instead of a regular integer, special casing the already defined values:
- all zero: no optimizations
- sign bit set: negative -> use global settings
- 0x0001: nodebug+noassert
- 0x0002: nodebug+noassert+nodocstrings
- 0x0004: nodebug
- 0x0008: noassert
- 0x0010: nodocstrings
The "redefine optimizations as a bitfield" approach seems particularly promising to me - it's a full integer, so even with all negative numbers disallowed and the two low order bits reserved for the legacy combinations, that's still 29 different optimisation flags given 32-bit integers. We currently have 3, so that's room for an 866% increase in the number of defined flags :)
The opt-N values in pyc files would be somewhat cryptic-to-humans, but still relatively easy to translate back to readable strings given the bitfield values, and common patterns (like 0x14 -> 20 for nodebug+nodocstrings) would likely become familiar pretty quickly.
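A sketch of that translation (constant and function names hypothetical, bit values taken from the list above):

```python
NODEBUG, NOASSERT, NODOCSTRINGS = 0x0004, 0x0008, 0x0010
_LEGACY = {1: NODEBUG | NOASSERT, 2: NODEBUG | NOASSERT | NODOCSTRINGS}

def decode_optimize(value):
    """Translate the reinterpreted "optimize" int into readable flag names."""
    if value < 0:
        return None                    # negative -> use global settings
    value = _LEGACY.get(value, value)  # special-case the legacy levels
    names = []
    for bit, name in [(NODEBUG, "nodebug"), (NOASSERT, "noassert"),
                      (NODOCSTRINGS, "nodocstrings")]:
        if value & bit:
            names.append(name)
    return names

# 0x14 (20) -> ["nodebug", "nodocstrings"]
```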
Cheers, Nick.
Oh, I like this idea!
I had very briefly considered treating the existing flag as a bitfield, but then promptly forgot to explore that line of thought further.
I'll play with that approach next week, see where it takes me, and then report back.
Thanks so much for taking the time to think this through with me – much appreciated.
Cheers,
--diana
On Fri, Sep 29, 2017 at 1:33 AM, Nick Coghlan ncoghlan@gmail.com wrote:
- We could reinterpret "optimize" as a bitfield instead of a regular
integer, special casing the already defined values:
- all zero: no optimizations
- sign bit set: negative -> use global settings
- 0x0001: nodebug+noassert
- 0x0002: nodebug+noassert+nodocstrings
- 0x0004: nodebug
- 0x0008: noassert
- 0x0010: nodocstrings
The "redefine optimizations as a bitfield" approach seems particularly promising to me - it's a full integer, so even with all negative numbers disallowed and the two low order bits reserved for the legacy combinations, that's still 29 different optimisation flags given 32-bit integers. We currently have 3, so that's room for an 866% increase in the number of defined flags :)
In the meantime, I've re-opened the following pull request that can be merged independently of these changes (it's just additional test coverage).
trivial: add test coverage for the __debug__ case (optimization levels) https://github.com/python/cpython/pull/3450
Please let me know if I should create a bpo for it, if the commit message is too long, or if you think I should otherwise change the patch in any way.
As always, thanks for your time folks!
--diana
On Fri, Sep 29, 2017 at 2:24 PM, Diana Clarke diana.joan.clarke@gmail.com wrote:
Oh, I like this idea!
I had very briefly considered treating the existing flag as a bitfield, but then promptly forgot to explore that line of thought further.
I'll play with that approach next week, see where it takes me, and then report back.
Thanks so much for taking the time to think this through with me – much appreciated.
Cheers,
--diana
On Fri, Sep 29, 2017 at 1:33 AM, Nick Coghlan ncoghlan@gmail.com wrote:
- We could reinterpret "optimize" as a bitfield instead of a regular
integer, special casing the already defined values:
- all zero: no optimizations
- sign bit set: negative -> use global settings
- 0x0001: nodebug+noassert
- 0x0002: nodebug+noassert+nodocstrings
- 0x0004: nodebug
- 0x0008: noassert
- 0x0010: nodocstrings
The "redefine optimizations as a bitfield" approach seems particularly promising to me - it's a full integer, so even with all negative numbers disallowed and the two low order bits reserved for the legacy combinations, that's still 29 different optimisation flags given 32-bit integers. We currently have 3, so that's room for an 866% increase in the number of defined flags :)
On 9/30/2017 4:36 PM, Diana Clarke wrote:
In the meantime, I've re-opened the following pull request that can be merged independently of these changes (it's just additional test coverage).
trivial: add test coverage for the __debug__ case (optimization levels) https://github.com/python/cpython/pull/3450
Please let me know if I should create a bpo for it, if the commit message is too long, or if you think I should otherwise change the patch in any way.
Your patch is substantial, well beyond trivial. Please open an issue with the page-long description as the first message, add a news blurb, and create a much shorter commit message.
On Fri, 29 Sep 2017 17:33:11 +1000 Nick Coghlan ncoghlan@gmail.com wrote:
That said, we may also want to consider a couple of other options related to changing the meaning of *existing* parameters to these APIs:
- We have the PyCompilerFlags struct that's currently only used to
pass around feature flags for the __future__ module. It could gain a second bitfield for optimisation options
Not sure about that. PyCompilerFlags describes options that should be common to all implementations (since __future__ is part of the language spec).
- We could reinterpret "optimize" as a bitfield instead of a regular
integer, special casing the already defined values:
- all zero: no optimizations
- sign bit set: negative -> use global settings
- 0x0001: nodebug+noassert
- 0x0002: nodebug+noassert+nodocstrings
- 0x0004: nodebug
- 0x0008: noassert
- 0x0010: nodocstrings
Well, this is not really a bitfield, but a bitfield plus some irregular hardcoded values. Therefore I don't think it brings much in the way of discoverability / understandability.
That said, perhaps it makes implementation easier on the C side...
Regards
Antoine.
On 1 October 2017 at 22:19, Antoine Pitrou solipsis@pitrou.net wrote:
- We could reinterpret "optimize" as a bitfield instead of a regular
integer, special casing the already defined values:
- all zero: no optimizations
- sign bit set: negative -> use global settings
- 0x0001: nodebug+noassert
- 0x0002: nodebug+noassert+nodocstrings
- 0x0004: nodebug
- 0x0008: noassert
- 0x0010: nodocstrings
Well, this is not really a bitfield, but a bitfield plus some irregular hardcoded values. Therefore I don't think it brings much in the way of discoverability / understandability.
That's why the 2-field struct for compiler flags was my first idea.
That said, perhaps it makes implementation easier on the C side...
Yep, the fact it would avoid requiring any ABI changes for the C API is the main reason I think redefining the semantics of the existing int parameter is worth considering.
Cheers, Nick.
On Sun, Oct 1, 2017 at 6:19 AM, Antoine Pitrou solipsis@pitrou.net wrote:
Well, this is not really a bitfield, but a bitfield plus some irregular hardcoded values. Therefore I don't think it brings much in the way of discoverability / understandability.
That said, perhaps it makes implementation easier on the C side...
I think I'm coming to the same conclusion: using bitwise operations for the optimization levels seems to just boil down to a more cryptic version of the simple "level 3" solution, with public-facing impacts to the pycache and existing interfaces etc that I don't think are worth it in this case.
My only other thought at the moment, would be to use the existing -X option to achieve something similar to what I did with the new -N option, but then just quickly map that back to an integer under the hood. That is, "-X opt-nodebug -X opt-noassert" would just become "level 3" internally so that the various interfaces wouldn't have to change.
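-X values land in sys._xoptions as a dict (e.g. "python -X opt-nodebug" gives {'opt-nodebug': True}), so the extraction half of that mapping could look like this hypothetical sketch:

```python
def flags_from_xoptions(xoptions):
    """Pull the opt-* entries out of a sys._xoptions-style dict."""
    return {name[len("opt-"):] for name in xoptions if name.startswith("opt-")}

# {"opt-nodebug": True, "opt-noassert": True} -> {"nodebug", "noassert"}
```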
But there are lots of downsides to that solution too:
- having to hardcode the various possible combinations of string options to an integer value
- inelegant lookups like: if flag is greater than 2 but not 10 or 15, etc
- un-zen: yet even more ways to set that integer flag (PYTHONOPTIMIZE, -OOO, "-X opt-nodebug -X opt-noassert")
- mixed-bag -X options are less discoverable than just adding a new command line option (like -N or -OOO)
- other downsides, I'm sure
Hmmm.... stalled again, I think.
--diana
On Tue, 3 Oct 2017 09:42:40 -0600 Diana Clarke diana.joan.clarke@gmail.com wrote:
- mixed-bag -X options are less discoverable than just adding a
new command line option (like -N or -OOO)
For such a feature, I think being less discoverable is not really a problem. I don't think many people use the -O flags currently, and among those that do I'm curious how many really benefit (as opposed to seeing that Python has an "optimization" flag and thinking "great, I'm gonna use that to make my code faster" without ever measuring the difference).
Regards
Antoine.
On 4 October 2017 at 01:42, Diana Clarke diana.joan.clarke@gmail.com wrote:
On Sun, Oct 1, 2017 at 6:19 AM, Antoine Pitrou solipsis@pitrou.net wrote:
Well, this is not really a bitfield, but a bitfield plus some irregular hardcoded values. Therefore I don't think it brings much in the way of discoverability / understandability.
That said, perhaps it makes implementation easier on the C side...
I think I'm coming to the same conclusion: using bitwise operations for the optimization levels seems to just boil down to a more cryptic version of the simple "level 3" solution, with public-facing impacts to the pycache and existing interfaces etc that I don't think are worth it in this case.
My only other thought at the moment, would be to use the existing -X option to achieve something similar to what I did with the new -N option, but then just quickly map that back to an integer under the hood. That is, "-X opt-nodebug -X opt-noassert" would just become "level 3" internally so that the various interfaces wouldn't have to change.
Sorry, I don't think I was entirely clear as to what my suggestion actually was:
* Switch to your suggested "set-of-strings" API at the Python level, with the Python level integer interface retained only for backwards compatibility
* Keep the current integer-based *C* optimization API, but redefine the way that value is interpreted, rather than passing Python sets around
The Python APIs would then convert the Python level sets to the bitfield representation almost immediately for internal use, but you wouldn't need to mess about with the bitfield yourself when calling the Python APIs.
The difference I see relates to the fact that in Python:
* sets of strings are easier to work with than integer bitfields
* adding a new keyword-only argument to existing APIs is straightforward
While in C:
* integer bitfields are easier to work with than Python sets of Python strings
* supporting a new argument would mean defining a whole new parallel set of APIs
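The boundary conversion described here could be as small as the following sketch (names hypothetical, bit values from the earlier proposal):

```python
_BITS = {"nodebug": 0x0004, "noassert": 0x0008, "nodocstrings": 0x0010}

def optimizations_to_int(optimizations):
    """Convert a Python-level set of flag names to the C-level bitfield."""
    value = 0
    for name in optimizations:
        value |= _BITS[name]
    return value
```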
Cheers, Nick.
Thanks, Nick!
I'll let this sink in today and give it a shot tomorrow.
Have a great weekend,
--diana
- Switch to your suggested "set-of-strings" API at the Python level,
with the Python level integer interface retained only for backwards compatibility
- Keep the current integer-based *C* optimization API, but redefine
the way that value is interpreted, rather than passing Python sets around
The Python APIs would then convert the Python level sets to the bitfield representation almost immediately for internal use, but you wouldn't need to mess about with the bitfield yourself when calling the Python APIs.
The difference I see relates to the fact that in Python:
- sets of strings are easier to work with than integer bitfields
- adding a new keyword-only argument to existing APIs is straightforward
While in C:
- integer bitfields are easier to work with than Python sets of Python strings
- supporting a new argument would mean defining a whole new parallel set of APIs
2) Added a new command line option N that allows you to specify
any number of individual optimization flags.
For example: python -N nodebug -N noassert -N nodocstring
You may want to look at my PEP 511 which proposes to add a new "-o" option to specify a list of optimizations: https://www.python.org/dev/peps/pep-0511/#changes
The PEP proposes to add a new sys.implementation.optim_tag which is used to generate the .pyc filename.
Victor
Yup. I referenced your pep a few times in a previous off-list email, but I omitted that paragraph from this pass because I was using it to bolster the previous "level 3" idea (which didn't fly).
""" This simple approach to new optimization levels also appears to be inline with the direction Victor Stinner is going in for PEP 511 - "API for code transformers" [1]. More specifically, in the "Optimizer tag" section [2] of that PEP Victor proposes adding a new -o OPTIM_TAG command line option that defaults to "opt" for the existing optimizations, but would also let you to swap in custom bytecode transformers (like "fat" in his examples [3]). Assuming I understood that correctly ;)
os.cpython-36.fat-0.pyc
os.cpython-36.fat-1.pyc
os.cpython-36.fat-2.pyc
[1] https://www.python.org/dev/peps/pep-0511/
[2] https://www.python.org/dev/peps/pep-0511/#optimizer-tag
[3] https://www.python.org/dev/peps/pep-0511/#examples
"""
Thanks for taking the time to respond (you too Antoine).
Cheers,
--diana
On Thu, Sep 28, 2017 at 4:09 PM, Victor Stinner victor.stinner@gmail.com wrote:
2) Added a new command line option N that allows you to specify
any number of individual optimization flags.
For example: python -N nodebug -N noassert -N nodocstring
You may want to look at my PEP 511 which proposes to add a new "-o" option to specify a list of optimizations: https://www.python.org/dev/peps/pep-0511/#changes
The PEP proposes to add a new sys.implementation.optim_tag which is used to generate the .pyc filename.
Victor
Perhaps I should be a bit clearer.
When I said the "level 3" approach "appears to be in line with the direction Victor Stinner is going in for PEP 511", it was mostly at a superficial level. Meaning:
- PEP 511 still appears to use integer (unnamed) optimization levels for alternate transformers (fat 0, 1, and 2). I assumed (perhaps incorrectly) that you could provide a list of transformers ("opt,fat,bar") but that each transformer would still contain a number of different off/on toggles, arbitrarily identified as integer flags like 0, 1, and 2. I should go back and read that PEP again. I don't recall seeing where the 0, 1, and 2 came from in the fat examples.
os.cpython-36.fat-0.pyc
os.cpython-36.fat-1.pyc
os.cpython-36.fat-2.pyc
- Secondly, I reviewed PEP 511 when I initially started working on the naive "level 3" approach to make sure what I proposed didn't impede the progress of PEP 511 (or more realistically make my attempt obsolete). Since PEP 511 didn't seem to deviate much from the current integer flags (aside from allowing multiple different named sets of integer flags), I figured that whatever approach PEP 511 took with the existing optimization levels (0, 1, and 2) would presumably also work for a new level 3.
I hope that makes sense... If not, let me know & I'll try again tomorrow to be clearer.
PS. I think it sounds like I'm now re-advocating for the simple "level 3" approach. I'm not – just trying to explain my earlier thought process. I'm open to all kinds of feedback & suggestions.
Thanks again folks!
Cheers,
--diana
On 9/28/17 2:48 PM, Diana Clarke wrote:
Hi folks:
I was recently looking for an entry-level cpython task to work on in my spare time and plucked this off of someone's TODO list.
"Make optimizations more fine-grained than just -O and -OO"
There are currently three supported optimization levels (0, 1, and 2). Briefly summarized, they do the following.
0: no optimizations
1: remove assert statements and __debug__ blocks
2: remove docstrings, assert statements, and __debug__ blocks
Don't forget that the current "no optimizations" setting actually does peephole optimizations. Are we considering addressing https://bugs.python.org/issue2506 to make a really truly "no optimizations" option?
--Ned.
I suppose anything is possible ;) Perhaps I'll try my hand at that next.
But no, I'm limiting the scope to the existing toggles only (docstrings, __debug__, assert) for this pass.
I am aware of that thread though. I read it a few weeks back when I was initially researching the existing implementation and history.
Happy Friday, folks!
--diana
On Fri, Sep 29, 2017 at 6:47 AM, Ned Batchelder ned@nedbatchelder.com wrote:
Don't forget that the current "no optimizations" setting actually does peephole optimizations. Are we considering addressing https://bugs.python.org/issue2506 to make a really truly "no optimizations" option?