The Default for python -X frozen_modules.
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.

Possible solutions:

1. always default to "on" (the annoyance for contributors isn't big enough?)
2. default to "on" if it's a PGO build (and "off" otherwise)
3. default to "on" unless running from the source tree

Thoughts?

-eric

[1] https://bugs.python.org/issue45020
[2] FWIW, we may end up also freezing the modules imported for "python -m ...", along with some other commonly used modules (like argparse). That is a separate discussion.
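For readers who want to see which setting a given interpreter was started with, here is a minimal sketch. It relies on the CPython-specific `sys._xoptions` mapping; the helper name `frozen_modules_option` is hypothetical, and treating a bare "-X frozen_modules" as "on" is an assumption rather than something stated in this thread.

```python
import sys


def frozen_modules_option(xoptions=None):
    """Return the explicit -X frozen_modules setting, or None if it was not given.

    `xoptions` defaults to sys._xoptions, which is CPython-specific.
    """
    if xoptions is None:
        # Fall back to an empty mapping on implementations without _xoptions.
        xoptions = getattr(sys, "_xoptions", {})
    value = xoptions.get("frozen_modules")
    if value is None:
        # Option not passed on the command line: the build's default applies.
        return None
    if value is True:
        # A bare "-X frozen_modules" is stored as True; assumed here to mean "on".
        return "on"
    return value
```

Running "python -X frozen_modules=off" and calling `frozen_modules_option()` should report the explicit choice, while a plain "python" reports `None`, meaning whichever default this thread settles on.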
On Tue, Sep 28, 2021 at 2:58 AM Eric Snow <ericsnowcurrently@gmail.com> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
When exactly does the freezing happen? It's only a single data point, but when I'm tinkering with the stdlib itself, I'm always running from the source tree. So option 3 seems quite viable. ChrisA
On Mon, Sep 27, 2021 at 10:51 AM Eric Snow <ericsnowcurrently@gmail.com> wrote:
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
FWIW, I'm planning on doing (2) (and (3) if it isn't complicated). Mostly I wanted to verify my assumptions about the possible annoyance before getting too far. -eric
On Tue, Sep 28, 2021 at 3:21 AM Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Mon, Sep 27, 2021 at 11:09 AM Chris Angelico <rosuav@gmail.com> wrote:
When exactly does the freezing happen?
When you build the executable (e.g. "make -j8", ".\PCbuild\build.bat"). So your changes to those .py files wouldn't show up until then.
Ah, gotcha. Then I would say either #1 or #3 would be entirely acceptable to me, with preference to #3. Hmm. A thought: What happens if I run "make", then edit one of the files, then "make install"? Will it notice that the frozen version is out of date and rebuild it? ChrisA
On 9/27/2021 5:51 PM, Eric Snow wrote:
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise)
Just to air my concerns regarding option 2 (which I've already spoken to Eric about): I feel this option is completely orthogonal to whether PGO is used or not, and ought to be discoverable independently. Essentially, it should be its own configure-time option, and should be included somewhere in sysconfig.get_config_var(...). If we went with #2, there's no reliable way to detect whether profile-guided optimisations were used on all CPython builds, which means there'd be no reliable way to detect whether frozen modules are going to be enabled by default or not. Adding a separate option handles this case. (My overall preference is for #3, FWIW) Cheers, Steve
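To illustrate the discoverability concern, here is a rough sketch of what detection looks like today without a dedicated option. It inspects the `CONFIG_ARGS` config variable, which is typically only populated on builds made with `./configure`; the helper name `configured_with` is hypothetical. This is a heuristic, not a reliable check, which is the point being made above.

```python
import shlex
import sysconfig


def configured_with(flag, config_args=None):
    """Heuristically report whether `flag` was passed to ./configure.

    `config_args` defaults to sysconfig's CONFIG_ARGS, which may be missing
    (for example on Windows builds), in which case the answer is always False.
    """
    if config_args is None:
        config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    # CONFIG_ARGS is a shell-quoted string such as "'--enable-optimizations'".
    return flag in shlex.split(config_args)
```

A separate, explicitly named config variable for the frozen-modules default would avoid this kind of guessing.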
How about checking each non-frozen module's hash and comparing it to that of the frozen module? Would that defeat the performance improvement of freezing? Is it just a terrible idea?
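The hash-comparison idea raised above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names and the choice of SHA-256 are hypothetical, and nothing here reflects how CPython actually stores frozen modules.

```python
import hashlib


def source_digest(source: bytes) -> str:
    """Digest of a module's source, as it might be recorded at freeze time."""
    return hashlib.sha256(source).hexdigest()


def frozen_copy_is_stale(source: bytes, digest_at_freeze_time: str) -> bool:
    """Return True if the source on disk no longer matches the frozen copy."""
    return source_digest(source) != digest_at_freeze_time
```

Note that the check needs the current source bytes, so the interpreter would have to locate and read the .py file at import time. That is presumably the kind of I/O that freezing is meant to avoid, which is why the question about the performance cost matters.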
On Mon, Sep 27, 2021 at 10:40 AM Steve Dower <steve.dower@python.org> wrote:
On 9/27/2021 5:51 PM, Eric Snow wrote:
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise)
Just to air my concerns regarding option 2 (which I've already spoken to Eric about): I feel this option is completely orthogonal to whether PGO is used or not, and ought to be discoverable independently.
Essentially, it should be its own configure-time option, and should be included somewhere in sysconfig.get_config_var(...).
If we went with #2, there's no reliable way to detect whether profile-guided optimisations were used on all CPython builds, which means there'd be no reliable way to detect whether frozen modules are going to be enabled by default or not. Adding a separate option handles this case.
(My overall preference is for #3, FWIW)
When I proposed #2, I used "PGO" as a proxy for "best optimization mode". On UNIX, this is `./configure --enable-optimizations`, which doesn't mention PGO -- IIUC it turns on PGO and LTO, if they're available.

So my *actual* proposal (call it #2') is to use a separate compile-time flag, which is set by `./configure --enable-optimizations` regardless of whether PGO/LTO are possible, and which on Windows can be set by `PCbuild\build.bat --pgo` (we could add another flag to disable it, but who'd want to?).

--
--Guido van Rossum (python.org/~guido)
On 9/27/21 10:50 AM, Guido van Rossum wrote:
So my *actual* proposal (call it #2') is to use a separate compile-time flag, which is set by `./configure --enable-optimizations` regardless of whether PGO/LTO are possible, and which on Windows can be set by `PCbuild\build.bat --pgo` (we could add another flag to disable it, but who'd want to?).
I think a configure-time flag is the way to go, and I'm happy to have it included with --enable-optimizations. -- ~Ethan~
On 9/27/2021 7:15 PM, Ethan Furman wrote:
On 9/27/21 10:50 AM, Guido van Rossum wrote:
So my *actual* proposal (call it #2') is to use a separate compile-time flag, which is set by `./configure --enable-optimizations` regardless of whether PGO/LTO are possible, and which on Windows can be set by `PCbuild\build.bat --pgo` (we could add another flag to disable it, but who'd want to?).
I think a configure-time flag is the way to go, and I'm happy to have it included with --enable-optimizations.
Having it be implied by an "--enable-optimizations" option is totally fine (and we'd add one to build.bat for this), but I still think it needs to be discoverable later whether the frozen modules build option was used or not, independent of other build options. Cheers, Steve
On Mon, Sep 27, 2021 at 12:40 PM Steve Dower <steve.dower@python.org> wrote:
Having it be implied by an "--enable-optimizations" option is totally fine (and we'd add one to build.bat for this), but I still think it needs to be discoverable later whether the frozen modules build option was used or not, independent of other build options.
That's reasonable. -eric
On Mon, Sep 27, 2021 at 9:54 AM Eric Snow <ericsnowcurrently@gmail.com> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
What about opting out when `--with-pydebug` is used? I'm not sure how many people actively develop in a non-debug build other than testing something, but at that point I would be having to run `make` probably anyway for whatever I'm mucking with if it's *that* influenced by a debug build.
A couple of questions.

If you’re planning a runtime -X option, then does that mean that the modules will be frozen at build time but Python will decide at runtime whether to use the frozen modules or the unfrozen ones?

Are you planning on including the currently frozen importlib modules in that same mechanism?

Will `make test` and/or CI run Python with both options? How will we make sure that frozen modules (or not) don’t break Python?

Option #3 seems like the most reasonable one to me, with the ability to turn it on when running from the source tree.

-Barry
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4ESW3NNO...
Code of Conduct: http://python.org/psf/codeofconduct/
Hi Eric, Which stdlib modules are currently frozen? If I really want to hack site.py or os.py for whatever reason, I just have to use "python3 -X frozen_modules=off"?
1. always default to "on" (the annoyance for contributors isn't big enough?)
What is the annoyance? What is different between frozen and not frozen?

Victor
-- Night gathers, and now my watch begins. It shall not end until my death.
On Mon, Sep 27, 2021 at 2:59 PM Brett Cannon <brett@python.org> wrote:
What about opting out when `--with-pydebug` is used? I'm not sure how many people actively develop in a non-debug build other than testing something, but at that point I would be having to run `make` probably anyway for whatever I'm mucking with if it's that influenced by a debug build.
Yeah, that's an option too. -eric
On Mon, Sep 27, 2021 at 3:04 PM Barry Warsaw <barry@python.org> wrote:
If you’re planning a runtime -X option, then does that mean that the modules will be frozen at build time but Python will decide at runtime whether to use the frozen modules or the unfrozen ones?
Correct. FYI, this was already done.
Are you planning on including the currently frozen importlib modules in that same mechanism?
No. They must always be frozen. See is_essential_frozen_module() in Python/import.c.
Will `make test` and/or CI run Python with both options? How will we make sure that frozen modules (or not) don’t break Python?
If "configure --enable-optimizations" always sets the default to "on" and the default is "off" otherwise, then the PGO buildbots will exercise the frozen path. Likewise if "--with-pydebug" (or in-source-tree) makes the default "off" and otherwise it's "on". Without a build-time option already handled by one of the buildbots, we'd need to either add a dedicated buildbot or run it both ways (like we do with importlib). I expect that won't be necessary.
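As a rough illustration of "run it both ways", the sketch below launches a child interpreter with an explicit "-X frozen_modules" value and returns its output. The helper name `run_with_frozen_modules` is hypothetical, and this is not how the buildbots or the test suite are actually configured.

```python
import subprocess
import sys


def run_with_frozen_modules(setting, code):
    """Run `code` in a child interpreter started with -X frozen_modules=<setting>.

    `setting` is expected to be "on" or "off". Returns the child's stripped stdout
    and raises CalledProcessError if the child exits with a non-zero status.
    """
    result = subprocess.run(
        [sys.executable, "-X", f"frozen_modules={setting}", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

A test driver could call this once with "on" and once with "off" for the same snippet and compare the results, which is one way to exercise both import paths from a single build.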
Option #3 seems like the most reasonable one to me, with the ability to turn it on when running from the source tree.
It's definitely the one that fits most naturally for me. -eric
On Mon, Sep 27, 2021 at 3:31 PM Victor Stinner <vstinner@python.org> wrote:
Which stdlib modules are currently frozen? If I really want to hack site.py or os.py for whatever reason, I just have to use "python3 -X frozen_modules=off"?
The single-source-of-truth is Tools/scripts/freeze_modules.py. After running "make regen-frozen" you'll find a cleaner list in Python/frozen_modules/MANIFEST. You can also look at the generated code in Makefile.pre.in or Python/frozen.c. Finally, you can run "./python -X frozen_modules=on -c 'import _imp; print(_imp._frozen_module_names())'"
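The one-liner above can also be wrapped in a small helper. This sketch assumes only what the message states, namely that the private `_imp._frozen_module_names()` function exists on builds that include this work; on other interpreters it returns an empty list rather than failing. The wrapper name `frozen_module_names` is hypothetical.

```python
try:
    import _imp  # private CPython module; may be missing on other implementations
except ImportError:
    _imp = None


def frozen_module_names():
    """Return a sorted list of frozen module names, or [] if unavailable."""
    getter = getattr(_imp, "_frozen_module_names", None)
    if getter is None:
        # Interpreter without the private helper.
        return []
    return sorted(getter())
```

Because `_imp` is a private module, the result should be treated as a debugging aid rather than a stable interface.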
1. always default to "on" (the annoyance for contributors isn't big enough?)
What is the annoyance?
The annoyance of changes to the .py files not getting used (at least not until after running "make all").
What is different between frozen and not frozen?
They have a different loader and repr. Also, frozen modules do not have __file__ set (and __path__ is always []). -eric
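The differences listed above can be inspected from Python. The following sketch reports them for any module object; the helper name `describe_module` is hypothetical, and it assumes that frozen modules are identifiable by a spec origin of "frozen". Whether `__file__` is present may vary between versions, so the helper only reports what it finds.

```python
import importlib.machinery
import types


def describe_module(module):
    """Summarize how a module was loaded, without assuming it is frozen."""
    spec = getattr(module, "__spec__", None)
    return {
        # Assumption: the frozen importer marks its specs with origin "frozen".
        "frozen": getattr(spec, "origin", None) == "frozen",
        "loader": getattr(spec, "loader", None),
        "has_file": hasattr(module, "__file__"),
        "repr": repr(module),
    }
```

Calling `describe_module(os)` under "-X frozen_modules=on" and again under "-X frozen_modules=off" is a quick way to compare the two import paths.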
On Tue, Sep 28, 2021 at 12:52 AM Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Mon, Sep 27, 2021 at 3:31 PM Victor Stinner <vstinner@python.org> wrote:
Which stdlib modules are currently frozen? If I really want to hack site.py or os.py for whatever reason, I just have to use "python3 -X frozen_modules=off"?
The single-source-of-truth is Tools/scripts/freeze_modules.py. After running "make regen-frozen" you'll find a cleaner list in Python/frozen_modules/MANIFEST. You can also look at the generated code in Makefile.pre.in or Python/frozen.c. Finally, you can run "./python -X frozen_modules=on -c 'import _imp; print(_imp._frozen_module_names())'"
Ok, so compared to Python 3.10, the following 13 stdlib modules can now be frozen:

* _collections_abc
* _sitebuiltins
* abc
* codecs
* genericpath
* io
* ntpath
* os
* os.path
* posixpath
* site
* stat
* zipimport

(I tested on Linux. I guess that the list is the same on other operating systems.)

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
On Mon, 27 Sep 2021 10:51:43 -0600 Eric Snow <ericsnowcurrently@gmail.com> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
My vote is on #3 to minimize contributor annoyance and eventual puzzlement. Regards Antoine.
On 27.09.2021 18:51, Eric Snow wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
#3 sounds like a good solution, but how would you detect "running from the source tree" ? This sounds like you need another stat call somewhere, which is what the frozen modules try to avoid. I'd like to suggest adding an environment variable to enable / disable the setting instead. This makes it easy to customize the behavior without introducing complicated logic. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 28 2021)
Python Projects, Coaching and Support ... https://www.egenix.com/ Python Product Development ... https://consulting.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 https://www.egenix.com/company/contact/ https://www.malemburg.com/
On 28.09.2021 10:22, Marc-Andre Lemburg wrote:
On 27.09.2021 18:51, Eric Snow wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
#3 sounds like a good solution, but how would you detect "running from the source tree" ? This sounds like you need another stat call somewhere, which is what the frozen modules try to avoid.
I'd like to suggest adding an environment variable to enable / disable the setting instead. This makes it easy to customize the behavior without introducing complicated logic.
Just to clarify: the modules would still always be frozen with the env var setting, but Python would simply not import them as frozen modules, but instead go and look on the PYTHONPATH for the modules. This could be achieved by special casing the frozen module finder function to only trigger on importlib modules and return NULL for all other possibly frozen modules.

--
Marc-Andre Lemburg
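The special-casing described above might look roughly like this in Python terms. Everything here is an assumption made for illustration: the environment variable name `DEMO_FROZEN_MODULES`, the default of "on", and the exact set of essential module names (the thread only points to is_essential_frozen_module() in Python/import.c for the real list).

```python
# Hypothetical environment variable name, used only for this sketch.
ENV_VAR = "DEMO_FROZEN_MODULES"

# Assumed set of modules needed to bootstrap the import system.
ESSENTIAL = frozenset({"_frozen_importlib", "_frozen_importlib_external", "zipimport"})


def use_frozen_copy(name, environ):
    """Decide whether the frozen finder should claim module `name`.

    Essential modules are always loaded frozen; everything else falls through
    to the normal path-based finders when the env var is set to "off".
    """
    if name in ESSENTIAL:
        return True
    return environ.get(ENV_VAR, "on") != "off"
```

In this sketch a frozen finder would return no spec whenever `use_frozen_copy()` is false, so the import falls through to the regular search path.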
On 28 Sep 2021, at 10:05, Antoine Pitrou <antoine@python.org> wrote:
On Mon, 27 Sep 2021 10:51:43 -0600 Eric Snow <ericsnowcurrently@gmail.com <mailto:ericsnowcurrently@gmail.com>> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
My vote is on #3 to minimize contributor annoyance and eventual puzzlement.
I agree, but… Most CPython tests are run while running from the source tree, that means that there will have to be testrunner configurations that run with “-X frozen_modules=on”. Ronald — Twitter / micro.blog: @ronaldoussoren Blog: https://blog.ronaldoussoren.net/
On Tue, 28 Sep 2021 10:51:53 +0200 Ronald Oussoren via Python-Dev <python-dev@python.org> wrote:
On 28 Sep 2021, at 10:05, Antoine Pitrou <antoine@python.org> wrote:
On Mon, 27 Sep 2021 10:51:43 -0600 Eric Snow <ericsnowcurrently@gmail.com <mailto:ericsnowcurrently@gmail.com>> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
My vote is on #3 to minimize contributor annoyance and eventual puzzlement.
I agree, but… Most CPython tests are run while running from the source tree, that means that there will have to be testrunner configurations that run with “-X frozen_modules=on”.
Well, multiplying CI configurations is the price of adding options in general. Regards Antoine.
On 28 Sep 2021, at 10:54, Antoine Pitrou <antoine@python.org> wrote:
On Tue, 28 Sep 2021 10:51:53 +0200 Ronald Oussoren via Python-Dev <python-dev@python.org> wrote:
On 28 Sep 2021, at 10:05, Antoine Pitrou <antoine@python.org> wrote:
On Mon, 27 Sep 2021 10:51:43 -0600 Eric Snow <ericsnowcurrently@gmail.com <mailto:ericsnowcurrently@gmail.com>> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
My vote is on #3 to minimize contributor annoyance and eventual puzzlement.
I agree, but… Most CPython tests are run while running from the source tree, that means that there will have to be testrunner configurations that run with “-X frozen_modules=on”.
Well, multiplying CI configurations is the price of adding options in general.
Of course. I mentioned it because the proposal is to add a new option that’s enabled after installation, and basically not when the testsuite is run. That’s not a problem, we could just enable the option in most CI jobs. Ronald
— Twitter / micro.blog: @ronaldoussoren Blog: https://blog.ronaldoussoren.net/
On Tue, 2021-09-28 at 10:22 +0200, Marc-Andre Lemburg wrote:
On 27.09.2021 18:51, Eric Snow wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
#3 sounds like a good solution, but how would you detect "running from the source tree" ? This sounds like you need another stat call somewhere, which is what the frozen modules try to avoid.
It does. FYI, here's the sysconfig implementation https://github.com/python/cpython/blob/main/Lib/sysconfig.py#L146-L181 But a more efficient way to do this could be added.
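For illustration, option #3 could be expressed in Python as below, using the existing `sysconfig.is_python_build()` check. The function name `default_frozen_modules` is hypothetical, and the real decision would have to be made much earlier during startup, so this is only a sketch of the policy and not of its implementation.

```python
import sysconfig


def default_frozen_modules(in_source_tree=None):
    """Return the default for -X frozen_modules under option #3.

    `in_source_tree` defaults to sysconfig.is_python_build(), which reports
    whether the interpreter is running from a source checkout.
    """
    if in_source_tree is None:
        in_source_tree = sysconfig.is_python_build()
    # Contributors running from the source tree get "off" so edits are picked up.
    return "off" if in_source_tree else "on"
```

As noted above, the existing check touches the filesystem, so a cheaper detection mechanism would probably be needed for this to pay off at startup.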
I'd like to suggest adding an environment variable to enable / disable the setting instead. This makes it easy to customize the behavior without introducing complicated logic.
From your followup reply, it seems like you are suggesting that it should be enabled by default, and use an env var to disable it. That would have the same problem regarding the annoyance of contributors. Is there any reason why you would prefer that over #2? That seems like the best option to me if #3 is not feasible. Cheers :) Filipe Laíns
On Mon, Sep 27, 2021 at 6:58 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?) 2. default to "on" if it's a PGO build (and "off" otherwise) 3. default to "on" unless running from the source tree
Thoughts?
Honestly, for me, #1: always on, is the most reasonable choice.

I dislike when Python behaves differently depending on subtle things like "was it built with optimizations" or "is Python started from its source tree"?

When I build Python without optimization and/or from its source tree, I do that to debug an issue. If the bug goes away in this case, it can waste my time.

So I prefer to teach everybody how to use "-X frozen_modules=off" if they want to hack the stdlib for their greatest pleasure. I prefer that such a special use case requires an opt-in option; the special use case is not special enough to be the default.

--

It means that the site module can no longer be "customized" by modifying directly the site.py file (inject a path in PYTHONPATH env var where the customized site.py lives). But there is already a supported way to customize the site module: create a module named "sitecustomize" or "usercustomize".

I recall that virtualenv likes to override stdlib site.py with its own code. tox uses virtualenv by default. Someone should check whether freezing site breaks virtualenv and tox, since they seem to be popular in Python. The venv module doesn't need to override site.py, and tox can use venv if I recall correctly.

If site.py customization is too popular, I would suggest not freezing this one until the community stops doing that.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
What is the annoyance? What is different between frozen and not frozen?
One interesting consequence of what Eric mentioned (They have a different loader and repr. Also, frozen modules do not have __file__ set (and __path__ is always []).) is that frozen modules don't have a `__file__` attribute IIRC and therefore tracebacks won't include the source.
Would it be possible to add a __file__ attribute?
Victor
On Tue, Sep 28, 2021 at 2:47 PM Pablo Galindo Salgado <pablogsal@gmail.com> wrote:
What is the annoyance? What is different between frozen and not frozen?
One interesting consequence of what Eric mentioned (They have a different loader and repr. Also, frozen modules do not have __file__ set (and __path__ is always []).) is that frozen modules don't have a `__file__` attribute IIRC and therefore tracebacks won't include the source.
On Mon, 27 Sept 2021 at 22:31, Victor Stinner <vstinner@python.org> wrote:
Hi Eric,
Which stdlib modules are currently frozen? If I really want to hack site.py or os.py for whatever reason, I just have to use "python3 -X frozen_modules=off"?
1. always default to "on" (the annoyance for contributors isn't big enough?)
What is the annoyance? What is different between frozen and not frozen?
Victor
On Mon, Sep 27, 2021 at 6:58 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?)
2. default to "on" if it's a PGO build (and "off" otherwise)
3. default to "on" unless running from the source tree
Thoughts?
-eric
[1] https://bugs.python.org/issue45020
[2] FWIW, we may end up also freezing the modules imported for "python -m ...", along with some other commonly used modules (like argparse). That is a separate discussion.
-- Night gathers, and now my watch begins. It shall not end until my death.
On 9/28/2021 8:36 AM, Victor Stinner wrote:
On Mon, Sep 27, 2021 at 6:58 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?)
2. default to "on" if it's a PGO build (and "off" otherwise)
3. default to "on" unless running from the source tree
Thoughts?
Honestly, for me, #1: always on, is the most reasonable choice.
I dislike it when Python behaves differently depending on subtle things like "was it built with optimizations?" or "is Python started from its source tree?".
When I build Python without optimization and/or from its source tree, I do that to debug an issue. If the bug goes away in this case, it can waste my time.
So I prefer to teach everybody how to use "-X frozen_modules=off" if they want to hack the stdlib for their greatest pleasure. I prefer that such special use case requires an opt-in option, the special use case is not special enough to be the default.
I agree with Victor here: I'd rather have #1.
As a compromise, how about go with #1, but print a warning if python detects that it's not built with optimizations or is run from a source tree (the conditions in #2 and #3)? The warning could suggest running with "-X frozen_modules=off". I realize that it will probably be ignored over time, but maybe it will provide enough of a reminder if someone is debugging and sees the warning.
Eric
On Tue, 28 Sep 2021 08:55:05 -0400 "Eric V. Smith" <eric@trueblade.com> wrote:
So I prefer to teach everybody how to use "-X frozen_modules=off" if they want to hack the stdlib for their greatest pleasure. I prefer that such special use case requires an opt-in option, the special use case is not special enough to be the default.
I agree with Victor here: I'd rather have #1.
As a compromise, how about go with #1, but print a warning if python detects that it's not built with optimizations or is run from a source tree (the conditions in #2 and #3)? The warning could suggest running with "-X frozen_modules=off". I realize that it will probably be ignored over time, but maybe it will provide enough of a reminder if someone is debugging and sees the warning.
What would be the point of printing a warning instead of doing just what the user is expecting?
Freezing the stdlib is a startup performance optimization. It doesn't need to be turned on when hacking on the Python source code... And having to type "-X frozen_modules=off" is much more of a nuisance than losing 10 ms in startup time (which you won't even notice in most cases).
Regards
Antoine.
On 9/28/2021 9:10 AM, Antoine Pitrou wrote:
On Tue, 28 Sep 2021 08:55:05 -0400 "Eric V. Smith" <eric@trueblade.com> wrote:
So I prefer to teach everybody how to use "-X frozen_modules=off" if they want to hack the stdlib for their greatest pleasure. I prefer that such special use case requires an opt-in option, the special use case is not special enough to be the default.
I agree with Victor here: I'd rather have #1.
As a compromise, how about go with #1, but print a warning if python detects that it's not built with optimizations or is run from a source tree (the conditions in #2 and #3)? The warning could suggest running with "-X frozen_modules=off". I realize that it will probably be ignored over time, but maybe it will provide enough of a reminder if someone is debugging and sees the warning.
What would be the point of printing a warning instead of doing just what the user is expecting?
To me, the point would be to get the same behavior no matter which python executable I run, and without regard to where I run it from.
Eric
Freezing the stdlib is a startup performance optimization. It doesn't need to be turned on when hacking on the Python source code... And having to type "-X frozen_modules=off" is much more of a nuisance than losing 10 ms in startup time (which you won't even notice in most cases).
Regards
Antoine.
On 28.09.2021 14:26, Filipe Laíns wrote:
On Tue, 2021-09-28 at 10:22 +0200, Marc-Andre Lemburg wrote:
On 27.09.2021 18:51, Eric Snow wrote:
We've frozen most of the stdlib modules imported during "python -c pass" [1][2], to make startup a bit faster. Import of those modules is controlled by "-X frozen_modules=[on|off]". Currently it defaults to "off" but we'd like to default to "on". The blocker is the impact on contributors. I expect many will make changes to a stdlib module and then puzzle over why those changes aren't getting used. That's an annoyance we can avoid, which is the point of this thread.
Possible solutions:
1. always default to "on" (the annoyance for contributors isn't big enough?)
2. default to "on" if it's a PGO build (and "off" otherwise)
3. default to "on" unless running from the source tree
Thoughts?
#3 sounds like a good solution, but how would you detect "running from the source tree" ? This sounds like you need another stat call somewhere, which is what the frozen modules try to avoid.
It does. FYI, here's the sysconfig implementation https://github.com/python/cpython/blob/main/Lib/sysconfig.py#L146-L181
But a more efficient way to do this could be added.
Thanks. So the Setup files are used as landmarks to determine whether Python is from a Python source tree or not.
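As a rough illustration of the landmark check described above, the sketch below tests whether a directory looks like a CPython source tree by looking for Setup files under Modules/. The helper name `looks_like_source_tree` is made up for this example, and the landmark names are an assumption based on the description in this thread; the linked sysconfig code is the authority. Each `os.path.isfile` call costs one stat, which is the overhead raised in the question about detection.

```python
import os

# Landmark files assumed for illustration; see the linked sysconfig.py
# for what CPython actually checks.
LANDMARKS = ("Setup", "Setup.local")


def looks_like_source_tree(directory):
    """Return True if `directory` contains a Modules/ landmark file.

    Hypothetical helper, not part of the stdlib. Each isfile() call
    costs one stat of the filesystem.
    """
    modules_dir = os.path.join(directory, "Modules")
    return any(
        os.path.isfile(os.path.join(modules_dir, name)) for name in LANDMARKS
    )
```

For a real check, `sysconfig.is_python_build()` is the public entry point; the sketch only makes the landmark idea and its stat cost visible.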
I'd like to suggest adding an environment variable to enable / disable the setting instead. This makes it easy to customize the behavior without introducing complicated logic.
From your followup reply, it seems like you are suggesting that it should be enabled by default, and use a env var to disable it. That would have the same problem regarding the annoyance of contributors.
Setting an env var in a dev environment is not really all that hard (I always have such environments set up for all the projects I'm working on), so it's only a mild annoyance, while not affecting all other times Python is run by others. The env var would also have the nice side effect of not cluttering up the command line and making it easy to test drive frozen vs. source code based imports.
Is there any reason why you would prefer that over #2? That seems like the best option to me if #3 is not feasible.
PGO optimizations have to be enabled with ./configure and are not enabled by default, so it's rather likely that many Python binaries out there do not have PGO enabled and would then not benefit from the frozen modules -- even though PGO is not required for supporting frozen modules. I don't think combining the two features is a good idea.
Cheers,
-- Marc-Andre Lemburg
On Tue, 28 Sep 2021 09:14:38 -0400 "Eric V. Smith" <eric@trueblade.com> wrote:
On 9/28/2021 9:10 AM, Antoine Pitrou wrote:
On Tue, 28 Sep 2021 08:55:05 -0400 "Eric V. Smith" <eric@trueblade.com> wrote:
So I prefer to teach everybody how to use "-X frozen_modules=off" if they want to hack the stdlib for their greatest pleasure. I prefer that such special use case requires an opt-in option, the special use case is not special enough to be the default.
I agree with Victor here: I'd rather have #1.
As a compromise, how about go with #1, but print a warning if python detects that it's not built with optimizations or is run from a source tree (the conditions in #2 and #3)? The warning could suggest running with "-X frozen_modules=off". I realize that it will probably be ignored over time, but maybe it will provide enough of a reminder if someone is debugging and sees the warning.
What would be the point of printing a warning instead of doing just what the user is expecting?
To me, the point would be to get the same behavior no matter which python executable I run, and without regard to where I run it from.
But why do you care about this? What does it change *concretely*?
On 9/28/2021 9:17 AM, Antoine Pitrou wrote:
On Tue, 28 Sep 2021 09:14:38 -0400 "Eric V. Smith" <eric@trueblade.com> wrote:
On 9/28/2021 9:10 AM, Antoine Pitrou wrote:
On Tue, 28 Sep 2021 08:55:05 -0400 "Eric V. Smith" <eric@trueblade.com> wrote:
So I prefer to teach everybody how to use "-X frozen_modules=off" if they want to hack the stdlib for their greatest pleasure. I prefer that such special use case requires an opt-in option, the special use case is not special enough to be the default.
I agree with Victor here: I'd rather have #1.
As a compromise, how about go with #1, but print a warning if python detects that it's not built with optimizations or is run from a source tree (the conditions in #2 and #3)? The warning could suggest running with "-X frozen_modules=off". I realize that it will probably be ignored over time, but maybe it will provide enough of a reminder if someone is debugging and sees the warning.
What would be the point of printing a warning instead of doing just what the user is expecting?
To me, the point would be to get the same behavior no matter which python executable I run, and without regard to where I run it from.
But why do you care about this? What does it change *concretely*?
It reduces the number of things I have to remember which are different based on where I'm running python (or which executable I'm running). Eric
The Python Debug Build document lists changes compared to a release build: https://docs.python.org/dev/using/configure.html#python-debug-build
Sometimes, I'm confused that "./python" (Python built locally in debug mode) displays warnings, whereas "python" (Fedora package) doesn't.
See also the Python Development Mode: https://docs.python.org/dev/library/devmode.html#devmode
The documentation starts with: "The Python Development Mode introduces additional runtime checks that are too expensive to be enabled by default."
Victor
On Tue, Sep 28, 2021 at 3:33 PM Eric V. Smith <eric@trueblade.com> wrote:
On 9/28/2021 9:17 AM, Antoine Pitrou wrote:
On Tue, 28 Sep 2021 09:14:38 -0400 "Eric V. Smith" <eric@trueblade.com> wrote:
On 9/28/2021 9:10 AM, Antoine Pitrou wrote:
On Tue, 28 Sep 2021 08:55:05 -0400 "Eric V. Smith" <eric@trueblade.com> wrote:
So I prefer to teach everybody how to use "-X frozen_modules=off" if they want to hack the stdlib for their greatest pleasure. I prefer that such special use case requires an opt-in option, the special use case is not special enough to be the default.
I agree with Victor here: I'd rather have #1.
As a compromise, how about go with #1, but print a warning if python detects that it's not built with optimizations or is run from a source tree (the conditions in #2 and #3)? The warning could suggest running with "-X frozen_modules=off". I realize that it will probably be ignored over time, but maybe it will provide enough of a reminder if someone is debugging and sees the warning.
What would be the point of printing a warning instead of doing just what the user is expecting?
To me, the point would be to get the same behavior no matter which python executable I run, and without regard to where I run it from.
But why do you care about this? What does it change *concretely*?
It reduces the number of things I have to remember which are different based on where I'm running python (or which executable I'm running).
Eric
-- Night gathers, and now my watch begins. It shall not end until my death.
On Tue, Sep 28, 2021 at 2:54 AM Ronald Oussoren via Python-Dev <python-dev@python.org> wrote:
I agree, but… Most CPython tests are run while running from the source tree, that means that there will have to be testrunner configurations that run with “-X frozen_modules=on”.
If the build option that determines the default is covered by existing buildbots then we will be running the test suite in both modes without any extra work. The alternative is that we do for other modules what we do with importlib: run the relevant tests once in each mode. However, it's better to run the whole suite in both modes, so I'd favor relying on the build-option-specific buildbots to get us coverage.
-eric
On Tue, Sep 28, 2021 at 6:02 AM Ronald Oussoren via Python-Dev <python-dev@python.org> wrote:
Of course. I mentioned it because the proposal is to add a new option that’s enabled after installation, and basically not when the testsuite is run. That’s not a problem, we could just enable the option in most CI jobs.
FYI, I already added the CLI option (-X frozen_modules=[on|off]) a couple weeks ago, with the default always "off", and have frozen about 10 of the stdlib modules (see _imp._frozen_module_names()). This thread is about a satisfactory approach to changing the default to "on". -eric
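For readers who want to see which modules are frozen in their own build, here is a minimal sketch built on the private helper named above. `_imp` is a CPython-internal module, so nothing here is a stable API; the wrapper names `frozen_module_names` and `is_frozen` are made up for this example, and the `getattr` guard is an assumption to cope with builds that do not have `_frozen_module_names()`.

```python
import _imp  # CPython-internal module; not a stable public API


def frozen_module_names():
    """Return a sorted list of module names frozen into this interpreter.

    _imp._frozen_module_names() is the private helper named in this
    thread; builds that lack it get an empty list here.
    """
    lister = getattr(_imp, "_frozen_module_names", None)
    return sorted(lister()) if lister is not None else []


def is_frozen(name):
    """Report whether `name` is available as a frozen module."""
    return _imp.is_frozen(name)


if __name__ == "__main__":
    for name in frozen_module_names():
        print(name)
```

Running the file as a script prints one name per line; the list depends on the interpreter version and build.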
On Tue, Sep 28, 2021 at 2:22 AM Marc-Andre Lemburg <mal@egenix.com> wrote:
#3 sounds like a good solution, but how would you detect "running from the source tree" ? This sounds like you need another stat call somewhere, which is what the frozen modules try to avoid.
We already look for the stdlib dir in Modules/getpath.c. We can use that information without an extra stat. (See https://bugs.python.org/issue45211.)
I'd like to suggest adding an environment variable to enable / disable the setting instead. This makes it easy to customize the behavior without introducing complicated logic.
That's essentially what "-X frozen_modules=..." provides, though with an env var you don't have to adjust your CLI invocation each time.
That said, there are a couple reasons why an env var might not be suitable. For one, I expect use of the -X option to be very uncommon, especially outside of core development, so more of a one-off feature. In contrast, to me environment variables imply repeated usage. Also, if we use an env var to override the default (of "on"), contributors will still get bitten by the problem I described originally. To me, it's important that the default in that case be "off" without any other intervention.
FWIW, I consider the "complicated logic" part as the negative side of going with running-in-source-tree. So, at this point I'm leaning more toward Brett's suggestion of using "configure --with-pydebug" (AKA Py_DEBUG) to determine the default. That should be a suitable approximation of running-in-source-tree. We can circle back if it proves inadequate.
On Tue, Sep 28, 2021 at 2:26 AM Marc-Andre Lemburg <mal@egenix.com> wrote:
Just to clarify: the modules would still always be frozen with the env var setting, but Python would simply not import them as frozen modules, but instead go and look on the PYTHONPATH for the modules.
This could be achieved by special casing the frozen module finder function to only trigger on importlib modules and return NULL for all other possibly frozen modules.
Right. That is essentially what we're doing. (See find_frozen() in Python/import.c.) -eric
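The two ideas in this message (a default derived from Py_DEBUG, and a finder that keeps serving only the import machinery when the switch is off) can be modelled in a few lines of Python. This is a hypothetical model for discussion, not the code in Python/import.c: the function names, the `ALWAYS_FROZEN` set, and the treatment of a bare "-X frozen_modules" as "on" are all assumptions, and the Py_DEBUG default is only the option being leaned toward here, not a settled decision.

```python
import sys

# Modules the import system itself needs; assumed here to stay frozen
# regardless of the switch. The exact set is an illustration only.
ALWAYS_FROZEN = frozenset(
    {"_frozen_importlib", "_frozen_importlib_external", "zipimport"}
)


def is_debug_build():
    """Common idiom: sys.gettotalrefcount only exists in Py_DEBUG builds."""
    return hasattr(sys, "gettotalrefcount")


def frozen_modules_enabled(xoptions, debug_build):
    """Resolve the effective setting (hypothetical model, not CPython code).

    An explicit "-X frozen_modules=on|off" wins; otherwise the default
    is "off" for debug builds and "on" for everything else.
    """
    value = xoptions.get("frozen_modules")
    if value is True or value == "on":  # bare "-X frozen_modules": assumed on
        return True
    if value == "off":
        return False
    if value is not None:
        raise ValueError(f"bad value for -X frozen_modules: {value!r}")
    return not debug_build


def should_use_frozen(name, enabled):
    """Decide whether the frozen copy of `name` should be used."""
    return enabled or name in ALWAYS_FROZEN
```

A caller could combine them as `frozen_modules_enabled(sys._xoptions, is_debug_build())`; `sys._xoptions` is CPython-specific. Keeping the option lookup separate from the default makes it easy to swap in a different default rule, such as a source-tree check, later.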
On Tue, Sep 28, 2021 at 6:36 AM Victor Stinner <vstinner@python.org> wrote:
Honestly, for me, #1: always on, is the most reasonable choice.
I dislike it when Python behaves differently depending on subtle things like "was it built with optimizations?" or "is Python started from its source tree?".
When I build Python without optimization and/or from its source tree, I do that to debug an issue. If the bug goes away in this case, it can waste my time.
So I prefer to teach everybody how to use "-X frozen_modules=off" if they want to hack the stdlib for their greatest pleasure. I prefer that such special use case requires an opt-in option, the special use case is not special enough to be the default.
Agreed. I just don't want to discourage potential contributors nor waste anyone's time. I suppose that's the fundamental question I originally posted: would it be too annoying for contributors if we made the default "on" always? I expect most non-docs contributions are made against the stdlib so that factors in.
It means that the site module can no longer be "customized" by directly modifying the site.py file (injecting a path in the PYTHONPATH env var where the customized site.py lives). But there is already a supported way to customize the site module: create a module named "sitecustomize" or "usercustomize". I recall that virtualenv likes to override stdlib site.py with its own code. tox uses virtualenv by default. Someone should check that freezing site doesn't break virtualenv and tox, since they seem to be popular in Python. The venv module doesn't need to override site.py and tox can use venv if I recall correctly.
If site.py customization is too popular, I would suggest to not freeze this one, until the community stops doing that.
Good point. I'll look into that. -eric
On Tue, Sep 28, 2021 at 6:47 AM Pablo Galindo Salgado <pablogsal@gmail.com> wrote:
One interesting consequence of what Eric mentioned (They have a different loader and repr. Also, frozen modules do not have __file__ set (and __path__ is always []).) is that frozen modules don't have a `__file__` attribute IIRC and therefore tracebacks won't include the source.
FYI, we are planning on setting __file__ on the frozen stdlib modules, whenever possible. (We can do that whenever we can determine the stdlib dir during startup. See https://bugs.python.org/issue45211.) Regardless, for tracebacks we would need to set co_filename on the module's code objects, right? -eric
On Tue, Sep 28, 2021 at 6:55 AM Eric V. Smith <eric@trueblade.com> wrote:
As a compromise, how about go with #1, but print a warning if python detects that it's not built with optimizations or is run from a source tree (the conditions in #2 and #3)? The warning could suggest running with "-X frozen_modules=off". I realize that it will probably be ignored over time, but maybe it will provide enough of a reminder if someone is debugging and sees the warning.
Yeah, that would probably be sufficient (and much simpler). I'll try it out. -eric
On Tue, 28 Sept 2021 at 15:33, Eric Snow <ericsnowcurrently@gmail.com> wrote:
It means that the site module can no longer be "customized" by directly modifying the site.py file (injecting a path in the PYTHONPATH env var where the customized site.py lives). But there is already a supported way to customize the site module: create a module named "sitecustomize" or "usercustomize". I recall that virtualenv likes to override stdlib site.py with its own code. tox uses virtualenv by default. Someone should check that freezing site doesn't break virtualenv and tox, since they seem to be popular in Python. The venv module doesn't need to override site.py and tox can use venv if I recall correctly.
If site.py customization is too popular, I would suggest to not freeze this one, until the community stops doing that.
Good point. I'll look into that.
I don't believe virtualenv ships its own site.py these days. That was a historical thing, and was always a pain point, but when virtualenv got rewritten I'm almost certain we stopped doing it. Paul
participants (16)
- Antoine Pitrou
- Barry Warsaw
- Brett Cannon
- Chris Angelico
- Eric Snow
- Eric V. Smith
- Ethan Furman
- Filipe Laíns
- Guido van Rossum
- Marc-Andre Lemburg
- Pablo Galindo Salgado
- Patrick Reader
- Paul Moore
- Ronald Oussoren
- Steve Dower
- Victor Stinner