SC feedback: PEP 648 -- Extensible customizations of the interpreter at startup
Hello Mario,
Thank you for your submission of PEP 648 (Extensible customizations of the interpreter at startup). The Python Steering Council has reviewed the PEP and before we can pronounce on it, we have some additional questions and comments we’d like you to address. Once these questions are settled, we are requesting that you post the PEP to python-dev for another round of comments.
In general, the SC is in favor of deprecating the executable hack capabilities of pth files, and this PEP is currently the most concrete proposal to get there. We would like to eventually go farther, including deprecation of pth files entirely, but that is outside the scope of this PEP.
General PEP feedback
The introduction of the section titled “Benefits of __sitecustomize__” seems out of place. It forward references the solution without explanation. It also says “The use of a __sitecustomize__ will allow…”; the question is, a __sitecustomize__ what? Directory? File? Perhaps it should be moved to later in the PEP after the semantics of __sitecustomize__ is defined?
Some of the terminology used in the PEP could be clarified. For example, I suggest using the term “directory” instead of “folder” throughout the PEP, and use the term “module” or “file” instead of “script” to describe the things inside __sitecustomize__ directories that Python imports, since “scripts” typically describe standalone files that implement applications. For example, must the files be named with .py extension (or presumably .pyc, .pyo) under the normal Python module import rules? (Later in the PEP you do allude to the requirement that the file name must end in .py, but it would be much better to explain this right up front, in a formal specification.)
Another ambiguity is what the PEP means by “executing” said “scripts”. Does that mean importing them? Reading them then exec()’ing them? Something else?
This section should be clear that Python will import the modules it found, if that’s indeed what the PEP is proposing.
The “Backward compatibility” section has this incomplete sentence: “Ignoring those lines in pth files.”
There’s this odd grammar in the “Do nothing” section: “After analysing the impact of this change, we believe it is worth given the enhanced experience it brings.” Perhaps it should read “...it is worth it, given the…”?
Definition of “site path”
The PEP should precisely define what it means when it says “...a folder named __sitecustomize__ located in any site path”. What exactly is a “site path”? Is this any directory on sys.path? Is it a directory named site-packages?
This phrase is also ambiguous: “As the folder will be within sys.path, given that it is located in site paths...”. “sys.path” isn’t a thing that can hold a folder, since it’s just a list of strings, and it also can get extended by any number of means, e.g. by setting the PYTHONPATH environment variable at interpreter startup time. So for example, how does PYTHONPATH affect the algorithm?
What is the rationale for adding multiple __sitecustomize__ directories, rather than a single directory (or possibly two, one in Python’s system location and one in the user location)? This would simplify the discovery process, and tools like pip can warn the user if a name collision were to occur.
I think the PEP needs to be much more crisp and precise in defining the when and how __sitecustomize__ directories are discovered at start up time.
Order of Execution
I think this section should outline the entire startup execution order. Exactly when do pth files still get evaluated in relationship to __sitecustomize__ discovery? How do pth file sys.path extensions affect the search for __sitecustomize__ directories?
This section also says files in __sitecustomize__ will be executed in “file name sorted order”, but the PEP is unclear whether all such files found in every __sitecustomize__ directory will be sorted together, or only within a single __sitecustomize__ directory. Presumably file system encodings are used to determine file name sort order, but I think the PEP should be explicit about this.
I’d like to see the execution order for the entire proposed new Python startup sequence as a recipe, or perhaps even some Python pseudo-code.
Impact on startup time
We still think that startup performance can be a concern. This section needs more data. For example, while the executable part of pth files is being deprecated, both pth and __sitecustomize__ modules will exist. How does this affect startup time in a world where both have to be discovered and processed?
The PEP says: “If the user has custom scripts, we think that the impact on the performance of walking each of the folders is acceptable, as the user wants to use this feature.” Does the user really want to use this feature? Won’t it more likely be the system administrator or Python distribution and packaging ecosystem that chooses to use the __sitecustomize__ feature? Maybe the user won’t even know that it’s being used by the packages being installed?
The PEP says: “Running "./python -c pass" with perf on 50 iterations, repeating 50 times the command on each and getting the geometric mean on a commodity laptop did not reveal any substantial raise on CPU time.” We request more information on the experiments performed, and specific numbers so that the startup impact can be reproduced and confirmed by others. E.g. did you run this experiment on a branch of CPython 3.10 with the PEP 648 reference implementation? Did you convert existing uses of executable pth files to __sitecustomize__ modules? How many such modules did you try it with? How many __sitecustomize__ directories were in your tests?
Other questions and issues
How does this feature interact with virtual environments?
What do packaging tools have to do to support this PEP? The PEP mentions setuptools, but what about other packaging tools in common use?
The PEP says “This impact will be reduced in the future as we will remove two other imports: "sitecustomize.py" and "usercustomize.py".” Why not explicitly deprecate these and executable pth files in PEP 648 and begin the countdown at the same time __sitecustomize__ is added to Python?
The PEP says: “To facilitate debugging of the Python startup, a new option will be added to the main of the site module to list all scripts that will be executed as part of the __sitecustomize__ initialization.” The PEP should be explicit about what command line switch is being proposed. We think this should probably be an -X option.
Do you have any concerns that this feature will be overused? Does separating the execution feature from the path extension feature make it more or less likely to be used by third party library authors?
Will the runtime have any access to the list of __sitecustomize__ modules being executed? I.e. will you collect their paths in a new sys module attribute? Will there be any functions to import or list __sitecustomize__ directories or modules after start has completed?
We think io.open_code() should be used so that reading the files can be audited.
Conclusion
The Python Steering Council again wants to thank you for your PEP contribution! We hope you find this feedback constructive and helpful. We look forward to your responses, and an updated PEP.
Cheers,
-Barry (on behalf of the Python Steering Council)
On 30/03/2021 19.01, Barry Warsaw wrote:
Hello Mario,
Thank you for your submission of PEP 648 (Extensible customizations of the interpreter at startup). The Python Steering Council has reviewed the PEP and before we can pronounce on it, we have some additional questions and comments we’d like you to address. Once these questions are settled, we are requesting that you post the PEP to python-dev for another round of comments.
Hi Mario,
could you please include a security analysis of the feature, too? I would like to avoid new ways to exploit Python.
In particular I don't think that -S (no site module) is the right way to disable __sitecustomize__. It disables too many useful features. It might be a good idea to disable __sitecustomize__ with -I (isolated mode).
There should be a new audit event, too.
Christian
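The two suggestions above (honouring isolated mode and raising an audit event) could be sketched roughly as follows. This is only an illustration under stated assumptions: the event name `site.sitecustomize` and all three helper functions are hypothetical and are not defined by the PEP or by CPython. Only `sys.flags.isolated`, `sys.audit()` and `sys.addaudithook()` are existing APIs.

```python
import sys

# Hypothetical event name for illustration; neither the PEP nor CPython defines it.
AUDIT_EVENT = "site.sitecustomize"


def customizations_enabled(isolated=None):
    """Return False when the interpreter runs in isolated mode (-I)."""
    if isolated is None:
        isolated = bool(sys.flags.isolated)  # real flag, set by `python -I`
    return not isolated


def announce_customization(path):
    """Raise an audit event before a customization file would be run."""
    sys.audit(AUDIT_EVENT, path)


def install_logging_hook(log):
    """Append the path of every announced customization file to `log`."""
    def hook(event, args):
        if event == AUDIT_EVENT:
            log.append(args[0])
    sys.addaudithook(hook)  # audit hooks cannot be removed once added
```

A monitoring tool could call `install_logging_hook(paths)` early and later inspect `paths`. Whether the real feature would expose such an event, and under which name, is up to the PEP.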
Great points Christian, thanks. -Barry
On Mar 30, 2021, at 10:59, Christian Heimes <christian@python.org> wrote:
On 30/03/2021 19.01, Barry Warsaw wrote:
Hello Mario,
Thank you for your submission of PEP 648 (Extensible customizations of the interpreter at startup). The Python Steering Council has reviewed the PEP and before we can pronounce on it, we have some additional questions and comments we’d like you to address. Once these questions are settled, we are requesting that you post the PEP to python-dev for another round of comments.
Hi Mario,
could you please include a security analysis of the feature, too? I would like to avoid new ways to exploit Python.
In particular I don't think that -S (no site module) is the right way to disable __sitecustomize__. It disables too many useful features. It might be a good idea to disable __sitecustomize__ with -I (isolated mode).
There should be a new audit event, too.
Christian
On Wed, 31 Mar 2021, 3:15 am Barry Warsaw, <barry@python.org> wrote:
We would like to eventually go farther, including deprecation of pth files entirely, but that is outside the scope of this PEP.
Please don't, since that would force everyone to start using PEP 648 just to extend sys.path, which would be just as bad as the status quo. Cheers, Nick.
Hi Nick,
Please don't, since that would force everyone to start using PEP 648 just to extend sys.path, which would be just as bad as the status quo.
I think Barry is referring to deprecating the execution capabilities of pth files (https://bugs.python.org/issue33944), not the files themselves.
Cheers, Pablo Galindo Salgado
On Wed, 31 Mar 2021 at 00:34, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Wed, 31 Mar 2021, 3:15 am Barry Warsaw, <barry@python.org> wrote:
We would like to eventually go farther, including deprecation of pth files entirely, but that is outside the scope of this PEP.
Please don't, since that would force everyone to start using PEP 648 just to extend sys.path, which would be just as bad as the status quo.
Cheers, Nick.
_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/MSOV7NKD... Code of Conduct: http://python.org/psf/codeofconduct/
Kind of :)
PEP 648 would definitely allow us to deprecate the executable part of pth files. I let my own biases leak into my response because I would like to find a way to replace the sys.path feature of pth with something much more auditable and discoverable. To me that means deprecating pth files and finding something better, but maybe not.
In any case, this is outside the scope of PEP 648 so just pretend that part wasn’t in my response.
-Barry
On Mar 30, 2021, at 17:00, Pablo Galindo Salgado <pablogsal@gmail.com> wrote:
Hi Nick,
Please don't, since that would force everyone to start using PEP 648 just to extend sys.path, which would be just as bad as the status quo.
I think Barry is referring to deprecate the execution capabilities of pth files (https://bugs.python.org/issue33944), not the files themselves.
Cheers, Pablo Galindo Salgado
On Wed, 31 Mar 2021 at 00:34, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Wed, 31 Mar 2021, 3:15 am Barry Warsaw, <barry@python.org> wrote:
We would like to eventually go farther, including deprecation of pth files entirely, but that is outside the scope of this PEP.
Please don't, since that would force everyone to start using PEP 648 just to extend sys.path, which would be just as bad as the status quo.
Cheers, Nick.
Thanks all. I'll address this feedback next week.
Regards, Mario
On Wed, 31 Mar 2021 at 11:01, Barry Warsaw <barry@python.org> wrote:
Kind of :)
PEP 648 would definitely allow us to deprecate the executable part of pth files. I let my own biases leak in to my response because I would like to find a way to replace the sys.path feature of pth with something much more auditable and discoverable. To me that means deprecating pth files and finding something better, but maybe not.
Adding pth file auditing to the output of "python -m site" should be entirely feasible, it just hasn't been done yet.
Even if it just listed the files found, it would make them easier to audit than they are today.
Declaring the feature impossible to audit when we haven't even really tried to make it auditable seems premature (the existing site output doesn't even indicate which paths in sys.path will be considered when looking for pth files, let alone indicate which of those directories actually contain any).
Cheers, Nick.
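As a rough illustration of the kind of listing described above, the sketch below reports which directories contain pth files. It is not what "python -m site" does today. `find_pth_files` is a hypothetical helper, and treating the results of `site.getsitepackages()` and `site.getusersitepackages()` as the set of directories scanned for pth files is an assumption.

```python
import os
import site


def find_pth_files(site_dirs):
    """Map each existing directory to the sorted .pth file names it contains."""
    found = {}
    for directory in site_dirs:
        if not os.path.isdir(directory):
            continue  # e.g. the user site directory may not exist
        names = sorted(n for n in os.listdir(directory) if n.endswith(".pth"))
        found[directory] = names
    return found


if __name__ == "__main__":
    # Assumption: these two calls approximate the directories that site scans.
    candidates = site.getsitepackages() + [site.getusersitepackages()]
    for directory, names in find_pth_files(candidates).items():
        print(directory)
        for name in names:
            print("    " + name)
```

Directories with no pth files still appear in the result with an empty list, which makes it visible that they were considered.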
Hello All,
I've pushed a change that reworks the PEP wording a bit and adds further details on how it works. It did indeed need some extra work. The PR is still in flight: https://github.com/python/peps/pull/1941
I've also changed it to target 3.11.
I believe most of your concerns should be answered in the PEP now, but I will answer them here as well. Let me know if there are any additional questions or any section you think could benefit from additional work.
Thank you for your submission of PEP 648
Thank you for reviewing and considering it :).
We would like to eventually go farther, including deprecation of pth files entirely, but that is outside the scope of this PEP.
Agreed on deprecating only code execution, as Nick pointed out. The Backward compatibility section of the PEP already mentions adding a warning for code execution in pth files.
The introduction of the section titled “Benefits of __sitecustomize__” seems out of place.
Agree, I've restructured it to make things (IMHO) clearer. Happy to apply further changes if you think it is still confusing.
Some of the terminology used in the PEP could be clarified.
Changes applied.
Another ambiguity is what the PEP means by “executing” said “scripts”.
Read + exec. It used to be import, but there seemed to be a general preference for read+exec on the Discourse thread. (This is now included in the PEP.)
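To make the "read + exec" distinction concrete, here is a minimal sketch, not the reference implementation. `run_customization_file` and the `__name__` value are hypothetical. `io.open_code()` is the existing API the SC asked for, and it is documented to expect an absolute path. Unlike an import, nothing is added to `sys.modules`.

```python
import io
import os


def run_customization_file(path):
    """Read one file and exec() it in a fresh namespace; return that namespace."""
    path = os.path.abspath(path)   # io.open_code() expects an absolute path
    with io.open_code(path) as f:  # binary read that audit hooks can observe
        source = f.read()
    code = compile(source, path, "exec")
    # The __name__ value is a placeholder chosen for this sketch only.
    namespace = {"__name__": "__sitecustomize__", "__file__": path}
    exec(code, namespace)
    return namespace
```

Each file gets its own namespace here, so one customization file cannot see names defined by another. That isolation is a choice made in this sketch, not something stated in the thread.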
The “Backward compatibility” section has this incomplete sentence
Fixed
There’s this odd grammar in the “Do nothing” section
Thanks for pointing it out, updated with your suggestion. Please let me know if you see anything else like that, I'm not a native English speaker.
Definition of “site path”
Clarified in the PEP. I'm purposely leaving site-path definition to the `site` module rather than saying something too concrete. Hopefully the updates make it clearer now.
Order of Execution
I've added a section for this. The implementation plans to piggyback on the discovery of `pth` files.
How do pth file sys.path extensions affect the search for __sitecustomize__ directories?
They should not, I've added a section to the PEP to explain it.
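A possible reading of the discovery and ordering rules discussed here, written as a sketch rather than as the PEP's algorithm. `discover_customizations` is a hypothetical helper. It assumes the caller passes the site directories in the order the `site` module processes them, that file names are sorted within each `__sitecustomize__` directory rather than globally, and that only `.py` files count. Directories added to `sys.path` by pth files are deliberately not consulted, matching the answer above.

```python
import os

CUSTOMIZE_DIR = "__sitecustomize__"


def discover_customizations(site_dirs):
    """Return the .py files under each <site_dir>/__sitecustomize__, in run order."""
    ordered = []
    for site_dir in site_dirs:  # the order of site_dirs is preserved
        custom_dir = os.path.join(site_dir, CUSTOMIZE_DIR)
        if not os.path.isdir(custom_dir):
            continue
        # Assumption: names are sorted within one directory only.
        for name in sorted(os.listdir(custom_dir)):
            full = os.path.join(custom_dir, name)
            if name.endswith(".py") and os.path.isfile(full):
                ordered.append(full)
    return ordered
```

With two site directories containing `z.py` and `a.py` in the first and `b.py` in the second, this yields `a.py`, `z.py`, `b.py`. A globally sorted variant would give `a.py`, `b.py`, `z.py` instead, which is the ambiguity the SC asked the PEP to settle.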
Impact on startup time
I've added a more detailed benchmark that can be reproduced.
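For readers who want a quick, independent check, the sketch below times interpreter startup and summarizes it with a geometric mean, as the quoted PEP text describes. It is not the benchmark from the PEP: it measures wall-clock time with `time.perf_counter()` rather than CPU time with perf, and `time_startup` and `summarize` are hypothetical helper names.

```python
import statistics
import subprocess
import sys
import time


def time_startup(python=sys.executable, runs=50):
    """Return wall-clock durations in seconds of `python -c pass`, one per run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([python, "-c", "pass"], check=True)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Geometric mean of the measured durations."""
    return statistics.geometric_mean(durations)
```

Running `summarize(time_startup(python=...))` once against a baseline build and once against a build with the reference implementation, under otherwise identical conditions, would give two comparable numbers.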
How does this feature interact with virtual environments?
Outstanding question, I've added a section for it.
Why not explicitly deprecate these
+1. I was trying not to be too ambitious here.
Do you have any concerns that this feature will be overused?
I could see some more people starting to use it, but I think it is a niche case. I expect system administrators and packaging tools (like venv) to use it (and with a better experience).
Will the runtime have any access to the list of __sitecustomize__ modules being executed? I.e. will you collect their paths in a new sys module attribute?
Nope. The PEP explains how to find them programmatically, though, and for users we expect to add a CLI option to site to list them.
We think io.open_code() should be used so that reading the files can be audited.
Sure thing!
We think io.open_code() should be used so that reading the files can be audited.
I've added a section focused on security. Not sure if you were looking for anything further in terms of security analysis.
Regards, Mario
On Fri, 2 Apr 2021 at 17:22, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Wed, 31 Mar 2021 at 11:01, Barry Warsaw <barry@python.org> wrote:
Kind of :)
PEP 648 would definitely allow us to deprecate the executable part of pth files. I let my own biases leak in to my response because I would like to find a way to replace the sys.path feature of pth with something much more auditable and discoverable. To me that means deprecating pth files and finding something better, but maybe not.
Adding pth file auditing to the output of "python -m site" should be entirely feasible, it just hasn't been done yet.
Even if it just listed the files found, it would make them easier to audit than they are today.
Declaring the feature impossible to audit when we haven't even really tried to make it auditable seems premature (the existing site output doesn't even indicate which paths in sys.path will be considered when looking for pth files, let alone indicate which of those directories actually contain any).
Cheers, Nick.
participants (5): Barry Warsaw, Christian Heimes, Mario Corchero, Nick Coghlan, Pablo Galindo Salgado