Thank you for your submission of PEP 648 (Extensible customizations of the interpreter at startup). The Python Steering Council has reviewed the PEP and before we can pronounce on it, we have some additional questions and comments we’d like you to address. Once these questions are settled, we are requesting that you post the PEP to python-dev for another round of comments.
In general, the SC is in favor of deprecating the executable hack capabilities of pth files, and this PEP is currently the most concrete proposal to get there. We would like to eventually go farther, including deprecation of pth files entirely, but that is outside the scope of this PEP.
General PEP feedback
The introduction of the section titled “Benefits of __sitecustomize__” seems out of place. It forward references the solution without explanation. It also says “The use of a __sitecustomize__ will allow…”; the question is, a __sitecustomize__ what? Directory? File? Perhaps it should be moved to later in the PEP, after the semantics of __sitecustomize__ are defined?
Some of the terminology used in the PEP could be clarified. For example, I suggest using the term “directory” instead of “folder” throughout the PEP, and use the term “module” or “file” instead of “script” to describe the things inside __sitecustomize__ directories that Python imports, since “scripts” typically describe standalone files that implement applications. For example, must the files be named with .py extension (or presumably .pyc, .pyo) under the normal Python module import rules? (Later in the PEP you do allude to the requirement that the file name must end in .py, but it would be much better to explain this right up front, in a formal specification.)
Another ambiguity is what the PEP means by “executing” said “scripts”. Does that mean importing them? Reading them then exec()’ing them? Something else? This section should be clear that Python will import the modules it found, if that’s indeed what the PEP is proposing.
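To make the two readings concrete, here is a minimal sketch of the difference between importing a file as a module and reading it then exec()’ing it. The file name example_hook.py and both helper functions are hypothetical, chosen only for illustration; nothing here is taken from the PEP or its reference implementation.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def run_by_import(path: Path, name: str):
    """Load the file as a real module: it gets a spec and is registered in sys.modules."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


def run_by_exec(path: Path) -> dict:
    """Read the source and exec() it in a throwaway namespace: no module is registered."""
    namespace = {"__name__": "__main__", "__file__": str(path)}
    code = compile(path.read_text(encoding="utf-8"), str(path), "exec")
    exec(code, namespace)
    return namespace


with tempfile.TemporaryDirectory() as tmp:
    hook = Path(tmp) / "example_hook.py"  # hypothetical file name
    hook.write_text("VALUE = 42\n", encoding="utf-8")
    imported_value = run_by_import(hook, "example_hook").VALUE
    executed_value = run_by_exec(hook)["VALUE"]
    registered = "example_hook" in sys.modules
```

Both paths run the same code, but only the import path leaves a module object behind that later code can look up, which is why the PEP should say which one it means.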
The “Backward compatibility” section has this incomplete sentence: “Ignoring those lines in pth files.”
There’s this odd grammar in the “Do nothing” section: “After analysing the impact of this change, we believe it is worth given the enhanced experience it brings.” Perhaps it should read “...it is worth it, given the…”?
Definition of “site path”
The PEP should precisely define what it means when it says “...a folder named __sitecustomize__ located in any site path”. What exactly is a “site path”? Is this any directory on sys.path? Is it a directory named site-packages?
This phrase is also ambiguous: “As the folder will be within sys.path, given that it is located in site paths...”. “sys.path” isn’t a thing that can hold a folder, since it’s just a list of strings, and it also can get extended by any number of means, e.g. by setting the PYTHONPATH environment variable at interpreter startup time. So for example, how does PYTHONPATH affect the algorithm?
What is the rationale for adding multiple __sitecustomize__ directories, rather than a single directory (or possibly two, one in Python’s system location and one in the user location)? This would simplify the discovery process, and tools like pip can warn the user if a name collision were to occur.
I think the PEP needs to be much more crisp and precise in defining when and how __sitecustomize__ directories are discovered at startup time.
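As an illustration of the level of precision we are asking for, the following sketch shows one possible reading of “site path”: the directories reported by the site module, rather than every entry on sys.path. That reading is an assumption made for the example, and both function names are hypothetical; the PEP may well intend something different.

```python
import os
import site
import tempfile


def default_site_paths():
    """One possible reading of "site path": global site-packages plus the user site directory."""
    paths = list(site.getsitepackages())
    if site.ENABLE_USER_SITE:
        paths.append(site.getusersitepackages())
    return paths


def find_sitecustomize_dirs(site_paths):
    """Return existing __sitecustomize__ directories, preserving the order of site_paths."""
    found = []
    for base in site_paths:
        candidate = os.path.join(base, "__sitecustomize__")
        if os.path.isdir(candidate):
            found.append(candidate)
    return found


# Demonstration against temporary directories instead of the real installation.
with tempfile.TemporaryDirectory() as tmp:
    with_dir = os.path.join(tmp, "site-a")
    without_dir = os.path.join(tmp, "site-b")
    os.makedirs(os.path.join(with_dir, "__sitecustomize__"))
    os.makedirs(without_dir)
    discovered = find_sitecustomize_dirs([with_dir, without_dir])
    expected = [os.path.join(with_dir, "__sitecustomize__")]
```

Under this reading, a directory added only through PYTHONPATH would not be searched; under the “any directory on sys.path” reading it would, which is exactly the ambiguity the PEP needs to resolve.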
Order of Execution
I think this section should outline the entire startup execution order. Exactly when do pth files still get evaluated in relation to __sitecustomize__ discovery? How do pth file sys.path extensions affect the search for __sitecustomize__ directories? This section also says files in __sitecustomize__ will be executed in “file name sorted order”, but the PEP is unclear whether all such files found in every __sitecustomize__ directory will be sorted together, or only within a single __sitecustomize__ directory. Presumably file system encodings are used to determine file name sort order, but I think the PEP should be explicit about this.
I’d like to see the execution order for the entire proposed new Python startup sequence as a recipe, or perhaps even some Python pseudo-code.
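For example, the sorting question alone could be pinned down with something as small as the sketch below. The function name plan_execution and its merge_across_dirs flag are hypothetical and exist only to contrast the two interpretations; the sketch also assumes plain code-point ordering of decoded file names, which is itself one of the details the PEP should state.

```python
import os
import tempfile


def plan_execution(sitecustomize_dirs, merge_across_dirs=False):
    """Return the .py files to run, in order, under one of two sorting interpretations."""
    ordered = []
    for directory in sitecustomize_dirs:
        # sorted() on str compares by Unicode code point, not by locale or raw bytes.
        names = sorted(n for n in os.listdir(directory) if n.endswith(".py"))
        ordered.extend(os.path.join(directory, n) for n in names)
    if merge_across_dirs:
        # Stable sort: files with equal names keep their directory order.
        ordered.sort(key=os.path.basename)
    return ordered


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "first")
    second = os.path.join(tmp, "second")
    os.makedirs(first)
    os.makedirs(second)
    for directory, name in ((first, "b.py"), (first, "notes.txt"), (second, "a.py")):
        open(os.path.join(directory, name), "w").close()
    per_directory = [os.path.basename(p) for p in plan_execution([first, second])]
    merged = [os.path.basename(p) for p in plan_execution([first, second], merge_across_dirs=True)]
```

With these inputs the per-directory interpretation runs b.py before a.py, while the merged interpretation runs a.py first, so the two readings are observably different.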
Impact on startup time
We still think that startup performance can be a concern. This section needs more data. For example, while the executable part of pth files is being deprecated, both pth and __sitecustomize__ modules will exist. How does this affect startup time in a world where both have to be discovered and processed?
The PEP says: “If the user has custom scripts, we think that the impact on the performance of walking each of the folders is acceptable, as the user wants to use this feature.” Does the user really want to use this feature? Won’t it more likely be the system administrator or Python distribution and packaging ecosystem that chooses to use the __sitecustomize__ feature? Maybe the user won’t even know that it’s being used by the packages being installed?
The PEP says: “Running "./python -c pass" with perf on 50 iterations, repeating 50 times the command on each and getting the geometric mean on a commodity laptop did not reveal any substantial raise on CPU time.” We request more information on the experiments performed, and specific numbers so that the startup impact can be reproduced and confirmed by others. E.g. did you run this experiment on a branch of CPython 3.10 with the PEP 648 reference implementation? Did you convert existing uses of executable pth files to __sitecustomize__ modules? How many such modules did you try it with? How many __sitecustomize__ directories were in your tests?
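A reproducible harness need not be elaborate. The sketch below is one rough way others could repeat such a measurement; it is an assumption about the shape of the experiment, not the procedure the PEP authors used. In particular it measures wall-clock time with time.perf_counter() rather than CPU time with perf, and the function name time_startup is hypothetical.

```python
import statistics
import subprocess
import sys
import time


def time_startup(python=sys.executable, iterations=50, repeats=50):
    """Geometric mean, over `repeats` batches, of the mean wall-clock time of `python -c pass`."""
    batch_means = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(iterations):
            subprocess.run([python, "-c", "pass"], check=True)
        batch_means.append((time.perf_counter() - start) / iterations)
    return statistics.geometric_mean(batch_means)


# Small counts keep this demonstration quick; a real comparison would use the
# defaults and run once against a baseline build and once against the PEP branch.
quick_check = time_startup(iterations=2, repeats=2)
```

Reporting the interpreter build, the number of __sitecustomize__ directories and modules present, and the resulting figures for both builds would let others confirm the result.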
Other questions and issues
How does this feature interact with virtual environments? What do packaging tools have to do to support this PEP? The PEP mentions setuptools, but what about other packaging tools in common use?
The PEP says “This impact will be reduced in the future as we will remove two other imports: "sitecustomize.py" and "usercustomize.py".” Why not explicitly deprecate these and executable pth files in PEP 648 and begin the countdown at the same time __sitecustomize__ is added to Python?
The PEP says: “To facilitate debugging of the Python startup, a new option will be added to the main of the site module to list all scripts that will be executed as part of the __sitecustomize__ initialization.” The PEP should be explicit about what command line switch is being proposed. We think this should probably be an -X option.
Do you have any concerns that this feature will be overused? Does separating the execution feature from the path extension feature make it more or less likely to be used by third party library authors?
Will the runtime have any access to the list of __sitecustomize__ modules being executed? I.e. will you collect their paths in a new sys module attribute? Will there be any functions to import or list __sitecustomize__ directories or modules after startup has completed?
We think io.open_code() should be used so that reading the files can be audited.
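As a minimal sketch of what that could look like, assuming the files are read and compiled directly: io.open_code() takes an absolute path and returns a binary file object, and compile() accepts the resulting bytes. The helper name load_audited and the file name example_hook.py are hypothetical.

```python
import io
import os
import tempfile


def load_audited(path):
    """Read a file through io.open_code() and compile it, without executing it."""
    with io.open_code(os.path.abspath(path)) as f:  # io.open_code() expects an absolute path
        source = f.read()  # bytes; compile() handles the source encoding declaration
    return compile(source, path, "exec")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example_hook.py")  # hypothetical file name
    with open(target, "w", encoding="utf-8") as f:
        f.write("VALUE = 'audited'\n")
    namespace = {}
    exec(load_audited(target), namespace)
    loaded_value = namespace["VALUE"]
```

If the PEP instead goes through the regular import machinery, it would be worth stating whether that path already provides the same auditing guarantee.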
The Python Steering Council again wants to thank you for your PEP contribution! We hope you find this feedback constructive and helpful. We look forward to your responses, and an updated PEP.
Cheers, -Barry (on behalf of the Python Steering Council)