Change to sys.path[0] for Python 2.3
I have been working on import.c and thinking about imports generally. Currently, the directory of the Python script is inserted into sys.path[0]. For example, "python /A/B/myscript.py" creates sys.path[0] = "/A/B", and "python myscript.py" creates sys.path[0] = "". But there are three problems.

This insertion occurs after a number of imports have already occurred. Specifically, it occurs after the import of site, os, and sitecustomize. This is confusing. It is clear that sys.path should not change unless the user changes it.

If no path component is given, the zero length string is inserted. But if the current working directory later changes, this is no longer valid. If we want the directory of the script to be sys.path[0], then an absolute path should be inserted.

If a command is entered using "-c", I don't think any insertion to sys.path should be made, as there is no indicated directory. Alternatively, the absolute path getcwd() should be inserted.

If everyone agrees, I will create a patch.

Jim Ahlstrom
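To make the second point concrete, here is a minimal Python sketch of the proposed behaviour. It is not code from import.c; the helper name absolute_script_dir and its cwd parameter are hypothetical, chosen only for illustration. It computes the absolute directory that would replace the "" entry.

```python
import os


def absolute_script_dir(argv0, cwd=None):
    """Return an absolute directory for the script named by argv0.

    Hypothetical helper: argv0 is the script path as typed on the
    command line, cwd is the working directory at startup.
    """
    if cwd is None:
        cwd = os.getcwd()
    # dirname() yields '' when no path component was given.
    script_dir = os.path.dirname(argv0)
    # join() keeps an already absolute script_dir and anchors a
    # relative (or empty) one at cwd; normpath() tidies the result.
    return os.path.normpath(os.path.join(cwd, script_dir))
```

With this, "python myscript.py" started in /A/B would give "/A/B" rather than "", so a later os.chdir() no longer changes what the entry refers to.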
Jim "Fortran" Ahlstrom wrote:
I have been working on import.c and thinking about imports generally. Currently, the directory of the Python script is inserted into sys.path[0]. For example, "python /A/B/myscript.py" creates sys.path[0] = "/A/B", and "python myscript.py" creates sys.path[0] = "". But there are three problems.
This insertion occurs after a number of imports have already occurred. Specifically, it occurs after the import of site, os, and sitecustomize. This is confusing.
Why? site happens before Python even thinks about sys.argv[0]. By its very name it's about "how this installation should behave", not "how a script in this directory behaves".
It is clear that sys.path should not change unless the user changes it.
I am +1 (for very large values of 1) on clarifying the rules of import, but while hidden manipulations of sys.path qualify as a sneaky trick, I don't think they can be outlawed.
If no path component is given, the zero length string is inserted. But if the current working directory later changes, this is no longer valid. If we want the directory of the script to be sys.path[0], then an absolute path should be inserted.
Some people os.chdir() just for this effect. I don't think I mind if they experience some pain <wink>.

- Gordon
Gordon McMillan wrote:
Jim "Fortran" Ahlstrom wrote:
This insertion occurs after a number of imports have already occurred. Specifically, it occurs after the import of site, os, and sitecustomize. This is confusing.
Why? site happens before Python even thinks about sys.argv[0]. By its very name it's about "how this installation should behave", not "how a script in this directory behaves".
Adding the directory of the Python script occurs after a number of imports have already happened. This is not necessary. The directory of the script is known. It is confusing because the programmer sees that the script directory is the first item of sys.path, and so concludes that [s]he can put scripts there and have them imported. This sometimes works, but fails for os, site, sitecustomize and a few others. There is no reason for this other than accidental details of the implementation.

Think of documenting imports. We would need to explain that sys.argv[0] is Special, and different from other items. Yuk. The programmer may indeed manipulate sys.path, but at least Python's default path should be simple.

BTW, is there any documentation on the details of imports, even a description of sys.path? We need an "invocation and imports" manual (which I guess I just volunteered to write).
If no path component is given, the zero length string is inserted. But if the current working directory later changes, this is no longer valid. If we want the directory of the script to be sys.path[0], then an absolute path should be inserted.
Some people os.chdir() just for this effect. I don't think I mind if they experience some pain <wink>.
This is a problem for zip imports and directory caching. If an item of sys.path is a relative path, and getcwd() changes, then it is difficult (as in slow, not as in impossible) to get caching to work. Think of sys.path item "./archive.zip". There is a dictionary full of items starting with "./", and then CWD changes. I would have to recognize a relative path for any item of sys.path, and call getcwd() for each one, all for each supported OS. Not fun.

Using my current code, zip imports fail if a relative path is given and getcwd() changes. This is a problem especially on Windows, as normally the CWD changes to the directory of an opened file.

I think most of the practical problems go away if sys.path[0] is an absolute path. Then I can either make relative paths work with the penalty that imports will be slower, or write documentation that sys.path[] must only contain absolute paths. In either case, I think sys.path[0] should be an absolute path.

Again, think about documenting imports. What is sys.path[0]? It is the directory of the script. This is a fixed directory that doesn't change.

JimA
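The caching concern can be illustrated with a short Python sketch. This is an assumption about one way to sidestep the problem, not the code under discussion: the hypothetical helper absolutize_entries resolves every path entry once against a known working directory, so that cache keys stay valid after the CWD changes.

```python
import os


def absolutize_entries(entries, cwd):
    """Return the entries with relative ones anchored at cwd.

    Hypothetical helper: entries stands in for sys.path, cwd for the
    working directory at the time the entries were added.
    """
    resolved = []
    for entry in entries:
        # join() leaves absolute entries alone; '' and './x' are
        # anchored at cwd. normpath() removes the './' segments.
        resolved.append(os.path.normpath(os.path.join(cwd, entry)))
    return resolved
```

A cache keyed on the resolved strings, for example "/home/jim/archive.zip" instead of "./archive.zip", would not need to call getcwd() again on each lookup.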
"James C. Ahlstrom"
If we want the directory of the script to be sys.path[0], then an absolute path should be inserted.
+1

Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg@cosc.canterbury.ac.nz
A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc.
"James C. Ahlstrom" wrote:
I have been working on import.c and thinking about imports generally. Currently, the directory of the Python script is inserted into sys.path[0]. For example, "python /A/B/myscript.py" creates sys.path[0] = "/A/B", and "python myscript.py" creates sys.path[0] = "". But there are three problems.
This insertion occurs after a number of imports have already occurred. Specifically, it occurs after the import of site, os, and sitecustomize. This is confusing. It is clear that sys.path should not change unless the user changes it.
I hope you mean user == programmer. Changing sys.path is perfectly legal and I wouldn't like to see that become illegal.
If no path component is given, the zero length string is inserted. But if the current working directory later changes, this is no longer valid. If we want the directory of the script to be sys.path[0], then an absolute path should be inserted.
True. This causes quite a bit of confusion sometimes, esp. when people run scripts using relative paths and then find that things don't work the way they expected. I'm not sure if adding the absolute path would break anything, though -- could be that some path fiddling code explicitly looks for the '' in sys.path and then takes some action based on the fact that the script was started from the CWD.
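The kind of path-fiddling code that might break can be sketched as follows. The function is hypothetical and only illustrates the pattern described: it treats an empty string in the path as a sign that the script was started from the CWD.

```python
def started_from_cwd(path_entries):
    """Hypothetical check: '' in the path is read as 'run from the CWD'."""
    return "" in path_entries
```

Under the current behaviour this returns True for a path such as ["", "/usr/lib/python"]. If an absolute directory were inserted instead, the same check would return False even though the script was started from the CWD.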
If a command is entered using "-c", I don't think any insertion to sys.path should be made, as there is no indicated directory. Alternatively, the absolute path getcwd() should be inserted.
Same problem here: "-c" can be used as indicator... and probably is by code looking for the absolute path of the script ;-)

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/
participants (4)
- Gordon McMillan
- Greg Ewing
- James C. Ahlstrom
- M.-A. Lemburg