package imports, sys.path and os.chdir()

Howdy,
I have a small problem/observation with imports. I have several packages to import, which all works fine as long as the packages are imported from directories found in the installed site-packages, via .pth files etc. The only problem is the empty string that is automatically prepended to sys.path. Depending on where I start my application from, the values stored in package.__file__ and package.__path__ are absolute or relative paths. So, if my cwd is the directory that contains my top-level modules, the '' entry wins, even though sys.path also contains correct absolute entries for that directory.
Assume this layout:
<- cwd is here
    moda
        modb
import moda
Some code happens to chdir away, and later some code does
from moda import modb
Since the __path__ entry is now a relative path, this second import fails. Although leaving the cwd changed after a chdir() is not recommended practice, I don't see why this is so. When a module is imported, would it not be better to always make __file__ and __path__ absolute? I see the module path being hidden by the '' entry not as a feature but as an undesired side effect. No big deal and easy to work around, I just would like to understand why.
cheers -- chris
-- Christian Tismer :^)<mailto:tismer@stackless.com> tismerysoft GmbH : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de work +49 173 24 18 776 mobile +49 173 24 18 776 fax n.a. PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/
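
For readers who want to see what their own interpreter does, here is a minimal sketch that imports a throwaway package through the '' entry and reports the resulting __file__ and __path__. The package name moda_demo and the helper inspect_package_paths are made up for illustration; the result may well differ between Python versions, so no particular output is claimed.

```python
import importlib
import os
import sys
import tempfile


def inspect_package_paths(pkg_name="moda_demo"):
    """Import a throwaway package via the '' entry and return its paths."""
    old_cwd = os.getcwd()
    old_path = list(sys.path)
    with tempfile.TemporaryDirectory() as top:
        # Hypothetical layout: <top>/moda_demo/__init__.py and modb.py
        pkg_dir = os.path.join(top, pkg_name)
        os.mkdir(pkg_dir)
        for name in ("__init__.py", "modb.py"):
            with open(os.path.join(pkg_dir, name), "w"):
                pass
        try:
            os.chdir(top)
            sys.path.insert(0, "")  # mimic the entry the prompt adds
            importlib.invalidate_caches()
            pkg = importlib.import_module(pkg_name)
            return pkg.__file__, list(pkg.__path__)
        finally:
            # Undo every global side effect of the experiment.
            sys.modules.pop(pkg_name, None)
            sys.path[:] = old_path
            os.chdir(old_cwd)


if __name__ == "__main__":
    file_attr, path_attr = inspect_package_paths()
    print("__file__ absolute:", os.path.isabs(file_attr))
    print("__path__ absolute:", all(os.path.isabs(p) for p in path_attr))
```

If either line prints False, a later chdir() can break submodule imports in the way described above.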

On Fri, Apr 27, 2012 at 7:30 AM, Christian Tismer <tismer@stackless.com> wrote:
No big deal and easy to work around, I just would like to understand why.
I don't like it either and want to change it, but I'm also not going to mess with it until the importlib bootstrapping is fully integrated and stable. For the moment, there's a workaround in runpy to ensure at least __main__.__file__ is always absolute (even when using the -m switch). Longer term, I'd like to see __file__ and __path__ entries guaranteed to be *always* absolute, even when they're imported relative to the current working directory. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 27.04.12 02:39, Nick Coghlan wrote:
Is there a recommendable way to fix this? I would like to tell people what to do to make imports reliable. Either I put something into the top-level __init__ code, or I hack something into .pth or sitecustomize, and then forget about this. But I fear hacking __init__ is the only safe way that works without a special python setup, which makes the whole reasoning rather useless, because I can _not_ forget about this.... waah ;-)
cheers - chris

On Fri, Apr 27, 2012 at 10:39, Christian Tismer <tismer@stackless.com> wrote:
No, there isn't.
Yeah, to guarantee the semantics you are after you have to grab that '' entry in sys.path as early as possible and substitute it with the cwd so that its initial value propagates through the interpreter. Importlib is already having to jump through some hoops to treat it as '.' and even that doesn't get you what you want since that will change when the cwd is moved. I'm personally in favour of changing the insertion of '' to sys.path to inserting the cwd when the interpreter is launched.

On 27.04.12 22:00, Brett Cannon wrote:
Thanks Brett, that sounds pretty reasonable. '' always was too implicit for me.
cheers - chris

On Sat, Apr 28, 2012 at 6:00 AM, Brett Cannon <brett@python.org> wrote:
I'm personally in favour of changing the insertion of '' to sys.path to inserting the cwd when the interpreter is launched.
I'm not, because it breaks importing from the interactive prompt if you change directory after starting the session. The existing workaround for applications is pretty trivial:
    # Somewhere in your initialisation code
    import os
    import sys
    for i, entry in enumerate(sys.path):
        sys.path[i] = os.path.abspath(entry)
The fix for the import system is similarly trivial: call os.path.abspath when calculating __file__ (just as runpy now does and the import emulation in pkgutil always has).
Cheers, Nick.
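
To check that the workaround above covers the original moda/modb case, something like the following could be run. It is only a sketch: the package name moda_demo2 and both helper functions are invented for the demonstration, and the temporary directories stand in for a real project layout.

```python
import importlib
import os
import sys
import tempfile


def absolutize_sys_path():
    """Replace every sys.path entry, including '', with its absolute form."""
    sys.path[:] = [os.path.abspath(entry) for entry in sys.path]


def import_submodule_after_chdir(pkg_name="moda_demo2"):
    """Import a package from the cwd, chdir away, then import a submodule."""
    old_cwd = os.getcwd()
    old_path = list(sys.path)
    with tempfile.TemporaryDirectory() as top, \
            tempfile.TemporaryDirectory() as other:
        pkg_dir = os.path.join(top, pkg_name)
        os.mkdir(pkg_dir)
        for name in ("__init__.py", "modb.py"):
            with open(os.path.join(pkg_dir, name), "w"):
                pass
        try:
            os.chdir(top)
            sys.path.insert(0, "")
            absolutize_sys_path()  # '' now names `top` explicitly
            importlib.invalidate_caches()
            importlib.import_module(pkg_name)
            os.chdir(other)  # "some code happens to chdir away"
            sub = importlib.import_module(pkg_name + ".modb")
            return sub.__name__
        finally:
            # Restore interpreter state so the demo leaves no traces.
            for name in (pkg_name + ".modb", pkg_name):
                sys.modules.pop(name, None)
            sys.path[:] = old_path
            os.chdir(old_cwd)


if __name__ == "__main__":
    print(import_submodule_after_chdir())
```

The price is the one named above: once '' has been replaced, changing directory no longer changes what can be imported.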

On Sat, 28 Apr 2012 18:08:08 +1000, Nick Coghlan <ncoghlan@gmail.com> wrote:
Heh. I've never thought of doing that. I would not have expected it to work (change directory from the interactive prompt and be able to import something located in the new cwd). I don't know why I wouldn't have expected it to work, I just didn't. That said, could this insertion of '' only happen when the interactive prompt is actually posted, and otherwise use cwd? --David

On Sat, Apr 28, 2012 at 12:16, R. David Murray <rdmurray@bitdance.com> wrote:
If the decision to keep this entry around stands, can we consider changing it to '.' instead of the empty string? It mucks up stuff if you are not careful (e.g. ``os.listdir('')`` or ``"/".join(['', 'filename.py'])``).
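
The two pitfalls mentioned can be seen side by side in a short sketch. The helper names join_naively and can_list are invented here; the point is only to compare how '' and '.' behave when handed to ordinary path code.

```python
import os


def join_naively(directory, filename):
    """Join with '/' the way hand-written path code often does."""
    return "/".join([directory, filename])


def can_list(directory):
    """Return True if os.listdir() accepts the entry as given."""
    try:
        os.listdir(directory)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    for entry in ("", "."):
        print(repr(entry), "->", repr(join_naively(entry, "filename.py")),
              "listable:", can_list(entry))
```

With '' the naive join yields a path rooted at '/', whereas os.path.join('', 'filename.py') returns the bare filename, so code that iterates over sys.path has to special-case the empty string one way or another.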

On Sat, Apr 28, 2012 at 12:16 PM, R. David Murray <rdmurray@bitdance.com> wrote:
That said, could this insertion of '' only happen when the interactive prompt is actually posted, and otherwise use cwd?
That's already the case. Actually, sys.path[0] is *always* the absolute path of the script directory -- regardless of whether you invoked the script by a relative path or an absolute one, and regardless of whether you're importing 'site' -- at least on Linux and Cygwin and Windows, for all Python versions I've used regularly, and 3.2 besides. It isn't the value of cwd unless you happen to run a script from the same directory as the script itself. But even then, it's absolute, and not an empty string: the empty string is only present for interactive sessions.
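
A claim like this is easy to check against a given interpreter. The sketch below starts child interpreters and reports what each one sees as sys.path[0]; the helper names and the probe.py file are made up for the purpose, and the values may vary with Python version, platform and environment settings, so no output is asserted here.

```python
import ast
import os
import subprocess
import sys
import tempfile

# One-liner executed in the child interpreter.
PROBE = "import sys; print(repr(sys.path[0]))"


def _run(args, cwd=None):
    """Run a child interpreter and decode the repr() it printed."""
    out = subprocess.run([sys.executable] + args, cwd=cwd,
                         capture_output=True, text=True, check=True)
    return ast.literal_eval(out.stdout.strip())


def path0_for_script():
    """sys.path[0] when a script is started via a relative path."""
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "sub"))
        with open(os.path.join(top, "sub", "probe.py"), "w") as f:
            f.write(PROBE + "\n")
        return _run([os.path.join("sub", "probe.py")], cwd=top)


def path0_for_dash_c():
    """sys.path[0] when code is passed with the -c switch."""
    return _run(["-c", PROBE])


if __name__ == "__main__":
    print("script:", repr(path0_for_script()))
    print("-c:    ", repr(path0_for_dash_c()))
```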

On Sun, Apr 29, 2012 at 1:41 PM, PJ Eby <pje@telecommunity.com> wrote:
"-c" and "-m" also insert the empty string as sys.path[0] in order to find local files. They could just as easily insert the full cwd explicitly though, and, in fact, they arguably should. (I say arguably, because changing this *would* be a backwards incompatible change - there's no such issue with requiring __file__ to be absolute). If we fixed that, then you could only get relative filenames from the interactive prompt.
There's another way we can go with this, though: something I'm working on at the moment is having usage of the frozen importlib be *temporary*, switching to the full Python source version as soon as possible (i.e. as soon as the frozen version is able to retrieve the full version from disk). There's a trick that becomes possible if we go down that path: we can have some elements of importlib._bootstrap that *don't run* during the initial bootstrapping phase. Specifically, we can have module level code that looks like this:
    if __name__.startswith("importlib."):
        # Import system has been bootstrapped with the frozen version, we now have full stdlib access
        # and other parts of the interpreter have also been fully initialised
        from os.path import abspath as _abspath
        _debug_msg = print
    else:
        # Running from the frozen copy, there's things we can't do yet because the interpreter is not fully configured
        def _abspath(entry):
            # During the bootstrap process, we let relative paths slide. It will only happen if someone shadows the stdlib in their
            # current directory.
            return entry
        def _debug_msg(*args, **kwds):
            # Standard streams are not initialised yet
            pass
Cheers, Nick.

On 29.04.12 07:05, Nick Coghlan wrote:
As a note: I tried to find out where and when the empty string actually gets inserted into sys.path. Not very easy, I had to run the C debugger to understand that. It happens in sysmodule.c:
    PyMain
    PySys_SetArgv(argc-_PyOS_optind, argv+_PyOS_optind);
that calls
    PySys_SetArgvEx(int argc, char **argv, int updatepath)
and the logic whether to use the empty string or a full path etc. is deeply hidden in a C function as a side effect. Brrrrrr! It would be much cleaner and easier if that logic were left out of the C code today and a Python implementation were called instead. Is it in the plans to get rid of C for such stuff? I hope so :-)
cheers -- Chris

On Sat, Apr 28, 2012 at 04:08, Nick Coghlan <ncoghlan@gmail.com> wrote:
Who does that? I mean what possible need do you have to start the interpreter in one directory, but then need to chdir somewhere else where you are doing your actual importing from, and in a way where you can't simply attach the directory you want to use into sys.path?
You say trivial, I say a pain as that means porting over os.path.abspath() into importlib._bootstrap that works for all platforms. -Brett

On 4/28/2012 3:16 PM, Brett Cannon wrote:
Idle, at least on Windows, when started from the installed icon, starts in the directory of the associated pythonw.exe. There is no choice. And that is a bad place to put user files for import. So anyone using Idle and importing user files does just what you think is strange. Windows ain't *nix. If one opens a file in another directory*, that becomes the new current directory and imports from that directory work. I would not want that to change. I presume that changing '' to '.' would not change that. *and the easiest way to do *that* is from the 'recent files' list. I almost never type a path on Windows. -- Terry Jan Reedy

Brett Cannon wrote:
Me. You're asking this as if it were a bizarre and disturbing thing to do. It's not as if changing directory is an unsupported hack. When I use the Python interactive interpreter for interactive exploration or testing, sometimes I discover I'm in the wrong directory. If I've just started a fresh session, I'll probably just exit back to the shell, cd, then start Python again. But if there's significant history in the current session, I'll just change directories and continue on.
Of course I could manipulate sys.path. But chances are that I still have to change directory anyway, so that reading and writing data files go where I want without having to specify absolute paths. -- Steven

On 28.04.12 21:16, Brett Cannon wrote:
Well, it depends on which hat I'm wearing.
Scenario 1: I am designing a big application. This application shall run without problems, with disambiguated imports, and by no means should hit anything that is not meant to be imported. In this case, I need to remove '' from sys.path and replace it with an absolute entry. Update: I see this works already unless "-c" or "-m" is present (hum).
Scenario 2: I am playing with the application, want to try several modules, or even several versions of modules. I do use os.chdir() to get into a certain context, try imports, remove them again, chdir() to a different directory with a slightly changed module, et cetera. In this case, I need '' (or, as has been mentioned, '.') to have flexibility for testing, debugging and exploration.
These scenarios are both perfectly valid for their use cases, but they have pretty different implications for imports, and especially for sys.path. So the real question I was after was "can os.chdir() be freely used?" It would be great to get "yes" or "no", but the answer right now is "it depends".
cheers - chris

participants (9)
- Benjamin Peterson
- Brett Cannon
- Christian Tismer
- Glenn Linderman
- Nick Coghlan
- PJ Eby
- R. David Murray
- Steven D'Aprano
- Terry Reedy