
On 6 Feb, 11:53 pm, guido@python.org wrote:
> On Sat, Feb 6, 2010 at 3:22 PM, <exarkun@twistedmatrix.com> wrote:
>> On 10:29 pm, guido@python.org wrote:
>>> [snip]
>>> I haven't tried to repro this particular example, but the reason is that we don't want to have to call getpwd() on every import nor do we want to have some kind of in-process variable to cache the current directory. (getpwd() is relatively slow and can sometimes fail outright, and trying to cache it has a certain risk of being wrong.)
>> Assuming you mean os.getcwd():
> Yes.
>> exarkun@boson:~$ python -m timeit -s 'def f(): pass' 'f()'
>> 10000000 loops, best of 3: 0.132 usec per loop
>> exarkun@boson:~$ python -m timeit -s 'from os import getcwd' 'getcwd()'
>> 1000000 loops, best of 3: 1.02 usec per loop
>> exarkun@boson:~$
>>
>> So it's about 7x more expensive than a no-op function call. I'd call this pretty quick. Compared to everything else that happens during an import, I'm not convinced this wouldn't be lost in the noise. I think it's at least worth implementing and measuring.
> But it's a system call, and its speed depends on a lot more than the speed of a simple function call. It depends on the OS kernel, possibly on the filesystem, and so on.
Do you know of a case where it's actually slow? If not, how convincing should this argument really be? Perhaps we can measure it on a few platforms before passing judgement. For reference, my numbers are from Linux 2.6.31 and my filesystem (though I don't think it really matters) is ext3. I have eglibc 2.10.1 compiled by gcc version 4.4.1.
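If it helps anyone measure on their own platform, the same comparison can be run as a small standalone script rather than from the shell. This is only a rough sketch; the loop count and output formatting are arbitrary:

    import timeit

    # Compare a no-op Python function call against os.getcwd(), mirroring
    # the timeit command lines quoted above.  Results will vary with the
    # OS kernel and C library.
    number = 1000000
    for label, stmt, setup in [
        ('no-op function call', 'f()', 'def f(): pass'),
        ('os.getcwd()', 'getcwd()', 'from os import getcwd'),
    ]:
        best = min(timeit.Timer(stmt, setup=setup).repeat(3, number))
        print('%-20s %.3f usec per loop' % (label, best / number * 1e6))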
Also "os.getcwd()" abstracts away various platform details that the C import code would have to replicate.
That logic can all be hidden behind a C API which os.getcwd() can then be implemented in terms of. There's no reason for it to be any harder to invoke from C than it is from Python.
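To be clear about what "that logic" amounts to, at the Python level the normalization in question is just a single join against the current working directory, done once when the module is loaded. A toy illustration only, not how the C code is or would be structured, and the helper name is made up:

    import os

    def absolutize(path):
        # Hypothetical helper: anchor a relative module filename to the
        # process's current working directory at the moment of import.
        # An already-absolute path is returned unchanged.
        if os.path.isabs(path):
            return path
        return os.path.join(os.getcwd(), path)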
> Really, the approach of preprocessing sys.path makes much more sense. If an app wants sys.path[0] to be an absolute path too they can modify it themselves.
That may turn out to be the less expensive approach, but I'm not sure in what other way it makes much more sense. Quite the opposite: centralizing the responsibility for normalizing this value makes a lot of sense if you consider things like reducing code duplication and, in turn, the opportunity for bugs.

Adding better documentation for __file__ is another task which I think is worth undertaking, regardless of whether any change is made to how its value is computed. At the moment, the two or three sentences about it in PEP 302 are all I've been able to find, and they don't really get the job done.

Jean-Paul
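P.S. For what it's worth, the sys.path preprocessing described above is easy enough for an application to do for itself at startup, before any further imports. A rough, untested sketch:

    import os
    import sys

    # Make the script-directory entry absolute; os.path.abspath('')
    # resolves the conventional empty entry to the current directory.
    sys.path[0] = os.path.abspath(sys.path[0])

    # Or, if every entry should be absolute:
    sys.path = [os.path.abspath(entry) for entry in sys.path]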