[ python-Bugs-570300 ] inspect.getmodule symlink-related failur

SourceForge.net noreply at sourceforge.net
Fri Aug 13 20:47:26 CEST 2004


Bugs item #570300, was opened at 2002-06-17 18:24
Message generated for change (Comment added) made by bcannon
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=570300&group_id=5470

Category: Python Library
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 3
Submitted By: Amit Aronovitch (amitar)
>Assigned to: Brett Cannon (bcannon)
Summary: inspect.getmodule symlink-related failur

Initial Comment:
news:ae3e29$pib$1 at news.netvision.net.il

Description:
--------------

  On a Unix Python 2.2.1 installation I noticed that the documentation
generated for modules by pydoc (in any mode, even the help command) did NOT
contain any docs for functions.
  After some digging, I found the reason for that, and I now believe it
indicates a deeper problem with the "inspect" module concerning file
identification in the presence of symbolic or hard links. I explain it
below and also suggest solutions.

Analysis:
-----------

  The functions were dropped from the doc because pydoc tries to remove
functions that were imported from other modules. This is done by something
like "inspect.getmodule(my_func) is my_module". I found out that
inspect.getmodule() returned "None" for these functions!
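
The filter described above can be sketched roughly as follows. This is an illustration only, not pydoc's actual code: the helper name own_functions is hypothetical, and posixpath is used merely as a convenient standard-library module that re-exports functions defined in another module (genericpath).

```python
import inspect
import posixpath

def own_functions(module):
    """Names of functions whose defining module is `module` itself."""
    return [name
            for name, obj in inspect.getmembers(module, inspect.isfunction)
            # Keep a function only if getmodule() maps it back to `module`.
            if inspect.getmodule(obj) is module]

names = own_functions(posixpath)
print("join" in names)          # join is defined in posixpath
print("commonprefix" in names)  # commonprefix is imported from genericpath
```

If getmodule() returns None for every function, a filter of this shape drops all of them, which matches the empty documentation described in the report.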

  Now, inspect.getmodule works by getting the function's filename and then
looking it up in a dictionary containing the filenames of all the modules
that were loaded ("modulesbyfile"). Unfortunately, the filename that
getabsfile() returns for the function is not the same STRING as the one it
returns for the module, but rather an equivalent Unix path pointing to the
same FILE. (The reason is that the filename for a function is extracted
from the code object, which holds the path the module was referred to by at
the time it was COMPILED to .pyc, whereas the one for the module is taken
from its __file__, which holds the path it was referred to by when it was
IMPORTED. These two might differ even if it's the same file.)

  So the function's file is not found in the dictionary, and getmodule()
returns None...
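
The mismatch can be illustrated outside of "inspect" with a minimal sketch. It assumes a POSIX system where os.symlink is available; the names modules_by_file, real_dir and link_dir are illustrative only and are not taken from the inspect source.

```python
import os
import tempfile

# Build a directory "test" holding a module file, plus a symlink "test2" to it.
base = os.path.realpath(tempfile.mkdtemp())
real_dir = os.path.join(base, "test")
link_dir = os.path.join(base, "test2")
os.mkdir(real_dir)
os.symlink(real_dir, link_dir)
with open(os.path.join(real_dir, "test_mod.py"), "w") as f:
    f.write("def blah():\n    pass\n")

# Path the module was imported through (what __file__ would hold).
module_path = os.path.abspath(os.path.join(link_dir, "test_mod.py"))
# Path the module was compiled through (what the code object would hold).
function_path = os.path.abspath(os.path.join(real_dir, "test_mod.py"))

# A table keyed by path strings, in the spirit of "modulesbyfile".
modules_by_file = {module_path: "test_mod"}

# abspath() does not resolve symlinks, so the two strings differ ...
found = modules_by_file.get(function_path)
# ... even though both paths name one and the same file.
same_file = os.path.samefile(module_path, function_path)
print(found, same_file)
```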

Discussion:
--------------

  We see that the root cause of the problem is that "inspect" uses the
"absolute path" (os.path.abspath()) to represent the file's identity.
On Unix systems this might cause a problem, since this string is NOT unique
(it identifies a unique path, but different paths may refer to the same file).
If we only considered symbolic links, this could be resolved by scanning the
path elements and "unfolding" any symlinks, but we must recall that Unix
also has "hard links", which are equivalent references to the same inode and
can't be told apart.

  So, if we want to resolve the problem in a portable way, we need an
immutable (platform-dependent) object that will be unique to a FILE. This
object could then be used for comparing files and as a dictionary key.
  A reasonable way to get it would be by means of a new function in the os
module, e.g.:

 id = os.get_fileid(filename)
 def samefile(f1, f2): return os.get_fileid(f1) == os.get_fileid(f2)

  This function could be implemented by the inode number (os.stat(f).st_ino)
on Unix systems, and by the absolute path (os.path.abspath) on systems which
do not support links (such as Windows), or by anything else, as long as it
would be immutable and unique for each file.
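
One possible shape for such a function is sketched below. Note that os.get_fileid is only the name proposed in this report and is not part of the os module; get_fileid here is a hypothetical stand-alone helper. The sketch assumes a POSIX system and pairs the inode number with the device number, since inode numbers are only unique within one device. For plain comparison of two paths, the standard library already offers os.path.samefile.

```python
import os
import tempfile

def get_fileid(filename):
    """Hypothetical helper: a hashable identity for the file behind a path."""
    st = os.stat(filename)  # os.stat follows symbolic links
    # Inode numbers are only unique within one device, so pair them up.
    return (st.st_dev, st.st_ino)

def samefile(f1, f2):
    return get_fileid(f1) == get_fileid(f2)

base = tempfile.mkdtemp()
original = os.path.join(base, "mod.py")
hard = os.path.join(base, "mod_hard.py")
soft = os.path.join(base, "mod_soft.py")
other = os.path.join(base, "other.py")
for path in (original, other):
    with open(path, "w") as f:
        f.write("pass\n")
os.link(original, hard)     # hard link: a second name for the same inode
os.symlink(original, soft)  # symbolic link: resolved by os.stat

# The identity works as a dictionary key regardless of the name used.
by_id = {get_fileid(original): "mod"}
print(by_id.get(get_fileid(hard)), by_id.get(get_fileid(soft)))
print(samefile(original, other))
```

Unlike symlink resolution, this also treats the hard-linked name as the same file, which is the case the discussion above singles out.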


  Please let me know your opinion about this suggestion,

       Amit Aronovitch



----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2004-08-13 11:47

Message:
Logged In: YES 
user_id=357491

Checked in as rev. 1.52 for 'inspect'.  Not going to backport since it is a 
semantic change.

Thanks for the patch, Johannes.

----------------------------------------------------------------------

Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2004-08-13 10:08

Message:
Logged In: YES 
user_id=469548

I can reproduce the problem using the steps outlined below.
Replacing the line (it's not even worth creating a new patch
item):

modulesbyfile[getabsfile(module)] = module.__name__

with 

modulesbyfile[os.path.realpath(getabsfile(module))] = module.__name__

fixes the problem.
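
The effect of keying the table by os.path.realpath() can be sketched in isolation. This is a simplified stand-in and not the inspect source: build_table and lookup are hypothetical helpers, the sketch assumes a POSIX system with symlink support, and it also resolves the path being looked up, which the one-line change quoted above does not show.

```python
import os
import tempfile

def build_table(module_paths):
    """Map resolved file paths to module names (hypothetical helper)."""
    return {os.path.realpath(path): name
            for name, path in module_paths.items()}

def lookup(table, path):
    """Resolve the queried path as well, so both sides are comparable."""
    return table.get(os.path.realpath(path))

# Same layout as in the report: directory "test" and a symlink "test2" to it.
base = os.path.realpath(tempfile.mkdtemp())
real_dir = os.path.join(base, "test")
link_dir = os.path.join(base, "test2")
os.mkdir(real_dir)
os.symlink(real_dir, link_dir)
open(os.path.join(real_dir, "test_mod.py"), "w").close()

# Module imported via the symlink, function compiled via the real directory.
table = build_table({"test_mod": os.path.join(link_dir, "test_mod.py")})
print(lookup(table, os.path.join(real_dir, "test_mod.py")))
```

Resolving symlinks this way does not cover hard links, which have no "real" path to resolve to.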

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2004-07-12 17:26

Message:
Logged In: YES 
user_id=357491

Well, when a bug gets old and you don't have a test to make sure it has 
been fixed, yes, things just do "go away" in the stdlib.  The amount of 
code change in the stdlib can easily lead to some other bug being fixed.

And I did read it.  But when the problem stopped presenting itself to me 
(and I don't know why; I spent a good amount of time on this on the July 
10 Bug Day) I figured it was gone.  If I can't reproduce it I can't try to fix 
it.

But if you can come up with a patch to fix this feel free to assign it to 
me.

----------------------------------------------------------------------

Comment By: Amit Aronovitch (amitar)
Date: 2004-07-12 15:08

Message:
Logged In: YES 
user_id=564711

In my experience, problems don't just "go away" by
themselves. Someone needs to actually fix them.
 So, I tested on 2.4a - and results are EXACTLY THE SAME
(attached printout).

 It seems that no-one got to actually READ this lengthy
description, so I'll have to send patches. Sorry I did not
do that already, and sorry again but it seems I'm not going
to get to that soon enough. I'll try to get it done by the
end of July.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2004-07-10 11:23

Message:
Logged In: YES 
user_id=357491

Well, it looks like this problem has gone away, at least in 2.4.  Closing as 
outdated.

----------------------------------------------------------------------

Comment By: Amit Aronovitch (amitar)
Date: 2003-05-18 13:10

Message:
Logged In: YES 
user_id=564711

Sorry - it seems I forgot the most basic step in problem
reporting - the "how to repeat" :-) - so here it comes:

How to repeat:
----------------------
(as I said - you need unix & symlinks to see this happening):

~> mkdir test
~> setenv PYTHONPATH ~/test
~> cat >test/test_mod.py
"module doc"
def blah():
   "hello"
   pass
^D
~> python
>>> import test_mod
>>> help(test_mod)
>>> ^D
[ Prints help - so far so good - no problem - but see now]

~> ln -s test test2
~> setenv PYTHONPATH test2
~> python
>>> import test_mod
>>> help(test_mod)
[ Now the help shows up without the help of the blah function]

Relating the example to my explanations above:
------------------------------------------------------------------------
The help of the blah() function is filtered out 
because "inspect" takes "~/test/test_mod.pyc" as the function's 
filename and "~/test2/test_mod.pyc" as the module's 
filename. It can't tell that these are the same file (see 
details in my "Analysis" section above).
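
The two filenames in question can be inspected directly, as in the sketch below. It uses current Python 3 attribute names (__code__ rather than the func_code of the 2.x era) and loads the module from an explicit path with importlib, neither of which appears in the report. On current interpreters the two values will usually agree; the sketch only shows where each one comes from.

```python
import importlib.util
import os
import tempfile

# Write a small module like the one in the report.
base = tempfile.mkdtemp()
source = os.path.join(base, "test_mod.py")
with open(source, "w") as f:
    f.write('"module doc"\ndef blah():\n    "hello"\n    pass\n')

# Load it from an explicit path (the name "test_mod" follows the report).
spec = importlib.util.spec_from_file_location("test_mod", source)
test_mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(test_mod)

module_file = test_mod.__file__                      # path recorded for the module
function_file = test_mod.blah.__code__.co_filename   # path stored in the code object
print(module_file)
print(function_file)
# Comparing by file identity rather than by string:
print(os.path.samefile(module_file, function_file))
```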

  True, messing around with symlinks and PYTHONPATH like this is 
a bit ugly, but it is just to demonstrate the problem. The 
system where I noticed it is quite complex, with disks 
shared (automounted) across several platforms, and it 
needs a few symlinks to make things easier to maintain.
  As I explained, I think that a few small changes in modules 
such as "inspect" and "os" can make them identify files 
better in the presence of links.


----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-05-16 19:34

Message:
Logged In: YES 
user_id=357491

Just tested under 2.2.2 and 2.3b1 using a module containing just::

 def blah():
    """Hello"""
    pass

Ran ``help(test_mod)`` and had it spit out a FUNCTIONS section with the 
name of the function and its docstring.  Am I missing something here?

----------------------------------------------------------------------
