[ python-Bugs-570300 ] inspect.getmodule symlink-related failur

SourceForge.net noreply at sourceforge.net
Sun Sep 5 21:53:48 CEST 2004


Bugs item #570300, was opened at 2002-06-17 18:24
Message generated for change (Comment added) made by bcannon
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=570300&group_id=5470

Category: Python Library
>Group: Python 2.4
Status: Open
>Resolution: None
Priority: 3
Submitted By: Amit Aronovitch (amitar)
>Assigned to: Johannes Gijsbers (jlgijsbers)
Summary: inspect.getmodule symlink-related failur

Initial Comment:
news:ae3e29$pib$1 at news.netvision.net.il

Description:
--------------

  On a Unix python2.2.1 installation I noticed that the documentation
generated for modules by pydoc (in any mode - even the help command)
did NOT contain any docs for functions.
  After some digging, I found out the reason for that, and now I
believe it indicates a deeper problem with the "inspect" module,
concerning file identification in the presence of symbolic or hard
links, which I'll explain below, and also suggest solutions.

Analysis:
-----------

  The reason the functions were dropped from the doc was pydoc's
attempt to remove functions which were imported from other modules.
This is done by something like "inspect.getmodule(my_func) is
my_module". I found out that inspect.getmodule() returned "None" for
these functions!

  Now, inspect.getmodule works by getting the function's filename, and
then searching for it in a dictionary containing the filenames of all
the modules that were loaded ("modulesbyfile"). Unfortunately, the
filename that getabsfile() returns for the function is not the same
STRING as the one it returns for the module, but rather an equivalent
Unix path pointing to the same FILE. (The reason is that the filename
for a function is extracted from the code object, which holds the path
by which the module was referred to at the time it was COMPILED to
.pyc, whereas the one for the module is taken from its __file__, which
holds the path by which it was referred to when it was IMPORTED - these
two might differ even if it's the same file.)

  So, the function's file is not found in the dictionary, and
getmodule() returns None...
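
A minimal sketch of the lookup miss described above, assuming a POSIX
system where symbolic links can be created. The helper name
demo_lookup_miss and the temporary directory layout are made up for
illustration, and the plain dictionary only stands in for inspect's
"modulesbyfile"; it is not the inspect code itself.

```python
import os
import tempfile


def demo_lookup_miss():
    """Show that abspath() yields two different strings for one file."""
    with tempfile.TemporaryDirectory() as root:
        real_dir = os.path.join(root, "test")
        link_dir = os.path.join(root, "test2")
        os.mkdir(real_dir)
        os.symlink(real_dir, link_dir)  # test2 -> test

        real_path = os.path.join(real_dir, "test_mod.py")
        with open(real_path, "w") as f:
            f.write('"module doc"\n')
        link_path = os.path.join(link_dir, "test_mod.py")

        # Stand-in for modulesbyfile: keyed by the path used at import time.
        modules_by_file = {os.path.abspath(link_path): "test_mod"}

        # Look up with the path that was used at compile time instead.
        found = modules_by_file.get(os.path.abspath(real_path))
        # The operating system still agrees that both paths are one file.
        same_file = os.path.samefile(real_path, link_path)
        return found, same_file
```

Here the lookup returns None even though os.path.samefile() reports
that both paths name the same file, which mirrors the string-versus-file
distinction made above.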

Discussion:
--------------

  We see that the root cause of the problem is that "inspect" uses the
"absolute path" (os.path.abspath()) to represent the file's identity.
On Unix systems, this might cause a problem, since this string is NOT
unique (it is a unique path, but different paths may refer to the same
file). If we only considered symbolic links, this could be resolved by
scanning the path elements and "unfolding" any symlinks, but we must
recall that Unix also has "hard links", which are equivalent references
to the same inode and can't be told apart.

  So, if we want to resolve the problem in a portable way, we need an
immutable (platform-dependent) object that will be unique to a FILE.
This object could then be used for comparing files and as keys for
dictionaries.
  A reasonable way to get it would be by means of a new function in
the os module, e.g.:

 id = os.get_fileid(filename)
 def samefile(f1, f2): return os.get_fileid(f1) == os.get_fileid(f2)

  This function could be implemented by the inode number
(os.stat(f).st_ino) on Unix systems, and by the absolute path
(os.path.abspath) on systems which do not support links (such as
Windows), or by anything else, as long as it would be immutable and
unique for each file.
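
One way the proposed file id could look on a POSIX system is sketched
below. This is only an illustration of the idea: get_fileid is the
hypothetical function from the proposal, not an existing os API, and
the sketch pairs the inode number with the device number on the
assumption that inode numbers are unique only within one device. The
standard library's os.path.samefile() already offers a comparable
check for two paths.

```python
import os
import tempfile


def get_fileid(filename):
    """Hypothetical helper: a hashable value identifying the file itself."""
    st = os.stat(filename)  # os.stat() follows symbolic links
    # Inode numbers are only unique within one device, so pair the two.
    return (st.st_dev, st.st_ino)


def samefile(f1, f2):
    # The ids are tuples, so compare them by value with ==, not "is".
    return get_fileid(f1) == get_fileid(f2)


def demo_fileid():
    """Compare a file with its symlink, its hard link, and another file."""
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "mod.py")
        with open(original, "w") as f:
            f.write("pass\n")
        other = os.path.join(root, "other.py")
        with open(other, "w") as f:
            f.write("pass\n")

        soft = os.path.join(root, "soft.py")
        hard = os.path.join(root, "hard.py")
        os.symlink(original, soft)  # symbolic link
        os.link(original, hard)     # hard link

        return (
            samefile(original, soft),
            samefile(original, hard),
            samefile(original, other),
        )
```

Because the returned tuple is hashable, it could also serve as a
dictionary key in place of the path string.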


  Please let me know your opinion about this suggestion,

       Amit Aronovitch



----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2004-09-05 12:53

Message:
Logged In: YES 
user_id=357491

Reassigning to Johannes since he has checkin rights now.  =)

But honestly I think this bug is just not worth the hassle.  This is only an
issue if you are futzing with your file system in an uncommon way.
'help' is just for quick checks and thus if it doesn't work for *every*
situation it isn't going to be the end of the world.  Plus I don't like how
the patch touches so many files with the same chunk of code.

----------------------------------------------------------------------

Comment By: Amit Aronovitch (amitar)
Date: 2004-08-29 06:02

Message:
Logged In: YES 
user_id=564711

Pls see attached files.

Note that jlgijsbers' patch does not resolve the full scope
of the problem as described in my original post (see the
"discussion" part) - namely: it only works for symbolic links.

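The difference between the two kinds of links can be checked with a
short sketch, assuming a POSIX file system that supports both symbolic
and hard links. The function name demo_realpath_limits and the file
names are invented for the example.

```python
import os
import tempfile


def demo_realpath_limits():
    """Compare how realpath() treats a symlink and a hard link."""
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "mod.py")
        with open(original, "w") as f:
            f.write("pass\n")

        soft = os.path.join(root, "soft.py")
        hard = os.path.join(root, "hard.py")
        os.symlink(original, soft)  # symbolic link to mod.py
        os.link(original, hard)     # second directory entry, same inode

        canonical = os.path.realpath(original)
        return {
            # realpath() follows the symlink back to mod.py ...
            "symlink_resolved": os.path.realpath(soft) == canonical,
            # ... but a hard link is an ordinary name, so nothing changes.
            "hardlink_resolved": os.path.realpath(hard) == canonical,
            # Comparing device and inode still identifies the hard link.
            "hardlink_samefile": os.path.samefile(hard, original),
        }
```

The symlink collapses to the original path, while the hard link keeps
its own path and is only recognised by an inode-based comparison.
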
 bcannon (re 12/7 msg): Sorry. I wrote the long explanations
in hope it would save you time, but it seems they were not
clear enough. 
 To avoid trouble reproducing the problem, this time I'll
provide a shell script for testing it.
 Also provided is a proposed patch (against cvs snapshot
from 29 Aug 2004).

 About the patch
-----------------
 I added a "fileid" function to os.path (as suggested in my
original post). This means macpath, os2emxpath and ntpath had
to be touched as well as posixpath.
(libposixpath.tex would also need an update if you decide to
adopt this patch.)

 Question about inspect.getabsfile
-----------------------------------
 I'm not sure if this function is meant to be an "internal
use" or an "interface" function. It does not appear in the
module's documentation (libinspect.tex), but the pydoc
module still uses it (and as far as I could see, it's the
only module that uses it).
 After my patch, getabsfile is not used internally by
inspect anymore, so it should be deleted if it is "internal use". The
use of this function in pydoc is for human-readable output,
so I don't think it's really necessary there (I think
there's no need to do "normcase" there).

 tks for yr attention


----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2004-08-13 11:47

Message:
Logged In: YES 
user_id=357491

Checked in as rev. 1.52 for 'inspect'.  Not going to backport since it is a 
semantic change.

Thanks for the patch, Johannes.

----------------------------------------------------------------------

Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2004-08-13 10:08

Message:
Logged In: YES 
user_id=469548

I can reproduce the problem using the steps outlined below.
Replacing the line (it's not even worth creating a new patch
item):

modulesbyfile[getabsfile(module)] = module.__name__

with 

modulesbyfile[os.path.realpath(getabsfile(module))] = module.__name__

fixes the problem.
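
The effect of that one-line change can be sketched outside inspect,
assuming a POSIX system and assuming (as in the reproduction steps)
that the path recorded at compile time is already the real one. The
function name demo_realpath_key and the two small dictionaries are
illustrative stand-ins, not the actual inspect code.

```python
import os
import tempfile


def demo_realpath_key():
    """Compare a plain abspath key with a realpath-normalised key."""
    with tempfile.TemporaryDirectory() as tmp:
        # Resolve the temp dir first so only our own symlink matters.
        root = os.path.realpath(tmp)
        real_dir = os.path.join(root, "test")
        link_dir = os.path.join(root, "test2")
        os.mkdir(real_dir)
        os.symlink(real_dir, link_dir)  # test2 -> test

        compiled_as = os.path.join(real_dir, "test_mod.py")
        with open(compiled_as, "w") as f:
            f.write('"module doc"\n')
        imported_as = os.path.join(link_dir, "test_mod.py")

        # Old key: the path string exactly as seen at import time.
        old_index = {os.path.abspath(imported_as): "test_mod"}
        # New key: the same path with symbolic links resolved.
        new_index = {
            os.path.realpath(os.path.abspath(imported_as)): "test_mod"
        }

        lookup = os.path.abspath(compiled_as)
        return lookup in old_index, lookup in new_index
```

With the realpath-normalised key the lookup succeeds, because the
symlinked import path is resolved back to the directory the file was
compiled from.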

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2004-07-12 17:26

Message:
Logged In: YES 
user_id=357491

Well, when a bug gets old and you don't have a test to make sure it has 
been fixed, yes, things just do "go away" in the stdlib.  The amount of 
code change in the stdlib can easily lead to some other bug being fixed.

And I did read it.  But when the problem stopped presenting itself to me 
(and I don't know why; I spent a good amount of time on this on the July 
10 Bug Day) I figured it was gone.  If I can't reproduce it I can't try to fix 
it.

But if you can come up with a patch to fix this feel free to assign it to 
me.

----------------------------------------------------------------------

Comment By: Amit Aronovitch (amitar)
Date: 2004-07-12 15:08

Message:
Logged In: YES 
user_id=564711

In my experience, problems don't just "go away" by
themselves. Someone needs to actually fix them.
 So, I tested on 2.4a - and results are EXACTLY THE SAME
(attached printout).

 It seems that no-one got to actually READ this lengthy
description, so I'll have to send patches. Sorry I did not
do that already, and sorry again but it seems I'm not going
to get to that soon enough. I'll try to get it done by the
end of July.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2004-07-10 11:23

Message:
Logged In: YES 
user_id=357491

Well, looks like this problem has gone away, at least in 2.4.  Closing as
outdated.

----------------------------------------------------------------------

Comment By: Amit Aronovitch (amitar)
Date: 2003-05-18 13:10

Message:
Logged In: YES 
user_id=564711

Sorry - seems like I forgot the most basic step in problem
reporting - the "howtorepeat" :-) - so here it comes:

How to repeat:
----------------------
(as I said - you need unix & symlinks to see this happening):

~> mkdir test
~> setenv PYTHONPATH ~/test
~> cat >test/test_mod.py
"module doc"
def blah():
   "hello"
   pass
^D
~> python
>> import test_mod
>> help(test_mod)
>> ^D
[ Prints help - so far so good - no problem - but see now]

~> ln -s test test2
~> setenv PYTHONPATH test2
~> python
>> import test_mod
>> help(test_mod)
[ Now the help shows up without the help of the blah function]

Relating the example to my explanations above:
------------------------------------------------------------------------
The help of the blah() function is filtered out,
because "inspect" takes "~/test/test_mod.pyc" as its
filename, and "~/test2/test_mod.pyc" as the module's
filename. It can't tell that these are the same file (see
details in my "Analysis" section above).

  True, this messing around with symlinks and PYTHONPATH is
a bit ugly, but it is just to demonstrate the problem. The
system where I noticed it is quite complex, with disks
shared (automounted) across several platforms, and it
needs a few symlinks to make things easier to maintain.
  As I explained, I think that a few small changes in modules
such as "inspect" and "os" can make them identify files
better in the presence of links.


----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-05-16 19:34

Message:
Logged In: YES 
user_id=357491

Just tested under 2.2.2 and 2.3b1 using a module containing just::

 def blah():
    """Hello"""
    pass

Ran ``help(test_mod)`` and had it spit out a FUNCTIONS section with the
name of the function and its docstring.  Am I missing something here?

----------------------------------------------------------------------
