[Python-Dev] inspect.py very slow under 2.5

Ralf Schmitt ralf at brainbot.com
Wed Sep 6 16:53:30 CEST 2006


Nick Coghlan wrote:
> Ralf Schmitt wrote:
>> Nick Coghlan wrote:
>>> It looks like the problem is the call to getabspath() in getmodule(). 
>>> This happens every time, even if the file name is already in the 
>>> modulesbyfile cache. This calls os.path.abspath() and 
>>> os.path.normpath() every time that inspect.findsource() is called.
>>>
>>> That can be fixed by having findsource() pass the filename argument to 
>>> getmodule(), and adding a check of the modulesbyfile cache *before* 
>>> the call to getabspath().
>>>
>>> Can you try this patch and see if you get 2.4 level performance back 
>>> on Fernando's test?:
>> no. this doesn't work. getmodule always iterates over 
>> sys.modules.values() and only returns None afterwards.
>> One would have to cache the bad file value, or only inspect new/changed 
>> modules from sys.modules.
> 
> Good point. I modified the patch so it does the latter (it only calls 
> getabspath() again for a module if the value of module.__file__ changes).

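With

    _filesbymodname[modname] = file

changed to

    _filesbymodname[modname] = f

it seems to work OK. (file is the absolute path of the object being
looked up; the cache has to remember the module's own __file__, which
is f, or the "unchanged" check never matches.)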

diff -r d41ffd2faa28 inspect.py
--- a/inspect.py	Wed Sep 06 13:01:12 2006 +0200
+++ b/inspect.py	Wed Sep 06 16:52:39 2006 +0200
@@ -403,6 +403,7 @@ def getabsfile(object, _filename=None):
      return os.path.normcase(os.path.abspath(_filename))

  modulesbyfile = {}
+_filesbymodname = {}

  def getmodule(object, _filename=None):
      """Return the module an object was defined in, or None if not found."""
@@ -410,17 +411,23 @@ def getmodule(object, _filename=None):
          return object
      if hasattr(object, '__module__'):
          return sys.modules.get(object.__module__)
+    if _filename is not None and _filename in modulesbyfile:
+        return sys.modules.get(modulesbyfile[_filename])
      try:
          file = getabsfile(object, _filename)
      except TypeError:
          return None
      if file in modulesbyfile:
          return sys.modules.get(modulesbyfile[file])
-    for module in sys.modules.values():
+    for modname, module in sys.modules.iteritems():
          if ismodule(module) and hasattr(module, '__file__'):
+            f = module.__file__
+            if f == _filesbymodname.get(modname, None):
+                continue
+            _filesbymodname[modname] = f
              f = getabsfile(module)
              modulesbyfile[f] = modulesbyfile[
-                os.path.realpath(f)] = module.__name__
+                os.path.realpath(f)] = modname
      if file in modulesbyfile:
          return sys.modules.get(modulesbyfile[file])
      main = sys.modules['__main__']
@@ -444,7 +451,7 @@ def findsource(object):
      in the file and the line number indexes a line in that list.  An IOError
      is raised if the source code cannot be retrieved."""
      file = getsourcefile(object) or getfile(object)
-    module = getmodule(object)
+    module = getmodule(object, file)
      if module:
          lines = linecache.getlines(file, module.__dict__)
      else:
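
For anyone who wants to see the caching idea in isolation, here is a
minimal standalone sketch. It is not the inspect.py code: the names
lookup, refresh_cache, modules_by_file, _files_by_modname and
abspath_calls are made up for illustration, and it takes a plain dict
of modules instead of reading sys.modules. It only shows the two
checks the patch adds: a cache lookup before any path normalisation,
and skipping modules whose __file__ has not changed since the last
scan.

```python
import os
import types

# Hypothetical stand-ins for the module-level caches used in the patch.
modules_by_file = {}    # normalised file name -> module name
_files_by_modname = {}  # module name -> module.__file__ seen at last scan
abspath_calls = [0]     # counter, only here to make the saving visible


def _abs_file(path):
    """Normalise a path, counting how often we pay for it."""
    abspath_calls[0] += 1
    return os.path.normcase(os.path.abspath(path))


def refresh_cache(modules):
    """Index only the modules that are new or whose __file__ changed."""
    for modname, module in list(modules.items()):
        f = getattr(module, '__file__', None)
        if not isinstance(module, types.ModuleType) or f is None:
            continue
        if f == _files_by_modname.get(modname):
            continue  # unchanged since the last scan, skip the path work
        _files_by_modname[modname] = f
        absf = _abs_file(f)
        modules_by_file[absf] = modules_by_file[os.path.realpath(absf)] = modname


def lookup(filename, modules):
    """Return the name of the module defined in filename, or None."""
    # Fast path: no normalisation at all if the name is already cached.
    if filename in modules_by_file:
        return modules_by_file[filename]
    absf = _abs_file(filename)
    if absf in modules_by_file:
        return modules_by_file[absf]
    refresh_cache(modules)
    return modules_by_file.get(absf)
```

With this layout a repeated lookup of a known file does no path work,
and a lookup of an unknown file costs one normalisation for the file
itself rather than one per loaded module.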


