[Numpy-discussion] "import numpy" is slow

Andrew Dalke dalke at dalkescientific.com
Fri Jul 4 08:22:59 EDT 2008


On Jul 3, 2008, at 9:06 AM, Robert Kern wrote:
> Can you try the SVN trunk?

Sure.  Though did you know it's not easy to find out how to get numpy
from SVN?  I had to go to the second page of Google results, which
linked to someone's talk.

I expected to find a link to it at http://numpy.scipy.org/, just as I
expected to find a link to the numpy mailing list there.

Okay, compiled.

[josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c 'pass'
0.015u 0.042s 0:00.06 83.3%     0+0k 0+0io 0pf+0w
[josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c 'import numpy'
0.084u 0.231s 0:00.33 93.9%     0+0k 0+8io 0pf+0w
[josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke%

Previously it took 0.44 seconds, so it's now about 24% faster.
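
A rough way to repeat that comparison from Python itself, as a sketch rather than a careful benchmark: time a bare interpreter start and an interpreter start plus the import, each in a fresh process, and subtract. The helper names best_run_time and import_cost are made up for illustration, and the code assumes a current Python 3 rather than the 2.5 used above.

```python
import subprocess
import sys
import time


def best_run_time(statement, repeats=5):
    """Return the fastest wall-clock time for `python -c statement`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # A fresh interpreter each time, so no module is already cached.
        subprocess.run([sys.executable, "-c", statement], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


def import_cost(module_name, repeats=5):
    """Estimate import cost as (import run) minus (bare interpreter startup)."""
    baseline = best_run_time("pass", repeats)
    with_import = best_run_time("import " + module_name, repeats)
    return with_import - baseline
```

Calling import_cost("numpy") would then give a number comparable to the difference between the two `time` runs above, provided numpy is importable by that interpreter. Taking the minimum of several runs is one common way to damp disk-cache noise; it is a choice made here, not something the measurements above did.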


> I would be interested to know how significantly it improves your  
> use case.


For one of my clients I wrote a tool to analyze import times.  I  
don't have it, but here's something similar I just now whipped up:

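import time

seen = set()          # module names already timed
import_order = []     # (name, level, parent) in first-import order
elapsed_times = {}    # name -> seconds, including time spent in children
level = 0             # current nesting depth of imports
parent = None         # name of the module currently doing the importing

def new_import(name, globals, locals, fromlist):
    global level, parent
    # Only the first import of a name is timed; repeats pass straight through.
    if name in seen:
        return old_import(name, globals, locals, fromlist)
    seen.add(name)
    import_order.append((name, level, parent))
    t1 = time.time()
    old_parent = parent
    parent = name
    level += 1
    try:
        return old_import(name, globals, locals, fromlist)
    finally:
        # Restore the bookkeeping even if the import raised (e.g. an
        # optional module guarded by try/except ImportError).
        level -= 1
        parent = old_parent
        t2 = time.time()
        elapsed_times[name] = t2-t1

# In a script run as __main__, __builtins__ is the builtin module itself.
old_import = __builtins__.__import__

# Install the hook, then trigger the import to be measured.
__builtins__.__import__ = new_import

import numpy

parents = {}
for name, level, parent in import_order:
    parents[name] = parent

print "== Tree =="
for name, level, parent in import_order:
    print "%s%s: %.3f (%s)" % (" "*level, name, elapsed_times[name], parent)

print "\n"
print "== Slowest (including children) =="
slowest = sorted((t, name) for (name, t) in elapsed_times.items())[-20:]
for elapsed_time, name in slowest[::-1]:
    print "%.3f %s (%s)" % (elapsed_time, name, parents[name])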


The result using the version out of subversion is

== Tree ==
numpy: 0.237 (None)
  numpy.__config__: 0.000 (numpy)
  version: 0.000 (numpy)
   os: 0.000 (version)
   imp: 0.000 (version)
  _import_tools: 0.024 (numpy)
   sys: 0.000 (_import_tools)
   glob: 0.024 (_import_tools)
    fnmatch: 0.020 (glob)
     re: 0.018 (fnmatch)
      sre_compile: 0.009 (re)
       _sre: 0.000 (sre_compile)
       sre_constants: 0.004 (sre_compile)
      sre_parse: 0.006 (re)
      copy_reg: 0.000 (re)
  add_newdocs: 0.156 (numpy)
   lib: 0.150 (add_newdocs)
    info: 0.000 (lib)
    numpy.version: 0.000 (lib)
    type_check: 0.091 (lib)

   ... many lines removed ...

  mtrand: 0.021 (numpy)
  ctypeslib: 0.024 (numpy)
   ctypes: 0.023 (ctypeslib)
    _ctypes: 0.003 (ctypes)
    gestalt: 0.013 (ctypes)
    ctypes._endian: 0.001 (ctypes)
   numpy.core._internal: 0.000 (ctypeslib)
  ma: 0.005 (numpy)
   extras: 0.001 (ma)
    numpy.lib.index_tricks: 0.000 (extras)
    numpy.lib.polynomial: 0.000 (extras)


== Slowest (including children) ==
0.237 numpy (None)
0.156 add_newdocs (numpy)
0.150 lib (add_newdocs)
0.091 type_check (lib)
0.090 numpy.core.numeric (type_check)
0.049 io (lib)
0.048 numpy.testing (numpy.core.numeric)
0.024 _import_tools (numpy)
0.024 ctypeslib (numpy)
0.024 glob (_import_tools)
0.023 ctypes (ctypeslib)
0.022 utils (numpy.testing)
0.022 difflib (utils)
0.021 mtrand (numpy)
0.020 fnmatch (glob)
0.020 _datasource (io)
0.020 tempfile (io)
0.018 re (fnmatch)
0.018 heapq (difflib)
0.013 gestalt (ctypes)

Each module is reported only the first time it is imported, so removing,
say, the 'glob' import from _import_tools doesn't mean the cost goes
away: glob may simply be imported, and its time charged, somewhere else.
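
One possible way around the first-import-only limitation is to log every module that asks for a name, while still timing only the first load. The following is a sketch under assumptions, not the tool mentioned above: it targets a current Python 3 (where the hook lives in the builtins module and takes a relative-import level argument), and the names trace_imports, timed_import and importers are hypothetical.

```python
import builtins
import time
from collections import defaultdict


def trace_imports(module_name):
    """Import module_name, timing first-time imports and logging every importer."""
    seen = set()
    import_order = []               # (name, depth, parent), first sighting only
    elapsed_times = {}              # name -> seconds, children included
    importers = defaultdict(list)   # name -> every module that asked for it
    state = {"depth": 0, "parent": None}
    original_import = builtins.__import__

    def timed_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Relative imports arrive with level > 0; prefix dots to keep keys distinct.
        key = "." * level + name
        # The importer's globals are passed in; none given means a direct call.
        requester = globals.get("__name__", "?") if globals else "<direct>"
        importers[key].append(requester)
        if key in seen:
            return original_import(name, globals, locals, fromlist, level)
        seen.add(key)
        import_order.append((key, state["depth"], state["parent"]))
        saved_parent = state["parent"]
        state["parent"] = key
        state["depth"] += 1
        start = time.perf_counter()
        try:
            return original_import(name, globals, locals, fromlist, level)
        finally:
            # Record the time and unwind the bookkeeping even on ImportError.
            elapsed_times[key] = time.perf_counter() - start
            state["depth"] -= 1
            state["parent"] = saved_parent

    builtins.__import__ = timed_import
    try:
        timed_import(module_name)
    finally:
        # Always put the real import function back.
        builtins.__import__ = original_import
    return import_order, elapsed_times, dict(importers)
```

After order, elapsed, importers = trace_imports("numpy"), a lookup such as importers.get("glob") should list every module that requested glob, not only the one that happened to pay for loading it. Modules already loaded before the call still show up with near-zero times, so this is best run in a fresh interpreter.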


				Andrew
				dalke at dalkescientific.com




