[Numpy-discussion] slow import of numpy modules

David Cournapeau cournapeau at cslab.kecl.ntt.co.jp
Thu Jul 3 00:14:49 EDT 2008


On Wed, 2008-07-02 at 21:50 -0500, Robert Kern wrote:
> 
> So ... what were you referring to?

To an earlier email from Matthieu in this thread (or Stefan?).

> 
> There is special purpose code, yes. We used to use it to load proxy
> objects for scipy subpackages such that "import scipy" would have
> scipy.stats semi-immediately available. We have stopped using it
> because of fragility, confusing behavior at the interpreter, py2exe
> problems, and my general abhorrence of things which mess too deeply
> with imports. It is not a general-purpose solution for lazily-loading
> stdlib modules, I don't think.

I was afraid of something like this.

> 
> > Because we could
> > win between 20 and 40 % time of import by lazily importing a few modules
> > (namely urllib, which I guess is not often used, and already takes
> > around 20-30 ms; inspect and compiler are taking a long time too, but
> > maybe those are always needed, I have not checked carefully). Maybe this
> > would be complicated to implement for numpy, though.
> 
> These imports could easily be pushed down into the handful of
> functions that need them (with an appropriate comment about why they
> are down there). There is no need to have complicated machinery
> involved.
> 
> Do you have a breakdown of the import costs?

I don't have the precise timings/scripts at the moment, but even
using a really crude method:
	- urllib2 (in numpy.lib._datasource) by itself takes 30 ms out of
180 ms. That's an easy 20 % win, since it is not often called.
	- inspect in numpy.lib.utils: this costs around 25 ms

If I just comment out the above imports, I go from 180 to 120 ms.
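
As a minimal sketch of what pushing those imports down into the
functions that need them could look like: the function names below
(is_url, describe_signature) are hypothetical, not the actual numpy
code, and the sketch assumes Python 3, where the URL helpers live in
urllib.parse rather than urllib2.

```python
def is_url(path):
    """Return True if `path` looks like a remote URL."""
    # Deferred import: urllib is only loaded on the first call, so
    # importing this module does not pay for it up front.
    from urllib.parse import urlparse
    return urlparse(path).scheme in ("http", "https", "ftp")


def describe_signature(func):
    """Return the signature of `func` as a string."""
    # Deferred import: inspect is only needed by this helper.
    import inspect
    return str(inspect.signature(func))
```

After the first call the module sits in sys.modules, so later calls
only pay for a dictionary lookup.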

Then, something which takes an awful lot of time is finfo, to get the
floating-point limits. This takes around 30-40 ms. I wonder if there are
ways to make it faster. After that, there is no obvious spot I remember,
but I can get the numbers tonight when I go back to my lab.
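
For reference, a crude way to get such numbers is to time a fresh
interpreter with and without the import. This is only a sketch under
that assumption, not the script behind the figures above; the helper
names run_time and import_cost are made up for illustration.

```python
import subprocess
import sys
import time


def run_time(code, repeats=5):
    """Best wall-clock time (seconds) to run `code` in a fresh interpreter."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def import_cost(module, repeats=5):
    """Rough import cost: run time with the import minus bare startup."""
    baseline = run_time("pass", repeats)
    return run_time("import " + module, repeats) - baseline
```

Calling import_cost("numpy") gives a rough total; the result is noisy
and can even come out slightly negative for very cheap modules. On
Python 3.7 and later, running "python -X importtime -c 'import numpy'"
prints a per-module breakdown instead.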

cheers,

David



