[Numpy-discussion] "import numpy" is slow

Fri Aug 1 15:46:41 EDT 2008

On Sat, Aug 2, 2008 at 1:18 AM, Christopher Barker
<Chris.Barker at noaa.gov> wrote:
>
> A lot! 41 entries, and lots of eggs -- are eggs an issue? I'm also
> wondering how the order is determined -- if it looked in site-packages
> first, it would find numpy a whole lot faster.

I don't think the number itself is an issue. Eggs have to come first,
I think; that's just how eggs are supposed to work.
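
To see where the eggs actually sit in the search order, something like
the sketch below could help. It is only an illustration: the helper
name summarize_path is made up here, it assumes a current Python 3
rather than the 2.5 used in this thread, and it treats any entry whose
name ends in ".egg" as an egg.

```python
import sys


def summarize_path(entries):
    """Return (total, egg_positions) for a list of sys.path-style entries.

    egg_positions holds the indices of entries whose name ends in ".egg",
    so it shows how early in the search order the eggs appear.
    """
    egg_positions = [
        index
        for index, entry in enumerate(entries)
        if str(entry).rstrip("/\\").endswith(".egg")
    ]
    return len(entries), egg_positions


if __name__ == "__main__":
    total, eggs = summarize_path(sys.path)
    print("%d entries on sys.path, eggs at positions %s" % (total, eggs))
```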

> I also tried:
>
> python -v -v -c "import numpy" &>junk2.txt
>
> which results in:
>
> # installing zipimport hook
> import zipimport # builtin
> # installed zipimport hook
> # trying
> /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site.so
> # trying
> /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/sitemodule.so
> # trying
> /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site.py
>
> ...
> ...
>
> And a LOT more:
>
> $ grep "# trying" junk2.txt | wc -l
>     7446
>
> For comparison:
> $ python -v -v -c "import sys" &>junk3.txt
> $ grep "# trying" junk3.txt | wc -l
>      618
>
> which still seems like a lot.
>
> So I think I've found the problem: it's looking in 7446 places! But why?

Part of it is how python looks for modules. Again, I don't think the
number itself is the issue: nonexistent files should not matter much,
because a python import is basically doing a stat, and a stat on a
nonexistent file costs next to nothing once the filesystem cache is hot.
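
A quick way to check that claim on a given machine is to time a few
thousand failed stats directly. This is a minimal sketch, assuming a
current Python 3; the function name time_missing_stats and the made-up
file names are only for illustration, and 7446 is simply the count
from the grep output quoted above.

```python
import os
import tempfile
import time


def time_missing_stats(n=7446):
    """Time n os.stat calls on paths that do not exist.

    Returns (misses, elapsed_seconds). The paths live in a fresh,
    empty temporary directory, so every stat is guaranteed to fail.
    """
    misses = 0
    with tempfile.TemporaryDirectory() as base:
        start = time.perf_counter()
        for i in range(n):
            try:
                os.stat(os.path.join(base, "no_such_module_%d.py" % i))
            except OSError:
                misses += 1
        elapsed = time.perf_counter() - start
    return misses, elapsed


if __name__ == "__main__":
    misses, elapsed = time_missing_stats()
    print("%d failed stats in %.4f s" % (misses, elapsed))
```

If the total comes out far below the import time you are seeing, the
failed lookups alone are unlikely to explain the slowdown.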

IOW, I don't think the problem is the numbers themselves. It has to be
something else. A simple profiling run like

python -m cProfile -o foo.stats foo.py

and then:

python -c "import pstats; p = pstats.Stats('foo.stats');
p.sort_stats('cumulative').print_stats(50)"

may give useful information. This, together with Shark as Robert
suggested, should point in some direction.
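
The two commands above can also be rolled into one in-process sketch.
This assumes a current Python 3, and the helper name profile_import is
made up for the example. Note that a module that is already imported
will show almost nothing, so run it in a fresh interpreter.

```python
import cProfile
import io
import pstats


def profile_import(module_name, top=50):
    """Profile importing module_name and return the pstats report as text.

    Entries are sorted by cumulative time, as in the command-line
    version, and limited to the first `top` lines.
    """
    profiler = cProfile.Profile()
    # runcall profiles just this one call to the import machinery.
    profiler.runcall(__import__, module_name)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_import("numpy"))
```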

cheers,

David