[Distutils] namespace packages

David Cournapeau cournape at gmail.com
Fri Apr 23 09:23:42 CEST 2010


On Fri, Apr 23, 2010 at 2:03 PM, P.J. Eby <pje at telecommunity.com> wrote:
> At 10:16 AM 4/23/2010 +0900, David Cournapeau wrote:
>>
>> In my case, it is not even the issue of many eggs (I always install
>> things with --single-version-externally-managed and I forbid any code
>> to write into  easy_install.pth). Importing pkg_resources alone
>> (python -c "import pkg_resources") takes half a second on my netbook.
>
> I find that weird, to say the least.  On my desktop just now, with a
> sys.path 79 entries long (including 41 .eggs), it's a "blink and you missed
> it" operation.  I'm curious what the difference might be.
>
> (Running timeit -s 'import pkg_resources' 'reload(pkg_resources)' gives a
> timing result of 61.9 milliseconds for me.)

I should re-emphasize that the half-second number was on a netbook,
which is a very weak machine on every count (CPU, memory size and
disk capabilities). But using pkg_resources for console_scripts in the
package I am working on made a big difference (more time is spent
importing pkg_resources than on everything else). Since we are talking
about import times, I guess the issue is the same as for namespace
packages. I have noticed this slow behavior on every machine I have
ever had my hands on, be it mine or someone else's, on Linux, Windows
or Mac OS X.
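
For anyone who wants to compare numbers across machines, something
like the following could serve as a rough measurement. It is only a
sketch: the helper names (startup_seconds, import_cost_seconds) are
made up for this example, and it times a fresh interpreter each run
instead of using reload(), so the figures include filesystem access
and are not directly comparable to the timeit result quoted above.

```python
import subprocess
import sys
import time


def startup_seconds(statement="", repeat=5):
    """Best-of-N wall time to run `python -c statement` in a new process."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        # A fresh interpreter per run, so nothing is cached in sys.modules.
        subprocess.run([sys.executable, "-c", statement], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def import_cost_seconds(module, repeat=5):
    """Approximate cost of importing `module`, net of bare startup time."""
    baseline = startup_seconds("", repeat)
    with_import = startup_seconds("import " + module, repeat)
    # The result is noisy and can even be slightly negative for cheap imports.
    return with_import - baseline
```

Calling import_cost_seconds("pkg_resources") gives a rough per-machine
figure, assuming setuptools is installed for that interpreter. Taking
the best of several runs reduces noise from other processes, but the
first run after a reboot will still be dominated by disk reads.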

My (limited) understanding of pkg_resources is that its import time
scales linearly with the number of packages it is aware of, and that
it needs to scan a few directories for every package. Importing
pkg_resources causes many more syscalls than importing relatively big
packages (~ 1000 for python -c "", 3000 for importing one of
numpy/wx/gtk, 6000 for pkg_resources). Assuming those are unavoidable
(and the current namespace implementation in setuptools requires
them, right?), I don't see a way to reduce that cost significantly.

cheers,

David

