[Distutils] short circuiting module lookups

Noah Gift noah.gift at gmail.com
Wed Apr 8 06:25:09 CEST 2009


On Wed, Apr 8, 2009 at 6:55 AM, P.J. Eby <pje at telecommunity.com> wrote:
> At 02:23 PM 4/7/2009 -0400, Jim Fulton wrote:
>
>> On Apr 7, 2009, at 9:28 AM, P.J. Eby wrote:
>>
>>> At 11:54 PM 4/7/2009 +1200, Noah Gift wrote:
>>>>
>>>> 1.  In the case of entry points for setuptools, it actually recurses
>>>> into EVERY egg directory in your path, not just the egg you
>>>> requested,
>>>> adds them to your sys.path and additionally looks for four files
>>>> inside of every egg.  On a laptop on local storage, this doesn't
>>>> matter, but when thousands of machines hit the same filer, with many
>>>> python processes, bad things happen...
>>>
>>> Install your eggs with --multi-version, and then only the eggs that
>>> are required for the running script will be added to sys.path or
>>> have their contents opened.  (Installing them as zip files rather
>>> than directories may also speed this up.)
>>
>>
>> My experience on Linux is that installing eggs as Zip files slows
>> imports.
>
> In general, perhaps.  But if they're not actually *on* sys.path, as I
> proposed above, then it should not slow down all imports, and instead should
> speed up the entry point lookups.  Were your tests using --multi-version
> install (i.e., eggs not on sys.path)?
>

Thanks for the info; I was not aware of --multi-version.  I have a very
unusual situation, so I may not be able to handle normal use cases very
well.  As one simple test, I manually crafted sys.path and got the
following speed improvement, measured with the time command and
strace.  This is probably not what most people want, but it is
interesting to see that this is possible for people in my situation
who need raw speed over any possible flexibility.
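
For anyone curious what "manually crafting sys.path" can look like,
here is a minimal sketch, not the exact code from my test.  The helper
name pin_sys_path and the egg paths in the usage note are hypothetical.
It assumes you already know which directories the script needs, and it
treats any entry ending in .egg as something to drop unless it is
requested explicitly.

```python
import sys


def pin_sys_path(required_dirs, keep_non_eggs=True):
    """Return a trimmed path list: the requested directories first,
    then (optionally) the existing sys.path entries that are not eggs.

    Hypothetical helper for illustration only; it does not modify
    sys.path itself.
    """
    if keep_non_eggs:
        # Keep stdlib/site entries, drop every egg that was not requested.
        base = [p for p in sys.path
                if not p.rstrip("/\\").endswith(".egg")]
    else:
        base = []

    trimmed = []
    for entry in list(required_dirs) + base:
        if entry not in trimmed:  # preserve order, skip duplicates
            trimmed.append(entry)
    return trimmed
```

Usage would be something like
sys.path[:] = pin_sys_path(["/filer/eggs/myapp-1.0.egg"]) at the very
top of the script, before anything else is imported.  This only
affects lookups that happen after the assignment, and it gives up the
flexibility of automatic dependency resolution.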


Total Elapsed Time:  2066 % speed improvement
Lines of strace output:  3050| 1695 % reduction in calls to file system
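
Counting raw lines of strace output is only a rough proxy.  A slightly
finer measure is to count just the file system calls.  The sketch below
assumes the usual strace text format (one call per line, optionally
prefixed with a pid when -f is used).  The function name and the list
of syscalls counted are my own choices for illustration.

```python
import re

# Syscalls treated as "file system calls" here; an assumption, adjust as needed.
FS_CALLS = ("open", "openat", "stat", "lstat", "access", "readlink")

# Optional "[pid 123] " or "123 " prefix, then the syscall name and "(".
_CALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")


def count_fs_calls(strace_text):
    """Return a dict mapping syscall name to the number of times it
    appears in the given strace output, for the names in FS_CALLS."""
    counts = {}
    for line in strace_text.splitlines():
        match = _CALL_RE.match(line)
        if match and match.group(1) in FS_CALLS:
            name = match.group(1)
            counts[name] = counts.get(name, 0) + 1
    return counts
```

Run it over the strace output captured before and after the sys.path
change and compare the two dictionaries to see where the reduction
comes from.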



-- 
Cheers,

Noah

