[Distutils] idea for Distribute: make unzipped eggs be the default

P.J. Eby pje at telecommunity.com
Tue Jul 28 19:15:17 CEST 2009


At 08:23 AM 7/28/2009 -0600, Zooko Wilcox-O'Hearn wrote:
>I think it might improve performance a bit.
>(Yes, that's right -- we haven't done a real measurement of
>performance, but the few times that people briefly glanced at
>performance it seemed like zipping the eggs made them slower to load,
>not faster.)

You're trading startup time for import time, actually.  Unzipped eggs 
force OS-level stat calls during the import of every module, whereas 
zipped eggs force zipfile directory reads at startup.

In both cases, the time cost mostly occurs at startup; the difference 
is in whether you have more imports or more eggs.  Each zipped egg 
adds a one-time performance hit; each unzipped egg adds a smaller 
hit, but pays it *N* times, once for each of the N imports.

So the optimum performance tradeoff depends on how many imports you 
have *and* how many eggs you have on sys.path.  If you have lots of 
eggs and few imports, unzipped ones will probably be faster.  If you 
have lots of eggs and *lots* of imports, zipped ones will probably be faster.
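
As a rough sketch of that tradeoff, here is a toy cost model in Python.
It is not a measurement: the function names and both cost parameters
(zip_open_cost, stat_cost) are hypothetical placeholders in arbitrary
units, and the model assumes each import touches every egg on sys.path.

```python
def zipped_cost(eggs, zip_open_cost):
    """One-time hit per zipped egg: reading the zip directory once."""
    return eggs * zip_open_cost


def unzipped_cost(eggs, imports, stat_cost):
    """Smaller hit per unzipped egg, paid again on every import."""
    return eggs * imports * stat_cost


def faster_layout(eggs, imports, zip_open_cost, stat_cost):
    """Return which layout the toy model predicts is cheaper."""
    z = zipped_cost(eggs, zip_open_cost)
    u = unzipped_cost(eggs, imports, stat_cost)
    if z < u:
        return "zipped"
    if u < z:
        return "unzipped"
    return "tie"


# Made-up costs: opening a zip is taken to be 20x one stat-style lookup.
print(faster_layout(eggs=50, imports=5, zip_open_cost=1.0, stat_cost=0.05))
print(faster_layout(eggs=50, imports=500, zip_open_cost=1.0, stat_cost=0.05))
```

In this simplified model the number of imports decides which layout
wins, and the number of eggs scales how large the difference is.  Real
costs depend on the filesystem and the interpreter, which is why the
numbers above should not be read as data.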

To really know what the tradeoff is, some actual measurements are 
needed.  (And they need to measure both startup overhead and import overhead.)
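
One possible shape for such a measurement is sketched below.  It makes
several assumptions: plain directories and plain zip files on sys.path
stand in for unzipped and zipped eggs (no setuptools or pkg_resources
is involved), all module and function names are hypothetical, and the
first import is used as a proxy for one-time overhead because it
targets a module in the last path entry and so forces every entry to
be visited once.  Current Python versions cache directory listings, so
results may differ from what a 2009 interpreter would show.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

# Script run in a fresh interpreter; it prints two timings in seconds.
CHILD_TEMPLATE = """\
import sys, time
sys.path[:0] = {paths!r}
t0 = time.perf_counter()
import {first}
t1 = time.perf_counter()
{rest}
t2 = time.perf_counter()
print(t1 - t0, t2 - t1)
"""


def module_name(entry, module):
    return "e%d_m%d" % (entry, module)


def build_entries(root, n_entries, n_modules, zipped):
    """Create n_entries path entries, each holding n_modules tiny modules."""
    paths = []
    for e in range(n_entries):
        path = os.path.join(root, "entry%d" % e + (".zip" if zipped else ""))
        if zipped:
            with zipfile.ZipFile(path, "w") as zf:
                for m in range(n_modules):
                    zf.writestr(module_name(e, m) + ".py", "VALUE = 1\n")
        else:
            os.mkdir(path)
            for m in range(n_modules):
                source = os.path.join(path, module_name(e, m) + ".py")
                with open(source, "w") as f:
                    f.write("VALUE = 1\n")
        paths.append(path)
    return paths


def measure(paths, n_entries, n_modules):
    """Return (first_import_seconds, remaining_imports_seconds)."""
    names = [module_name(e, m)
             for e in range(n_entries) for m in range(n_modules)]
    first = names.pop()  # lives in the last entry on the path
    rest = "\n".join("import " + name for name in names)
    code = CHILD_TEMPLATE.format(paths=paths, first=first, rest=rest)
    # -B: do not write bytecode, so both layouts compile from source.
    out = subprocess.check_output([sys.executable, "-B", "-c", code])
    first_time, rest_time = (float(x) for x in out.decode().split())
    return first_time, rest_time


def compare(n_entries, n_modules):
    """Time both layouts and return {label: (first, rest)}."""
    results = {}
    with tempfile.TemporaryDirectory() as root:
        for label, zipped in (("unzipped", False), ("zipped", True)):
            sub = os.path.join(root, label)
            os.mkdir(sub)
            paths = build_entries(sub, n_entries, n_modules, zipped)
            results[label] = measure(paths, n_entries, n_modules)
    return results
```

Calling compare(50, 20), for example, would time 50 path entries with
20 modules each in both layouts.  A single run is noisy, so repeating
it and varying both arguments would be needed before drawing any
conclusion.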
