[Distutils] buildout: several fold performance increases

Ross Patterson me at rpatterson.net
Tue Jan 17 03:45:14 CET 2012


I've long been perplexed by how long a buildout takes to run with
multiple parts whose required distributions are largely similar.  Taking
a stab at it, I found two hot spots; fixing them yields several-fold
improvements in performance.

First, zc.buildout.easy_install._log_requirements was doing expensive
requirements parsing and sorting even when no message would be logged.
I committed a fix for it.  On a 10-part buildout with a large "eggs"
option for each part, it cut update time, as measured under cProfile,
from 93 seconds to 15 seconds:

http://svn.zope.org/zc.buildout/trunk/src/zc/buildout/easy_install.py?rev=124059&r1=122980&r2=124059
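
For readers who don't want to open the diff, the general idea can be
sketched as below.  This is not the committed code: the logger name and
the format_requirements helper are made up for illustration, and the
only real API relied on is logging.Logger.isEnabledFor.  The point is
just to check the log level before doing the expensive formatting.

```python
import logging

# Hypothetical logger name, for illustration only.
logger = logging.getLogger("sketch.easy_install")


def format_requirements(requirements):
    # Stand-in for the expensive parsing and sorting step.
    return "\n".join(sorted(str(r) for r in requirements))


def log_requirements(requirements, level=logging.DEBUG):
    # Bail out before any expensive work if nothing would be logged.
    if not logger.isEnabledFor(level):
        return False
    logger.log(level, "Requirements:\n%s", format_requirements(requirements))
    return True
```

With the logger set above DEBUG, the function returns immediately and
the sorting never runs, which is where the savings come from when the
same large "eggs" list is handled once per part.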

Second, instantiating pkg_resources.Environment, including the
setuptools.package_index.PackageIndex subclass, is very expensive.  It
was being done multiple times for any given part, and repeated for
parts whose environments were identical.  There was already some global
caching for package indexes, which I've duplicated for environments in
the attached patch.
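
As a rough illustration of that kind of cache (a sketch only, not the
contents of envs.diff), the following memoizes environment objects by
their search path.  The names _env_cache and get_environment are
hypothetical, and the factory argument stands in for whatever builds
the environment; in buildout's case that would presumably be
pkg_resources.Environment.

```python
# Module-level cache keyed by the search path, so parts with identical
# paths share one environment object instead of rebuilding it.
_env_cache = {}


def get_environment(search_path, factory):
    # Lists are not hashable, so normalize the path to a tuple.
    key = tuple(search_path)
    try:
        return _env_cache[key]
    except KeyError:
        env = _env_cache[key] = factory(list(key))
        return env
```

A call such as get_environment(path, pkg_resources.Environment) would
then scan the directories only the first time a given path is seen.
One caveat with any cache like this: an entry can go stale if
distributions are added to a path after its environment was built.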

Unfortunately, I haven't been able to get a clean test environment for
the life of me.  I'm using a clean Python 2.7 build from source, turning
everything in ~/.buildout/default.cfg off, and running tests in a clean
checkout of the zc.buildout/trunk buildout.  Even under those conditions
I get 17 failing tests before any changes.  With this environments
cache, I see 41 failures, and I can't make sense of the additional
ones.  Still, this patch yields another 2-3 fold decrease, to 6 seconds
for the same buildout, and is driven by profiling data, not guessing.
Can someone help me get this patch in?

Finally, it would be great to see releases of zc.buildout with these
performance improvements get out in the world.  I've been hearing more
and more complaints about buildout run times and these are easy fixes.
If we can get the second, attached patch in quickly, then I'd say we
should release with both.  If not, then it's still worth it to cut a
release for the first, already committed patch, which yields the
greatest improvement.

Thanks!
Ross

-------------- next part --------------
A non-text attachment was scrubbed...
Name: envs.diff
Type: text/x-diff
Size: 3574 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/distutils-sig/attachments/20120116/6c16f531/attachment.diff>

