[Distutils] RFC: Egg cache for self-contained buildout

Mohd Kamal Bin Mustafa kamal.mustafa at gmail.com
Tue May 14 04:27:18 CEST 2013


+1

For deployment, I have a script that creates an svn tag from trunk,
runs buildout, and creates a tarball of the resulting buildout. In
production, another script unpacks the tarball and reruns buildout,
this time with -N -o. Since the eggs are already in the tarball, it's
very quick: it just regenerates the scripts and config files to
reflect the actual production environment. The problem is in creating
the release: because it's a fresh checkout from svn, the eggs
directory is not populated yet and, due to issue [1], the buildout is
very slow. While I could copy the eggs directory from my current
working directory into the tag directory before running buildout, the
eggs would no longer be "pristine", since I might have tampered with
them while debugging; it's my working directory, after all.

[1]:https://github.com/buildout/buildout/issues/116

On Tue, May 14, 2013 at 3:08 AM, Jim Fulton <jim at zope.com> wrote:
> Problem
> =======
>
> For (stage and production) deployment purposes, we, ZC, use RPMs.
> It's considered good hygiene to produce source RPMs as well as binary
> RPMs.  This led me to create zc.sourcerelease, which automates
> creation of self-contained source tarballs that, among other
> benefits, provide input for making source RPMs, which feed into a
> process for creating binary RPMs.
>
> We're moving toward a continuous deployment pipeline, where binaries
> are produced early in the development cycle and tested in a controlled
> environment that matches production. It no longer (never did, actually)
> makes sense to produce source RPMs that could be deployed in alternate
> (untested) environments.
>
> In general, our existing build process is grotesquely slow:
>
> - we run a buildout to produce a source release.
>
> - we run it again to build a source (and binary) rpm from the source
>   release.
>
> Both of these run in such a way that all of the eggs have to be
> rebuilt. (But sources don't have to be downloaded.)
>
> We'd like to move toward a model where we construct a build
> environment for each controlled deployment environment.  In this build
> environment, we never want to build a given distribution more than
> once.  We need to produce application binaries that are
> self-contained.
>
> Buildout allows you to use a shared eggs directory.  This can greatly
> speed buildouts, because already-built distributions can be found and
> used locally.  However, buildouts that use shared eggs directories
> aren't self-contained. They depend on the shared eggs directory.
> I'd like to be able to reuse previously-built eggs, but have eggs
> installed in my local buildout, so it's self-contained.
>
> Proposal: egg-cache
> ===================
>
> If egg-cache is set to a directory, then when buildout builds an egg,
> it will copy it to the egg cache. When looking for distributions, it
> will look in the egg cache and, if it finds a matching egg there, it
> will copy the egg to the buildout eggs directory.
>
> The end result will be that an egg cache will have the same economy as
> the current shared eggs directory, as far as building is concerned,
> but it won't have the disk-space saving of a shared eggs directory.
> It will lead to buildouts that are self-contained (at least wrt eggs)
> and that can be copied to a deployment environment directly.
>
> Thoughts?
>
> Jim
>
> --
> Jim Fulton
> http://www.linkedin.com/in/jimfulton
> _______________________________________________
> Distutils-SIG maillist  -  Distutils-SIG at python.org
> http://mail.python.org/mailman/listinfo/distutils-sig
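
The egg-cache behaviour proposed in the quoted message can be sketched
in a few lines of Python. This is only an illustration under stated
assumptions, not buildout's implementation: `find_or_build`, its
parameters, and the `build` callback are hypothetical names, and
matching is simplified to an exact egg file name, whereas buildout
itself matches distributions against requirements.

```python
import shutil
from pathlib import Path


def _copy(src, dst):
    """Copy an egg, which may be a zipped file or an unpacked directory."""
    if src.is_dir():
        shutil.copytree(src, dst)
    else:
        shutil.copy2(src, dst)


def find_or_build(egg_name, eggs_dir, egg_cache, build):
    """Return the path of egg_name inside eggs_dir, building at most once.

    `build(target)` is a hypothetical callback that builds the egg at
    `target`; it stands in for whatever the real build step would be.
    """
    eggs_dir, egg_cache = Path(eggs_dir), Path(egg_cache)
    local = eggs_dir / egg_name
    cached = egg_cache / egg_name

    if local.exists():
        return local  # already installed in this buildout

    eggs_dir.mkdir(parents=True, exist_ok=True)
    if cached.exists():
        # Cache hit: copy (rather than reference) the egg so the
        # buildout's own eggs directory stays self-contained.
        _copy(cached, local)
        return local

    # Cache miss: build into the local eggs directory, then copy the
    # result to the cache so later buildouts can reuse it.
    build(local)
    egg_cache.mkdir(parents=True, exist_ok=True)
    _copy(local, cached)
    return local
```

Two buildouts pointed at the same cache directory would build a given
egg once; the second would only copy it. Copying in both directions is
what trades the disk-space saving of a shared eggs directory for
buildouts that can be moved to a deployment environment as they are.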

