[Distutils] zc.buildout: download utility vs. local resources

Thomas Lotze thomas at thomas-lotze.de
Sun Feb 13 11:06:35 CET 2011

Hi all,

I'm currently second-guessing a detail of the zc.buildout
download utility I wrote last year (*), and I'd like to ask for
others' opinions.

What I question is how the settings for cache usage and download
destination end up determining whether the copy of the resource
that client code receives is private to the buildout, or shared.
The current logic involves an optimisation that tries to avoid
copying files in the file system by creating hard links instead
if possible. (**)

I'd like to make the decision about using a private or shared
copy more explicit and predictable. The obvious way to do this is
to add another keyword parameter to the download call that
expresses whether hard-linking should be attempted. However, I'm
not at all clear on what a sensible default value would be:

Using a shared copy may not be desirable if client code is going
to modify the file after downloading, so attempting to create
hard links by default is the more dangerous behaviour. Always
copying, on the other hand, means that when using a cache (or
"downloading" a file-system resource) every download adds another
copy of a potentially large file to the file system, which is
unnecessary in what I think is the majority of cases.
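
To make the trade-off concrete, here is a minimal sketch of what
such a keyword parameter could look like. It is not zc.buildout's
actual code: the function name `place_resource` and the parameter
`hard_link` are hypothetical, and only standard-library calls are
used.

```python
import os
import shutil


def place_resource(source, destination, hard_link=False):
    """Put the resource at `destination`; return True if it is shared.

    Hypothetical helper, not part of zc.buildout. With hard_link=True a
    hard link is attempted first; if linking fails (for example across
    file systems), the file is copied instead.
    """
    if hard_link:
        try:
            os.link(source, destination)
            return True  # same underlying file as the source: shared
        except OSError:
            pass  # linking not possible here, fall back to copying
    shutil.copyfile(source, destination)
    return False  # independent bytes: private copy
```

If client code appends to a shared result in place, the cached or
original file changes as well, whereas a private copy leaves it
untouched. That is why a default of hard_link=True would be the
riskier choice, and a default of False the more wasteful one.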

It would be very helpful if someone could offer some input.
Thank you.


(*) The download utility is an API inside zc.buildout that can be
used by recipes or zc.buildout itself to download HTTP or
file-system resources. It can be configured to use a download
cache and to put the downloaded resource in a particular place.
In any case, it returns the file-system path of the copy of the
downloaded resource that client code should use.

(**) If an HTTP resource is downloaded,

* using neither a cache nor a download destination puts the
  resource at a temporary path unique to the current download,

* using a cache but no download destination results in accessing
  the (shared) file inside the cache,

* using a download destination but no cache puts a private copy
  of the resource in the destination,

* using both a cache and a download destination makes the utility
  try to hard-link the cached file to the destination and,
  failing that, copy it, so the result may be either a private
  copy or a shared one.

If the resource comes from the file system,

* using no download destination results in shared access to
  either the original or cached resource,

* specifying a download destination results in accessing either a
  shared or private copy of the resource depending on whether
  hard links from its original or cached path to the download
  destination are possible.
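
The cases listed in (**) can be condensed into a small decision
function. This is only a restatement of the list above, not code
taken from zc.buildout; the function and argument names are made
up for illustration, and treating the temporary download path as
"private" is my reading of "unique to the current download".

```python
def resulting_copy(from_http, use_cache, use_destination, can_hard_link):
    """Return 'private' or 'shared' for the copy handed to client code."""
    if not use_destination:
        if from_http and not use_cache:
            # Temporary path unique to this download (assumed private).
            return "private"
        # The cached file, or the original file-system resource.
        return "shared"
    if from_http and not use_cache:
        # Downloaded straight into the destination.
        return "private"
    # A cached or file-system source is hard-linked to the destination
    # when possible and copied otherwise.
    return "shared" if can_hard_link else "private"
```

Only the last branch depends on `can_hard_link`, which is the part
client code cannot predict today and which an explicit parameter
would pin down.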
