[Distutils] zc.buildout: download utility vs. local resources
thomas at thomas-lotze.de
Sun Feb 13 11:06:35 CET 2011
I'm currently second-guessing a detail of zc.buildout's
download utility, which I wrote last year (*), and I want to ask for
input.
What I question is how the settings for cache usage and download
destination end up determining whether the copy of the resource
that client code receives is private to the buildout, or shared.
The current logic involves an optimisation that tries to avoid
copying files in the file system by creating hard links instead
if possible. (**)
I'd like to make the decision about using a private or shared
copy more explicit and foreseeable. The obvious way to do this is
adding another keyword parameter to the download call that
expresses whether hard-linking should be attempted. However, I'm
not at all clear on what a sensible default value would be:
Using a shared copy may not be desirable if client code is going
to modify the file after downloading, so attempting to create
hard links by default is the more dangerous behaviour. Always
copying, on the other hand, means that when using a cache (or
"downloading" a file-system resource) every download creates
multiple copies of potentially large files in the file system,
which is unnecessary in what I think is the majority of cases.
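
To make the trade-off concrete, here is a minimal sketch of what an
explicit switch could look like. It is a standalone helper, not the
actual zc.buildout code: the names `place_copy` and `hard_link` are
hypothetical, the default of False is only one of the two options
discussed above, and the destination is assumed not to exist yet.

```python
import os
import shutil


def place_copy(source, destination, hard_link=False):
    """Put `source` at `destination`; return True if the result is shared.

    Hypothetical helper for illustration only, not zc.buildout's API.
    Assumes `destination` does not exist yet.
    """
    if hard_link:
        try:
            os.link(source, destination)
            return True  # both paths now name the same file on disk
        except (OSError, AttributeError, NotImplementedError):
            # Linking can fail, e.g. across file systems; fall back to copying.
            pass
    shutil.copyfile(source, destination)
    return False  # independent copy, safe to modify
```

With the default shown here, client code always gets a private copy
unless it explicitly asks for a link, at the cost of the extra copies
described above. Flipping the default would reverse that trade-off.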
It would be very helpful if someone could offer some input.
(*) The download utility is an API inside zc.buildout that can be
used by recipes or zc.buildout itself to download HTTP or
file-system resources. It can be configured to use a download
cache and to put the downloaded resource in a particular place.
In any case, it returns the file-system path of that copy of the
downloaded resource which is to be used by the client code.
(**) If an HTTP resource is downloaded,
* having neither a cache nor a download destination puts the
resource at a temporary path unique to the current download,
* using a cache but no download destination results in accessing
the (shared) file inside the cache,
* using a download destination but no cache puts a private copy of
the resource in the destination,
* using both a cache and a download destination makes the utility
try to hard-link the cached file to the destination and, failing
that, copy it, so the result may be either a private copy or a
shared one.
If the resource comes from the file system,
* using no download destination results in shared access to
either the original or cached resource,
* specifying a download destination results in accessing either a
shared or private copy of the resource depending on whether
hard links from its original or cached path to the download
destination are possible.
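
The cases listed in (**) can be summarised as a small decision table.
The sketch below only restates that list; the function names and the
returned labels are made up for illustration and do not exist in
zc.buildout. The temporary path is labelled "private" on the
assumption that a path unique to the download is not shared with
anything else.

```python
def http_result(use_cache, use_destination, link_possible):
    """Return (location, sharing) for a downloaded HTTP resource.

    Hypothetical summary of the cases in (**), not zc.buildout code.
    """
    if not use_destination:
        if use_cache:
            return ("cache", "shared")
        # Assumption: a path unique to this download counts as private.
        return ("temporary path", "private")
    if not use_cache:
        return ("destination", "private")
    # Cache and destination: hard link if possible, otherwise a copy.
    return ("destination", "shared" if link_possible else "private")


def file_result(use_destination, link_possible):
    """Return (location, sharing) for a file-system resource."""
    if not use_destination:
        return ("original or cache", "shared")
    return ("destination", "shared" if link_possible else "private")
```

The two branches that depend on `link_possible` are the ones where
client code currently cannot tell in advance which kind of copy it
will receive.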