[Catalog-sig] Why so many zc.buildout versions?
Jim Fulton
jim at zope.com
Tue Jul 10 16:32:10 CEST 2007
You raise a really good point, which is especially relevant in light
of the recent PyPI performance issues and discussions.
I'm copying the distutils and catalog sigs to get some wider
discussion. I apologize for the cross posting.
I'm beginning to wonder about the strategy that setuptools uses, or
maybe about the way we are using the index.
It's important to note that there is nothing specific about the
buildout package here.
It is very important to make multiple versions available to support
requirements for specific package versions. It makes builds/installs
repeatable, whether talking about buildout or other systems built on
setuptools. When someone has tested and wants to release an
application built from a collection of distributions, they will want
to specify those *specific* versions for future builds or installs.
This means that we need to retain every published version indefinitely,
in a way that setuptools can find.
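To make the point concrete, here is a minimal sketch of why pinned
builds depend on old releases staying findable. It is not setuptools
code; resolve_pinned, the pins and both index dictionaries are
hypothetical names and data made up for illustration.

```python
def resolve_pinned(pins, available):
    """Return {name: version} for every pin.

    pins      -- {project name: exact version the application was tested with}
    available -- {project name: versions the index still lists}
    Raises LookupError as soon as a pinned version is no longer listed.
    """
    resolved = {}
    for name, version in pins.items():
        if version not in available.get(name, ()):
            raise LookupError("%s==%s is no longer published" % (name, version))
        resolved[name] = version
    return resolved


# Hypothetical data: an application tested against 1.0.0b25.
pins = {"zc.buildout": "1.0.0b25"}
full_index = {"zc.buildout": ["1.0.0b24", "1.0.0b25", "1.0.0b28"]}
pruned_index = {"zc.buildout": ["1.0.0b28"]}  # old releases hidden
```

With full_index the pin resolves; with pruned_index the same build
fails, even though a newer release exists.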
Currently, the only way to support multiple versions with the
cheeseshop is to unhide past releases. This has a fairly severe
effect on performance. As the example below shows, setuptools will
fetch the package page and then fetch the pages for each release.
That's a lot of requests. What makes it worse is that the individual
package pages can be fairly long. I've gotten in the habit of
including full documentation on every release page. For example,
recent release pages for zc.buildout are around 200K. This is a
fairly significant amount of data to transfer. This will certainly
make the scanning process take a long time for clients. (Obviously,
if we keep doing things the way we are, I'll need to stop doing that.)
All of this aggravates any performance problems we might have.
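A rough way to put numbers on this is to count the "Reading" lines in
an easy_install log. The sketch below assumes that every release page
is about 200K, which is only known to hold for recent zc.buildout
releases, and the function name estimate_scan is made up for
illustration.

```python
def estimate_scan(log_text, release_page_bytes=200000):
    """Estimate requests and release-page bytes from an easy_install log.

    Assumes the first "Reading" line is the package page and that every
    URL below it is a release page of roughly release_page_bytes bytes.
    """
    urls = [line.split()[-1]
            for line in log_text.splitlines()
            if "Reading" in line.split()]
    if not urls:
        return {"requests": 0, "release_pages": 0, "release_bytes": 0}
    package_url = urls[0]
    release_pages = [u for u in urls[1:]
                     if u.startswith(package_url) and len(u) > len(package_url)]
    return {
        "requests": len(urls),
        "release_pages": len(release_pages),
        "release_bytes": len(release_pages) * release_page_bytes,
    }


# A shortened sample of the kind of log quoted at the end of this message.
sample_log = """\
Reading http://cheeseshop.python.org/pypi/zc.buildout/
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
Reading http://svn.zope.org/zc.buildout
Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
"""
estimate = estimate_scan(sample_log)
```

Under the same 200K assumption, a scan that reads 13 release pages
would transfer roughly 2.6 MB before anything is downloaded.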
Up to now, setuptools has tried hard to use existing systems without
change. This means that it reuses systems designed primarily for
people, not software. I think that setuptools rightly took the
approach it has up to now so that progress could be made without
making people change other systems. This was appropriate when
setuptools was evolving and people were figuring out ways to use it.
I think it is time to take a step back and think a lot harder about
how we'd want to structure an index to support setuptools.
IMO, a setuptools-aware index would have a single page for each package:
- The single page would be published in a case-insensitive way. It
would be nice to find a way to avoid needing this, or maybe we should
use a windows-based web server. :) It would also be served very
cheaply, for example statically.
- The single page would list links for all available distributions,
which should include all distributions published. It would also list
any other URLs that should be scanned for releases, when releases
aren't all uploaded to PyPI.
- The single page would contain very little additional information.
It would be for use by software, not humans.
In addition, the root page with a trailing / would be empty and very
cheap.
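As a sketch of what such a page could look like, the code below renders
a bare list of links for one package and derives a lowercased path for
it, which is one possible way to get case-insensitive lookups from a
static server. The function names, the page layout and the example.org
URL are assumptions for illustration, not an agreed format.

```python
from html import escape


def render_package_page(dist_urls, extra_urls=()):
    """Render a minimal, software-oriented page: one link per line.

    dist_urls  -- links to every published distribution
    extra_urls -- other URLs that should be scanned for releases
    """
    lines = ["<html><body>"]
    for url in list(dist_urls) + list(extra_urls):
        lines.append('<a href="%s">%s</a><br/>'
                     % (escape(url, quote=True), escape(url)))
    lines.append("</body></html>")
    return "\n".join(lines) + "\n"


def page_path(package_name):
    """Lowercase the name so that lookups can be normalized the same way."""
    return "%s/index.html" % package_name.lower()


# Hypothetical distribution URL; the svn URL appears in the quoted log.
page = render_package_page(
    ["http://example.org/dist/zc.buildout-1.0.0b28.tar.gz"],
    extra_urls=["http://svn.zope.org/zc.buildout"],
)
path = page_path("ZC.Buildout")
```

A page like this stays small no matter how much documentation a
release carries, so a client needs one cheap request per package.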
There are a lot of ways we could achieve this pretty cheaply while
keeping the existing system pretty much as it is.
For example, the current effort to bake static pages could bake these
pages instead. We could make the new index available at a different
URL for people to play with while we worked the kinks out of the
process.
Of course, those of us who use the cheeseshop and setuptools
extensively can also achieve much of this by changing the way we work.
Thoughts?
Jim
On Jul 10, 2007, at 8:44 AM, Philipp von Weitershausen wrote:
> When easy_installing zc.buildout I realized that the CheeseShop
> still lists a gazillion old versions of zc.buildout. That makes it
> take quite some time to install zc.buildout (see below), and I
> reckon the same sort of check has to happen each time it looks for
> a new version of that egg...
>
> Is there any reason for having so many old versions around?
>
>
> $ easy_install zc.buildout
> Searching for zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
> Reading http://svn.zope.org/zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
> Best match: zc.buildout 1.0.0b28
> ...
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org