[Distutils] EasyInstall 0.3a3 released; what about PyPI? (was Re: Initial auto-installation support)
Phillip J. Eby
pje at telecommunity.com
Tue May 31 03:14:09 CEST 2005
>At 04:10 PM 5/30/2005 -0500, Ian Bicking wrote:
> > But besides
> >that, this should work now for any packages with a distutils install, so
> >long as those packages are reasonably well behaved. Hrm... except
> >setuptools 0.3a2 doesn't have SourceForge download support, but 0.3a3
> >does and I think PJE will release that soon.
0.3a3 is now released, with a new --build-dir option, sandboxing, more
package workarounds, SourceForge mirror selection, and "installation
reports". See:
http://peak.telecommunity.com/DevCenter/EasyInstall#release-notes-change-history
for more details.
I'm thinking that adding automatic package location via PyPI is probably
pretty doable now, by the way. My plan is to create a PackageFinder class
(subclassing AvailableDistributions) whose obtain() method searches for the
desired package on PyPI, keeping a cache of URLs it has already seen. (It
would also accept a callback argument that it would use to create Installer
objects when it needs to install packages.)
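A rough sketch of the shape such a class might take. Everything here is an assumption for illustration: the constructor arguments, the simplified obtain() signature, and the `fetch_candidates` hook (a stand-in for the PyPI scraping step) are hypothetical, and the real class would subclass AvailableDistributions rather than stand alone.

```python
class PackageFinder:
    """Hypothetical sketch; not the actual setuptools implementation."""

    def __init__(self, index_url, install_callback, fetch_candidates):
        self.index_url = index_url.rstrip("/")
        # Callback used to perform the install once a URL has been chosen.
        self.install_callback = install_callback
        # Stand-in for the scraping step: page URL -> iterable of download URLs.
        self.fetch_candidates = fetch_candidates
        # Cache of URLs already seen, keyed by distribution name.
        self.seen_urls = {}

    def obtain(self, name):
        """Find a download URL for `name` and hand it to the callback."""
        if name not in self.seen_urls:
            page_url = f"{self.index_url}/{name}"
            self.seen_urls[name] = list(self.fetch_candidates(page_url))
        urls = self.seen_urls[name]
        if not urls:
            return None
        return self.install_callback(urls[0])
```

The cache means a second request for the same name does not hit the index again; only the install callback runs.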
The command-line tool (easy_install.main) would create a PackageFinder with
an interactive installation callback, and in the main loop it would pass it
to each new Installer instance. The Installer would then use it to
resolve() any command-line argument that is neither a file nor a URL.
The PackageFinder.obtain() method would go to the PyPI base URL followed by
the desired distribution name, e.g. 'http://www.python.org/pypi/SQLObject',
and then scrape the page to see whether it is a multi-version page or a
single-version page. If it's multi-version, it would scrape the version
links and select the highest-numbered version that meets all of your criteria.
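The version-selection step could look roughly like this. It is a simplified sketch: `version_key` and `pick_version` are hypothetical names, and the key only compares the numeric parts of a version string, so it does not order pre-release tags such as "a3" correctly.

```python
import re


def version_key(version):
    """Turn '0.10' into (0, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def pick_version(versions, acceptable=lambda v: True):
    """Return the highest-numbered version that meets the criteria, or None."""
    candidates = [v for v in versions if acceptable(v)]
    return max(candidates, key=version_key) if candidates else None
```

For example, `pick_version(["0.6", "0.10", "0.9"])` picks "0.10", which a plain string comparison would get wrong.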
Once it has a single-version page, it would look for a download URL, and
see if its filename is that of an archive (.egg, .tar, .tgz, etc.) or if
the URL is for subversion. If so, we assume it's the right thing and
invoke the callback to do the install.
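A minimal sketch of that check. Only .egg, .tar and .tgz come from the list above; the other extensions, the function name, and the way a subversion URL is recognized (an svn-style scheme or an "/svn/" path segment) are assumptions.

```python
import posixpath
from urllib.parse import urlparse

# .egg, .tar, .tgz are named in the text; the rest are assumed additions.
ARCHIVE_EXTENSIONS = (".egg", ".tar", ".tgz", ".tar.gz", ".tar.bz2", ".zip")


def classify_download_url(url):
    """Return 'archive', 'subversion', or 'unknown' for a download URL."""
    parsed = urlparse(url)
    filename = posixpath.basename(parsed.path).lower()
    if filename.endswith(ARCHIVE_EXTENSIONS):
        return "archive"
    # Assumed heuristic: svn:// or svn+ssh:// schemes, or '/svn/' in the path.
    if parsed.scheme.startswith("svn") or "/svn/" in parsed.path.lower():
        return "subversion"
    return "unknown"
```

Anything classified as "unknown" would fall through to the link-following step.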
If not, then we follow the link anyway, and scrape for links to archives,
checking versions when we get there if possible. If there's still nothing
suitable (or there was no download URL), we apply the same procedure to the
homepage URL.
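Scraping a followed page for archive links might be sketched like this, using the standard library's HTML parser. `LinkCollector` and `archive_links` are hypothetical names, and the sketch assumes the page's HTML has already been fetched; it also ignores query strings when matching extensions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


def archive_links(html, base_url, extensions=(".egg", ".tar", ".tgz")):
    """Return the links on the page whose targets look like archives."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [u for u in collector.links if u.lower().endswith(extensions)]
```

The same function could be applied first to the download URL's page and then to the homepage URL.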
This should suffice to make a significant number of packages available from
PyPI with autodownload, and packages with dependencies would also be
downloaded, built, and installed.
The hardest part of this isn't the screen-scraping per se; it's the
heuristics for evaluating whether a specific URL is suitable for
download. Many PyPI download URLs are of the form "foopackage-latest.tgz",
so it's not possible to determine a usable version number from this, unless
I special-case "latest" in the version parser -- which I guess I could do.
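Special-casing "latest" could be as simple as the following sketch. The filename pattern, the function name, and the policy of sorting "latest" above every numbered version are all assumptions, not a description of the actual version parser.

```python
import re

# Assumed filename shape: <name>-<version>.<archive extension>
_NAME_VERSION = re.compile(
    r"^(?P<name>.+?)-(?P<version>latest|\d+(?:\.\d+)*)\.(?:tar\.gz|tgz|tar|egg)$"
)


def version_key_from_filename(filename):
    """Return a sortable key for the version in a filename, or None."""
    match = _NAME_VERSION.match(filename)
    if match is None:
        return None
    version = match.group("version")
    if version == "latest":
        # Assumed policy: 'latest' outranks any explicit version number.
        return (1,)
    return (0,) + tuple(int(part) for part in version.split("."))
```

The leading 0 or 1 in the key keeps "latest" comparable with numbered versions without pretending it has a number of its own.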
We also probably need some kind of heuristic to determine which URLs are
"better" to try, as we don't want to just run through the links in order.
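One possible shape for such a heuristic, purely as an illustration: score each URL and try the highest-scoring ones first. The specific weights (eggs over source archives, a bonus when the project name appears in the filename) are invented for this sketch and would need tuning against real PyPI pages.

```python
import posixpath
from urllib.parse import urlparse


def url_score(url, project):
    """Assumed scoring: higher means 'more likely to be the right download'."""
    filename = posixpath.basename(urlparse(url).path).lower()
    score = 0
    if filename.endswith(".egg"):
        score += 3  # prebuilt egg: nothing to build
    elif filename.endswith((".tgz", ".tar", ".tar.gz")):
        score += 2  # source archive
    if project.lower() in filename:
        score += 1  # filename mentions the project we asked for
    return score


def rank_urls(urls, project):
    """Order candidate URLs from most to least promising."""
    return sorted(urls, key=lambda u: url_score(u, project), reverse=True)
```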
Hm. You know, what if as an interim step we had the command-line tool just
launch a web browser pointing you to PyPI? Getting to a page for a suitable
version is easy, so we could then let the user find the right download URL
and then go back to paste it on the command line. That could be a nice
stopgap, although it isn't much of a solution for packages with a
lot of uninstalled dependencies. You'd keep getting kicked back to the
web browser, and more to the point you'd have to keep restarting the
tool. So, ultimately we really need a way to actually find the URLs.
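For what it's worth, the browser-launching stopgap is only a few lines with the standard webbrowser module. The function name and the injectable `opener` argument are assumptions made so the sketch can be exercised without actually opening a browser.

```python
import webbrowser
from urllib.parse import quote


def open_pypi_page(name, index_url="http://www.python.org/pypi",
                   opener=webbrowser.open):
    """Point the user's browser at the PyPI page for `name`."""
    page_url = f"{index_url.rstrip('/')}/{quote(name)}"
    opener(page_url)
    return page_url
```

The user would then copy the download URL from the page and paste it back on the command line.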
There are going to have to be new options for the tool, too: a way to
set the PyPI URL to use, and a way to specify which kinds of package
revisions are acceptable (e.g. no alphas, no betas, no snapshots).
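As a sketch of what those options might look like, here is a small argparse example. The option names (`--index-url`, `--reject`), the program name, and the regular expressions used to recognize alphas, betas and snapshots are all hypothetical; nothing here is an existing easy_install interface.

```python
import argparse
import re


def build_parser():
    """Hypothetical command-line options for the tool."""
    parser = argparse.ArgumentParser(prog="easy_install-sketch")
    parser.add_argument("--index-url", default="http://www.python.org/pypi",
                        help="base URL of the package index to search")
    parser.add_argument("--reject", action="append", default=[],
                        choices=["alpha", "beta", "snapshot"],
                        help="kind of revision to refuse (repeatable)")
    parser.add_argument("requests", nargs="*",
                        help="package names, files, or URLs to install")
    return parser


# Assumed patterns for spotting each kind of revision in a version string.
_PATTERNS = {
    "alpha": re.compile(r"\d(?:a|alpha)\d*$"),
    "beta": re.compile(r"\d(?:b|beta)\d*$"),
    "snapshot": re.compile(r"dev|snapshot|latest"),
}


def is_acceptable(version, rejected):
    """True if `version` matches none of the rejected revision kinds."""
    lowered = version.lower()
    return not any(_PATTERNS[kind].search(lowered) for kind in rejected)
```

A predicate like `is_acceptable` could be passed along as the selection criteria when choosing among scraped versions.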