
At 04:10 PM 5/30/2005 -0500, Ian Bicking wrote:
> But besides that, this should work now for any packages with a distutils
> install, so long as those packages are reasonably well behaved. Hrm...
> except setuptools 0.3a2 doesn't have SourceForge download support, but
> 0.3a3 does and I think PJE will release that soon.
0.3a3 is now released, with a new --build-dir option, sandboxing, more package workarounds, SourceForge mirror selection, and "installation reports". See: http://peak.telecommunity.com/DevCenter/EasyInstall#release-notes-change-his... for more details.

I'm thinking that adding automatic package location via PyPI is probably pretty doable now, by the way. My plan is to create a PackageFinder class (subclassing AvailableDistributions) whose obtain() method searches for the desired package on PyPI, keeping a cache of URLs it has already seen. (It would also accept a callback argument that it would use to create Installer objects when it needs to install packages.) Rough sketches of all of this are at the end of this message.

The command-line tool (easy_install.main) would create a PackageFinder with an interactive installation callback, and in the main loop it would pass it to each new Installer instance. The Installer would then use it whenever it gets a non-file, non-URL command line argument, using it to resolve() such requests.

The PackageFinder.obtain() method would go to the PyPI base URL followed by the desired distribution name, e.g. 'http://www.python.org/pypi/SQLObject', and then scrape the page to see whether it's a multi-version page or a single-version page. If it's multi-version, it would scrape the version links and select the highest-numbered version that meets all of your criteria.

Once it has a single-version page, it would look for a download URL, and check whether its filename is that of an archive (.egg, .tar, .tgz, etc.) or the URL is for Subversion. If so, we assume it's the right thing and invoke the callback to do the install. If not, we follow the link anyway and scrape it for links to archives, checking versions when we get there if possible. If there's still nothing suitable (or there was no download URL), we apply the same procedure to the home page URL.

This should suffice to make a significant number of packages available from PyPI with autodownload, and packages with dependencies would also be downloaded, built, and installed.

The hardest parts of this aren't the screen-scraping per se; they're the heuristics for deciding whether a specific URL is suitable for download. Many PyPI download URLs are of the form "foopackage-latest.tgz", so it's not possible to determine a usable version number from them -- unless I special-case "latest" in the version parser, which I guess I could do. We also probably need some kind of heuristic to decide which URLs are "better" to try, since we don't want to just run through the links in order.

Hm. You know, what if, as an interim step, we had the command-line tool just launch a web browser pointing you to PyPI? Getting to a page for a suitable version is easy, so we could let the user find the right download URL and then go back and paste it on the command line. That could be a nice interim addition, although it isn't much of a solution for packages with a lot of uninstalled dependencies: you'd keep getting kicked back to the web browser, and more to the point you'd have to keep restarting the tool. So ultimately we really need a way to actually find the URLs.

There are going to have to be new options for the tool, too, like a way to set the PyPI URL to use, and a way to specify what sorts of package revisions are acceptable (e.g. no alphas, no betas, no snapshots).
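
To make some of this concrete, here are a few rough sketches. None of it exists yet, so treat every method body, helper name, and signature below as a guess, not a spec. First, the PackageFinder itself -- AvailableDistributions is real, but the import location, the obtain() signature, and the scraping helper are all assumptions:

    import urllib2
    from pkg_resources import AvailableDistributions

    PYPI_BASE = 'http://www.python.org/pypi'  # would become an option

    class PackageFinder(AvailableDistributions):
        """Sketch: locate distributions on PyPI when obtain() is called."""

        def __init__(self, install_callback, index_url=PYPI_BASE):
            AvailableDistributions.__init__(self)
            self.install_callback = install_callback  # makes Installer objects
            self.index_url = index_url
            self.seen_urls = {}  # cache of URLs already scraped

        def obtain(self, requirement):
            name = requirement.project_name  # attribute name assumed
            # e.g. http://www.python.org/pypi/SQLObject
            url = '%s/%s' % (self.index_url, name)
            if url in self.seen_urls:
                return None
            self.seen_urls[url] = True
            page = urllib2.urlopen(url).read()
            # scrape_and_install() would do the multi-version vs.
            # single-version analysis described above (hypothetical)
            return self.scrape_and_install(url, page, requirement)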
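The scraping itself shouldn't need anything beyond the stdlib; a minimal href collector could be:

    import urllib2
    from HTMLParser import HTMLParser

    class LinkCollector(HTMLParser):
        """Collect every href on a page, for version/download scraping."""

        def __init__(self):
            HTMLParser.__init__(self)
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == 'a':
                for name, value in attrs:
                    if name == 'href' and value:
                        self.links.append(value)

    def scrape_links(url):
        collector = LinkCollector()
        collector.feed(urllib2.urlopen(url).read())
        return collector.links

Telling a multi-version page from a single-version one would then just be a matter of looking at where those links point.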
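The "is this URL worth downloading?" test might start out as nothing more than an extension check plus a Subversion test; the extension list past .egg/.tar/.tgz is my own guess:

    ARCHIVE_EXTENSIONS = ('.egg', '.tar', '.tgz', '.tar.gz', '.tar.bz2', '.zip')

    def looks_downloadable(url):
        """Guess whether a URL points straight at an installable archive
        or a Subversion checkout; anything else has to be followed and
        scraped for further links."""
        lower = url.lower()
        for ext in ARCHIVE_EXTENSIONS:
            if lower.endswith(ext):
                return True
        return lower.startswith('svn:')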
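As for "foopackage-latest.tgz", special-casing "latest" might look like this toy version key (a stand-in for illustration, not the real version parser):

    import re

    def crude_version_key(filename):
        """Toy version key that sorts 'latest' above any real number."""
        base = filename.lower()
        for ext in ('.tar.gz', '.tar.bz2', '.tgz', '.tar', '.zip', '.egg'):
            if base.endswith(ext):
                base = base[:-len(ext)]
                break
        version = base.split('-')[-1]
        if version == 'latest':
            return (999999999,)  # 'latest' beats everything, for better or worse
        return tuple(map(int, re.findall(r'\d+', version)))

    # Picking the best of a scraped link list:
    links = ['foo-0.2.tgz', 'foo-latest.tgz', 'foo-0.3.tgz']
    links.sort(key=crude_version_key)
    best = links[-1]  # 'foo-latest.tgz'

Which also illustrates the problem: once "latest" always wins, you're back to needing the "which URLs are better" heuristic, or a way for the user to veto it.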
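The interim browser idea, at least, is nearly free, since the stdlib webbrowser module does all the work:

    import webbrowser

    def open_pypi_page(project_name, index_url='http://www.python.org/pypi'):
        # Send the user to the PyPI page; they find the download URL
        # themselves and paste it back on the easy_install command line.
        webbrowser.open('%s/%s' % (index_url, project_name))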
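And the new options might look something like the following; the option names here are placeholders, not a commitment:

    from optparse import OptionParser

    parser = OptionParser()
    parser.add_option('--index-url', default='http://www.python.org/pypi',
                      help="base URL of the package index to search")
    parser.add_option('--no-alphas', action='store_true',
                      help="never select alpha releases")
    parser.add_option('--no-betas', action='store_true',
                      help="never select alpha or beta releases")
    parser.add_option('--no-snapshots', action='store_true',
                      help="never select dated snapshot releases")
    options, args = parser.parse_args()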