[Catalog-sig] simple index and urls extracted from metadata text fields

P.J. Eby pje at telecommunity.com
Fri Sep 11 15:38:29 CEST 2009


At 03:28 PM 9/11/2009 +0200, Martin v. Löwis wrote:
>That's for compatibility with setuptools. When setuptools was parsing
>the original pages, it would follow all URLs.

This is not true; setuptools has never followed all URLs.  It merely 
*parses* all URLs, in order to discover identifiable download links 
(i.e. links to archive files, executables, SVN checkouts, etc.)

It follows only the explicit "home page" and "download" links, and 
scans those pages for potential links in the same way.  (In other 
words, for any given package, no more than three pages are scanned: 
the PyPI index page, the homepage, and the download URL, with the 
latter two only being read if they are present and aren't 
discoverable download links themselves.)
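
As a rough illustration of the rule described above (a sketch only, not 
setuptools' actual code): the function names, the suffix list, and the 
(url, rel) input shape are all assumptions made for the example, and 
SVN checkout detection is left out.

```python
from urllib.parse import urlparse

# Hypothetical suffix list standing in for "identifiable download links";
# the real rules live in setuptools, not in this sketch.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".egg", ".exe")


def is_download_link(url):
    """Return True if the URL path looks like an archive or installer."""
    path = urlparse(url).path.lower()
    return path.endswith(ARCHIVE_SUFFIXES)


def plan_scan(index_url, links):
    """Split the links found on an index page into downloads and pages to read.

    `links` is an iterable of (url, rel) pairs, where rel is "homepage",
    "download", or None for an ordinary link (an assumed input shape).
    """
    downloads = []
    pages = [index_url]  # the index page itself is always scanned
    for url, rel in links:
        if is_download_link(url):
            # Parsed and recognized as a download; never fetched as a page.
            downloads.append(url)
        elif rel in ("homepage", "download") and url not in pages:
            # Followed once and scanned for links in the same way.
            pages.append(url)
        # Any other link is parsed but not followed.
    return downloads, pages
```

With one "home page" link and one "download" link per package, `pages` 
never holds more than three entries, and a "download" link that already 
points at an archive is recorded as a download instead of being read.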

The complete API specification is here, including notes on what was 
done in earlier versions:

    http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api


