[Catalog-sig] simple index and urls extracted from metadata text fields
pje at telecommunity.com
Fri Sep 11 15:32:59 CEST 2009
At 03:13 PM 9/11/2009 +0200, Tarek Ziadé wrote:
>This leads to some problems when scripts like easy_install scan the
>index page: it might try to visit urls the author just put there in
>his description text with no particular intent of making it viewable.
Easy_install only visits pages marked as "home page" links or "download" links.
>Plus, old urls that don't work anymore are not removed, leading to
>easy_install timeouts.
>1. what's the purpose of having them in there ?
To allow easy_install to find "dev" links and other identifiable
>2. if there's a purpose, what about adding an attribute to each <a>
>tag to identify which metadata field it was extracted from ?
The attribute already exists: rel="download" and rel="homepage"; if
there's no 'rel' it's from the description.
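As a rough sketch of how a client could use that convention (this is not
easy_install's actual code; the class and function names and the
example.com URLs are made up for illustration, and it only assumes the
rel values mentioned above):

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect (href, source) pairs from the <a> tags of an index page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        # rel may be absent or empty; treat it as a space-separated list.
        rels = (attrs.get("rel") or "").lower().split()
        if "download" in rels:
            source = "download"
        elif "homepage" in rels:
            source = "homepage"
        else:
            # No recognised rel: assume the link came from the description.
            source = "description"
        self.links.append((href, source))


def classify_links(html):
    """Return every link on the page with the field it appears to come from."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


def links_to_follow(html):
    """Keep only the links explicitly marked as home page or download."""
    return [href for href, source in classify_links(html)
            if source != "description"]


# Hypothetical index page content, for illustration only.
SAMPLE_PAGE = """
<a href="http://example.com/project" rel="homepage">Home Page</a>
<a href="http://example.com/dist/" rel="download">Download</a>
<a href="http://example.com/old-blog-post">mentioned in description</a>
"""

if __name__ == "__main__":
    for href, source in classify_links(SAMPLE_PAGE):
        print(source, href)
```

A scanner that wanted to avoid stale description urls could restrict
itself to links_to_follow() and skip everything else.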
I'm rather surprised you don't know these things already, since
they're all rather prominently documented as part of easy_install's
"index API" here: