[Catalog-sig] simple index and urls extracted from metadata text fields

"Martin v. Löwis" martin at v.loewis.de
Fri Sep 11 15:28:53 CEST 2009


> Right now in a package registered at PyPI, there is no distinction
> between URLs located in free-text metadata (like the description)
> and metadata fields that are supposed to be URLs.

I presume you are talking about the simple API?

Otherwise, I cannot understand your question: in the database, and in
the metadata, there is clearly such a distinction, and it is accessible
to any client that cares to use it.

> Plus, old urls that don't work anymore are not removed, leading to
> easy_install timeouts.
> 
> 1. what's the purpose of having them in there ?

Again, I presume you are talking about the simple API?

That's for compatibility with setuptools. When setuptools was parsing
the original pages, it would follow all URLs. In order to preserve
full compatibility in the simple pages, all URLs had to be extracted;
this was the specification given by Jim Fulton.
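
To make "all URLs had to be extracted" concrete, here is a rough sketch of
pulling every URL out of a free-text field such as the description. This is
not PyPI's actual code: the regular expression, the function name
extract_urls and the sample description are all assumptions made up for
illustration.

```python
import re

# Hypothetical pattern: anything that looks like an http(s) or ftp URL.
URL_RE = re.compile(r"(?:https?|ftp)://[^\s<>\"')]+")


def extract_urls(text):
    """Return the URLs found in free text, in order, without duplicates."""
    seen = []
    for match in URL_RE.finditer(text):
        # Drop trailing punctuation that belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:")
        if url not in seen:
            seen.append(url)
    return seen


# Made-up description text, for illustration only.
description = (
    "See http://example.org/docs. "
    "Old tarballs: http://example.org/dist/ (may be gone)."
)
urls = extract_urls(description)
```

Every URL found this way ends up as a link that a client following all
links would try to fetch, whether or not it still works.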

> 2. if there's a purpose, what about adding an attribute to each <a>
> tag to identify which metadata field it was extracted from?

Just make a specification. Notice that some links *already* have
a rel attribute which might be interesting to consider.
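
As a starting point for such a specification, a client could already
separate links that carry a rel attribute from those that do not. The
following is only a sketch under assumptions: the class and function names
are hypothetical, and the sample page (including its rel values) is made up
for illustration rather than copied from a real simple page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, rel) pairs from <a> tags; rel is None when absent."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is not None:
            self.links.append((href, attrs.get("rel")))


def split_links(html):
    """Return (tagged, untagged): links with a rel attribute and without."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    tagged = [(href, rel) for href, rel in parser.links if rel is not None]
    untagged = [href for href, rel in parser.links if rel is None]
    return tagged, untagged


# Made-up sample page; the rel values are assumptions for illustration.
SAMPLE = """
<a href="http://example.org/" rel="homepage">home_page</a>
<a href="http://example.org/dist/" rel="download">download_url</a>
<a href="http://example.org/old-link">http://example.org/old-link</a>
"""

tagged, untagged = split_links(SAMPLE)
```

A client that wants to avoid timeouts could then decide to follow only the
tagged links, or treat the untagged ones with a shorter timeout.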

Regards,
Martin

