[Catalog-sig] homepage/download metadata cleaning

holger krekel holger at merlinux.eu
Fri Mar 1 11:19:56 CET 2013

Hi Richard, all,

somewhere deep in the threads i mentioned i wrote a little "cleanpypi.py"
script which takes a project name as an argument and then goes to 
pypi.python.org and removes all homepage/download metadata entries for 
this project.  This sanitizes/speeds up installation because
pip/easy_install don't need to crawl them anymore.  I just did this for
three of my projects (pytest, tox and py), and it seems to work fine.
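
A minimal sketch of the stripping step described above. It is not the
actual cleanpypi.py: it only works on in-memory dicts and does not talk to
pypi.python.org. The field names "home_page" and "download_url", the
helper names and the sample data are assumptions made for illustration.

```python
# Hypothetical sketch: the release dicts stand in for whatever per-release
# metadata the real script reads from and writes back to PyPI.
CRAWLED_FIELDS = ("home_page", "download_url")  # assumed field names


def strip_crawled_fields(release_metadata, fields=CRAWLED_FIELDS):
    """Return a copy of one release's metadata without the crawled URLs."""
    return {key: value for key, value in release_metadata.items()
            if key not in fields}


def clean_project(releases):
    """Map each version string to its cleaned metadata dict."""
    return {version: strip_crawled_fields(meta)
            for version, meta in releases.items()}


if __name__ == "__main__":
    # Made-up sample data, not real PyPI content.
    sample = {
        "1.0": {"name": "example", "home_page": "http://example.invalid/",
                "download_url": "http://example.invalid/dl"},
        "1.1": {"name": "example", "summary": "demo"},
    }
    for version, meta in sorted(clean_project(sample).items()):
        print(version, sorted(meta))
```

The originals are left untouched, so the cleaned metadata can be compared
with the old values before anything is written back.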

Now before i release this as a tool, i wonder: Is it a good idea to remove
download/homepage entries?  Is there any current machine use (other than
the dreaded crawling) for the homepage/download_url per-release metadata?

For humans the homepage link is a handy way to find the project's site if
the long-description doesn't mention it prominently.  But i think there is
also a "project url" or "bugtrack url" for a project, so maybe those could
be used to reference these important pages?  (i am a bit confused about the
exact meaning of those urls, btw).

Should we maybe stop advertising "homepage" and "download_url"
and instead look into extending project-url/bugtrackurl so that they are
used and shown nicely?  The latter are independent of releases, which i
think makes sense - what use are old, probably unreachable/borked
homepages anyway?  And it's also not too bad having to go to
pypi.python.org once to set it, as it seldom changes.

