[Distutils] option #1 plus download_url scraping

Carl Meyer carl at oddbird.net
Wed Jun 5 00:43:24 CEST 2013


Hi Barry,

On 06/04/2013 04:16 PM, Barry Warsaw wrote:
> Like many of you, I got Donald's message about the changes to URLs for
> Cheeseshop packages.  My question is about the three options; I think I want a
> middle ground, but I'm interested to see why you will discourage me from that
> <wink>.
> 
> IIUC, option #1 is fine for packages hosted on PyPI.  But what if our packages
> are *also* hosted elsewhere, say for redundancy purposes, and that external
> location needs to be scraped?
> 
> Specifically, say I have a download_url in my setup.py.  I *want* that url to
> be essentially a wildcard or index page because I don't want to have to change
> setup.py every time I make a release (unless of course `setup.py sdist` did it
> for me).  I also can't add this url to the "Additional File URLs" page for my
> package because again I'd have to change it every time I do a release.
> 
> So the middle ground I think I want is: option #1 plus scraping from
> download_url, but only download_url.
> 
> Am I a horrible person for wanting this?  Is there a better way?

The first question, of course, is "why not just host on PyPI"? If
"redundancy" is the real reason, you might think about whether that
reason still applies with the new PyPI infrastructure, CDN, etc.

But let's presume that whatever your reason for hosting off-PyPI, it's a
good one. (To be clear, PEP 438 takes the position that there are and
will continue to be some good reasons, and the option of off-PyPI
hosting - in some form - should be supported indefinitely).

The problem with the current system is that client installer tools do
the scraping whenever they search for installation candidates for your
package, which means that you are asking every user of your package to
accept an unnecessary slowdown every single time they install. But the
information on that download_url page should only change when you make a
release, so the scraping should really be done just once, at release
time, and the resulting sdist URL(s) stored on PyPI so that installers
can take them into account without fetching or scraping any additional
pages.

So the idea is that to satisfy your use-case, there should be a tool
that you can use at release-time to scrape your downloads page and
automatically add sdist URLs found there to the explicit URLs list on
PyPI. That tool, of course, doesn't exist yet :-) Until someone builds
it, you'll have to stay with option #3 (and accept that you are slowing
down installations for your users) to satisfy your use case.
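
To make that concrete, here is a rough sketch of what the scraping half of
such a tool might look like. It is only an illustration under stated
assumptions, not an existing tool: the names `LinkCollector` and
`find_sdist_urls`, the list of archive extensions, and the example URL are
all hypothetical, and it assumes sdists are named `<project>-<version>`
followed by an archive extension. The step that would register the URLs on
PyPI is deliberately left out.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of sdist archive extensions; adjust to what you actually publish.
SDIST_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_sdist_urls(html, base_url, project):
    """Return absolute URLs on the page that look like sdists of `project`.

    `html` is the already-fetched content of the downloads page and
    `base_url` is the address it was fetched from (used to resolve
    relative links).
    """
    collector = LinkCollector()
    collector.feed(html)

    prefix = project.lower() + "-"
    found = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)
        filename = posixpath.basename(urlparse(url).path).lower()
        if filename.startswith(prefix) and filename.endswith(SDIST_EXTENSIONS):
            if url not in found:  # keep order, drop duplicates
                found.append(url)
    return found


if __name__ == "__main__":
    # Hypothetical downloads page; a real run would fetch download_url first.
    page = '<a href="example-1.0.tar.gz">1.0</a> <a href="/about.html">about</a>'
    for url in find_sdist_urls(page, "https://downloads.example.com/example/", "example"):
        print(url)
```

Run once at release time, the output of something like this is the list of
explicit URLs that would then be added to the project's entry on PyPI, so
that installers never need to fetch the downloads page themselves.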

Carl

