[Catalog-sig] V4 Pre-PEP: transition to release-file hosting on PYPI

PJ Eby pje at telecommunity.com
Fri Mar 15 17:51:11 CET 2013

On Fri, Mar 15, 2013 at 12:07 PM, Carl Meyer <carl at oddbird.net> wrote:
> On 03/15/2013 09:15 AM, PJ Eby wrote:
>> Do we even need the internal/external rel info?  I was planning to
>> just use the URL hostname.
>> i.e., are there any use cases for designating an externally-hosted
>> file internal, or an internally-hosted file external?  If not, it
>> seems the rel="" is redundant.
> Right; Donald and Holger already gave the rationale for this: there are
> good reasons for an index to not have "internal" links actually on the
> exact same hostname. Even just using a different subdomain would break
> simple host comparison.
>> It's also more work to implement, vs. just defaulting --allow-hosts to
>> be the --index-url host; a strategy ISTM pip could also use, since it
>> has the same two options available.
> Pip actually doesn't currently have --allow-hosts, although there's no
> good reason for that; it ought to.
>> Also, if we're not doing homepage/download crawling any more, I was
>> hoping we could just drop the code that 'parses' rel="" links in the
>> first place, as it's an awkward ugly hack.  ;-)
> Well, parsing HTML links as an API is an ugly hack, but within that
> existing framework "rel" seems like the appropriate semantic attribute
> for this type of information, not really upping the hackiness quotient :-)

Well, to be clear, I liked previous versions of the proposal better
than this one.  But while I *really* don't want to do any new rel
parsing, that's not the only or even the most important reason.

The main reason is that I think internal vs. external is a bogus
distinction: what's important (IMO) is what hosts you do and don't
trust.  Giving a blanket pass to all external links doesn't seem like
such a good idea to me, nor does allowing the index to define what
hosts the client should trust.  As for the internal ones, I'm not
sure why we can't at least require that they live on a subdomain of
the index host, or have users explicitly add a PyPI CDN to their
configured --allow-hosts.

To try to put it another way: there should be one, and preferably only
one, obvious way to specify where you get downloads from.  In
easy_install, that way is currently --allow-hosts.  Adding new options
that interact and overlap with it looks like bad UI design to me, and
increases the possibility of user confusion.
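
To make the host-based approach concrete, here is a rough sketch of
what "default --allow-hosts to the --index-url host" could look like.
It is not easy_install's or pip's actual code; the function names, the
example index URL, and the "*.pypi.python.org" CDN pattern are all
assumptions made up for illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def default_allowed_hosts(index_url):
    """Hypothetical default: trust only the host named in --index-url."""
    return [urlparse(index_url).hostname]


def is_allowed(url, allowed_hosts):
    """Return True if the URL's hostname matches any allowed glob pattern."""
    host = urlparse(url).hostname
    if host is None:
        # Relative or malformed links carry no host to check.
        return False
    host = host.lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowed_hosts)


# Assumed index URL, used only as an example.
index_url = "https://pypi.python.org/simple/"
allowed = default_allowed_hosts(index_url)

# A user who wants a CDN or subdomain adds it explicitly (assumed pattern).
allowed_with_cdn = allowed + ["*.pypi.python.org"]
```

With only the default list, a link on the index host passes and
everything else is rejected, including subdomains; the user opts in to
a CDN by extending the same list, so there is still just one setting
that decides where downloads may come from.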
