[Distutils] Fwd: The state of PyPI
jim at zope.com
Tue Sep 27 17:35:49 CEST 2011
On Tue, Sep 27, 2011 at 8:40 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> On Tue, Sep 27, 2011 at 2:27 PM, Jim Fulton <jim at zope.com> wrote:
>> I understand where you're coming from but, ..
> Sorry, I don't understand what you imply here.
I understand why you don't want to rely on a proprietary solution.
>> I think it's saner to rely on proven technology
>> than to invent our own protocol. NIH?
> Ah sorry I misunderstood then. I thought CloudFront was a proprietary
> platform, with its own protocol.
It's a reverse proxy. You point it at S3 and at a web server and it caches.
Of course, it has aspects that are specific to its implementation.
> If you're saying that we can move away from CloudFront at any time and
> have the same feature elsewhere, then it's perfect.
If we move to something else, *some* changes will be necessary,
but we can certainly move. I agree, it's perfect. ;)
> If you're saying that CloudFront is proven technology and that we
> should not worry about relying on them, then I think we can do better
> for the community than to get locked in to this, and continue to work on
> an open protocol where everyone can participate by providing a spare
> server. But maybe that's just me ?
It's nice to have a hobby. :)
But I don't want to have to update buildout *just* because of an itch
to have a custom protocol.
> Most of the mirroring protocol was inspired by Perl's CPAN btw.
> But the use case is usually: PyPI is down, we fall back to a mirror. I
> don't think it's more complicated than this.
I don't agree, on multiple levels. PyPI is often up but slow.
It's also in the wrong place. A CDN should provide better performance,
reliability and locality.
A client has to:
- try PyPI
- fall back to "last"
- if that's down, decide what other indexes to check
I don't see how having timestamps helps unless you know
what the current timestamp is, or unless you say that you'll reject
a mirror with a timestamp more than some period in the past.
It's not clear what this time delta should be and, in any case,
the client needs to first validate a mirror by checking its timestamp.
I think this protocol is going to be hard to get right.
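
To make the client-side steps and the timestamp check above concrete, here
is a minimal sketch, not the actual mirroring code. The names choose_index
and probe are hypothetical, the index names are placeholders, and the
one-day max_age is an arbitrary assumption (the thread does not say what
the delta should be).

```python
from datetime import datetime, timedelta


def choose_index(main, mirrors, probe, now, max_age):
    """Pick an index: the main one if reachable, else a fresh mirror.

    probe is a hypothetical callable: it takes an index name and returns
    the datetime of that server's last sync, or raises OSError if the
    server is down.
    """
    try:
        probe(main)
        return main
    except OSError:
        pass  # main index is down; fall through to the mirrors

    for mirror in mirrors:  # "last" first, then whatever else was chosen
        try:
            synced = probe(mirror)
        except OSError:
            continue  # this mirror is down too
        # Reject a mirror whose timestamp is too far in the past.
        if now - synced <= max_age:
            return mirror
    return None  # nothing usable


# Illustrative data only: main is down, "last" is stale, "b" is fresh.
now = datetime(2011, 9, 27, 12, 0)
status = {
    "main": None,
    "last": now - timedelta(days=3),
    "b": now - timedelta(hours=1),
}


def fake_probe(name):
    if status[name] is None:
        raise OSError(name + " is down")
    return status[name]


chosen = choose_index("main", ["last", "b"], fake_probe, now, timedelta(days=1))
```

Note that a main index that is up but slow still wins in this sketch, and
every mirror costs one extra request just to validate its timestamp.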
>> - It either requires extra DNS calls or relies too heavily on the last
>> mirror, which is likely to be the least reliable.
> Once you have the list, I don't think you require an extra call.
> see http://hg.python.org/cpython/file/84280fac98b9/Lib/packaging/pypi/mirrors.py
It has to make extra DNS calls to resolve the other mirror names to IPs.
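
A rough sketch of where those lookups come from, assuming mirrors are named
with single letters from "a" up to the letter of the "last" mirror. The
helpers mirror_hostnames and resolve_all are hypothetical and are not taken
from mirrors.py; the resolver is passed in so the lookups can be counted.

```python
import socket
import string


def mirror_hostnames(last_host):
    """Expand 'c.pypi.python.org' into the a, b and c mirror names.

    Simplification: assumes a single-letter prefix.
    """
    prefix, _, domain = last_host.partition(".")
    letters = string.ascii_lowercase
    end = letters.index(prefix)
    return [letter + "." + domain for letter in letters[: end + 1]]


def resolve_all(hostnames, resolve=socket.gethostbyname):
    """Resolve every name; each entry is one DNS lookup."""
    return {host: resolve(host) for host in hostnames}
```

Computing the list of names is free, but connecting to any mirror in it
still needs one resolution per name, on top of the lookup for "last".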
>> Life is short. We don't have to invent this ourselves.
> Ah well, yeah -- Not sure what you are proposing right now.
> If you imply that everything should be solved on server-side, and that
> we should not have mirroring
I think we should pick a good CDN and use it.