On Tue, Sep 27, 2011 at 8:40 AM, Tarek Ziadé
On Tue, Sep 27, 2011 at 2:27 PM, Jim Fulton
wrote: ... I understand where you're coming from but, ..
Sorry, I don't understand what you imply here.
I understand why you don't want to rely on a proprietary solution.
I think it's saner to rely on proven technology than to invent our own protocol. NIH?
Ah sorry I misunderstood then. I thought CloudFront was a proprietary platform, with its own protocol.
It's a reverse proxy. You point it at S3 and at a web server and it caches. Of course, it has aspects that are specific to its implementation.
If you're saying that we can move away from CloudFront at any time and have the same feature elsewhere, then it's perfect.
If we move to something else, *some* changes will be necessary, but we can certainly move. I agree, it's perfect. ;)
If you're saying that CloudFront is proven technology and that we should not worry about relying on them, then I think we can do better for the community than to get locked in for this, and continue to work on an open protocol where everyone can participate by providing a spare server. But maybe that's just me?
It's nice to have a hobby. :) But I don't want to have to update buildout *just* because of an itch to have a custom protocol.
Most of the mirroring protocol was inspired by Perl's CPAN btw.
<shrug>
But the use case is usually: PyPI is down, we fall back to a mirror. I don't think it's more complicated than this.
I don't agree. On multiple levels. PyPI is often up but slow. It's also in the wrong place. A CDN should provide better performance, reliability and locality.

A client has to:

- try PyPI
- fall back to "last"
- if that's down, decide what other indexes to check

I don't see how having timestamps helps unless you know what the current timestamp is, or unless you say that you'll reject a mirror with a timestamp more than some period in the past. It's not clear what this time delta should be and, in any case, the client needs to first validate a mirror by checking its timestamp. I think this protocol is going to be hard to get right.
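A rough sketch of the client-side steps listed above, to show where the extra round trips and the open question about the time delta come in. Everything here is an assumption made for illustration: the `probe` callable, the index names, and the one-day `MAX_AGE_SECONDS` threshold are hypothetical and are not taken from the mirroring protocol or from any existing client.

```python
import time

# Assumed threshold; the thread leaves the right value open.
MAX_AGE_SECONDS = 24 * 60 * 60


def is_fresh(mirror_timestamp, now=None, max_age=MAX_AGE_SECONDS):
    """Return True if the last-sync time is within max_age seconds of now."""
    if now is None:
        now = time.time()
    return (now - mirror_timestamp) <= max_age


def choose_index(candidates, probe, now=None, max_age=MAX_AGE_SECONDS):
    """Return the first candidate that is reachable and not stale.

    candidates: index names in preference order (main index first).
    probe: hypothetical callable returning the index's last-sync Unix
           timestamp, or None if the index is unreachable.
    """
    for name in candidates:
        stamp = probe(name)  # one extra round trip per candidate
        if stamp is None:
            continue  # down or timed out: try the next candidate
        if is_fresh(stamp, now=now, max_age=max_age):
            return name
    return None  # nothing usable
```

Note that the client has to probe each candidate before it can trust it, and that the outcome depends entirely on the value chosen for `max_age`.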
- It either requires extra DNS calls or relies too heavily on the last mirror, which is likely to be the least reliable.
Once you have the list, I don't think you require an extra call.
see http://hg.python.org/cpython/file/84280fac98b9/Lib/packaging/pypi/mirrors.py
It has to make extra DNS calls to resolve the other mirror names to IPs.
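To illustrate the point about extra lookups: building the list of mirror names needs no network access, but turning each name into an address costs one DNS query per name unless a cache already holds it. The sketch below assumes a single-letter naming scheme (`a.`, `b.`, ... up to the letter of the last mirror); `mirror_names` and `resolve_all` are hypothetical helpers, not the functions in the linked `mirrors.py`.

```python
import socket
import string


def mirror_names(last_name):
    """Expand e.g. 'c.pypi.python.org' into the a., b., c. host names.

    Assumes single-letter mirror prefixes, for simplicity.
    """
    letter, domain = last_name.split(".", 1)
    end = string.ascii_lowercase.index(letter) + 1
    return ["%s.%s" % (c, domain) for c in string.ascii_lowercase[:end]]


def resolve_all(names, resolve=socket.gethostbyname):
    """Resolve each mirror name to an IPv4 address (None on failure).

    Each call to resolve() is a separate lookup.
    """
    addresses = {}
    for name in names:
        try:
            addresses[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            addresses[name] = None
    return addresses
```

Passing a different `resolve` callable makes it easy to count the lookups or to test without touching the network.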
Life is short. We don't have to invent this ourselves.
Ah well, yeah -- Not sure what you are proposing right now.
If you imply that everything should be solved on server-side, and that we should not have mirroring
I think we should pick a good CDN and use it.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton