[Catalog-sig] Proposal: Move PyPI static data to the cloud for better availability

Tarek Ziadé ziade.tarek at gmail.com
Tue Jun 15 19:02:05 CEST 2010


On Tue, Jun 15, 2010 at 6:02 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Alexis Métaireau wrote:
>> Hello,
>>
>> Firstly, as Tarek said in another thread, I'm afraid this kills PEP 381,
>> which is about building a mirroring infrastructure.
>> Having an infrastructure hosted on a cloud platform may be comfortable, and
>> is probably needed to have a system running 24/7, but
>> we need to take care to keep it possible to create new public mirrors
>> outside of the Amazon (or whatever) cloud infrastructure.
>
> The proposal doesn't prevent that. However, please note that
> setting up public mirrors not under PSF control has its own
> set of (legal) problems, which the PSF hosted cloud setup avoids.

Mirrors already exist out there, so unless you ban them (which would
be a really bad idea), setting up a cloud will not fix any legal issue,
if you think there is one.

In any case, you can't prevent people from creating mirrors even if you
say it's illegal. Moreover, having mirrors provided by the community
is way better than relying on a single entity (the PSF) for this
(if we think "decentralized").

So I think it would be better to focus on PEP 381 and make those
existing mirrors comply with it. And maybe work on the legal issues
you've mentioned.


>
>> On Tue, Jun 15, 2010 at 1:49 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>>
>>> PyPI is currently run from a single server hosted in The Netherlands
>>> (ximinez.python.org).  This server is run by a very small team of
>>> sysadmins.
>>>
>>
>> As Martin von Löwis said, this already exists: "a.mirrors.pypi.python.org
>>  and b.mirrors.pypi.python.org are already there and could be used by
>> clients". Martin, could you explain to us (apologies if this has already
>> been done somewhere) how things work right now? Is it possible to rely on the
>> existing work rather than using a cloud system? What infrastructure is
>> already in place?
>
> In order to use those two servers, you'd still need to implement
> the redirection changes or client side tool changes and, what's
> more important, you'd need to administer and monitor those servers
> 24/7 to achieve similar uptime.

Not at all, because the registered mirrors would be in the DNS round robin,
and clients would just have to switch to another mirror if one
is down (that's explained in PEP 381).
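
To make the failover idea concrete, here is a minimal sketch of what a
client could do. It is only an illustration, not the PEP 381 client
protocol itself: the mirror list reuses the two hostnames quoted above,
and the names MIRRORS, http_fetch and fetch_with_failover are
hypothetical.

```python
import urllib.request

# Hypothetical mirror list; the hostnames are the two quoted in this thread.
MIRRORS = [
    "http://a.mirrors.pypi.python.org",
    "http://b.mirrors.pypi.python.org",
]


def http_fetch(url, timeout=5):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_failover(path, mirrors=MIRRORS, fetch=http_fetch):
    """Try each mirror in order; return (url, body) from the first that answers.

    `fetch` is injectable so the failover logic can be exercised
    without touching the network.
    """
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return url, fetch(url)
        except OSError as exc:
            # URLError, HTTPError and socket timeouts are all OSError subclasses.
            errors.append((url, exc))
    raise RuntimeError("all mirrors failed: %r" % (errors,))
```

The point is that a dead mirror only costs the client one failed
request before it moves on to the next one, so nobody has to watch a
given server 24/7.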

Such a decentralized system is far more reliable than any centralized
system, and won't cost the PSF anything.


>
> The latter is what the proposal is all about: we're outsourcing
> the administration and monitoring to a service provider.

Having a better PyPI server is of course a good idea, don't get me wrong.

But it doesn't really solve anything at this point.

A simple, documented protocol and a list of registered mirrors
backed by the community is the way to go, IMHO.

And that's what has unofficially happened already! When PyPI is
down, you'll see tweets saying "go to this URL, it's my mirror!"

So I would trust the community and finish the PEP and provide a
library that would allow anyone to run a PEP 381-compatible mirror.

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

