[Catalog-sig] Proposal: Move PyPI static data to the cloud for better availability

Tarek Ziadé ziade.tarek at gmail.com
Tue Jun 15 19:24:29 CEST 2010

On Tue, Jun 15, 2010 at 7:15 PM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> On 15 Jun, 2010, at 19:02, Tarek Ziadé wrote:
>> On Tue, Jun 15, 2010 at 6:02 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>> Alexis Métaireau wrote:
>>>> Hello,
>>>> Firstly, as Tarek said in another thread, I'm afraid this kills PEP 381,
>>>> which is about building a mirroring infrastructure.
>>>> Having an infrastructure hosted on a cloud platform may be comfortable, and
>>>> is probably needed to have a system running 24/7, but
>>>> we need to make sure it remains possible to create new public mirrors
>>>> outside the Amazon (or whatever) cloud infrastructure.
>>> The proposal doesn't prevent that. However, please note that
>>> setting up public mirrors not under PSF control has its own
>>> set of (legal) problems, which the PSF hosted cloud setup avoids.
>> Mirrors already exist out there, so unless you ban them (which would
>> be a really bad idea),
>> setting up a cloud will not fix any legal issue, if you think there is
>> one.
>> In any case, you can't prevent people from creating mirrors even if you
>> say it's illegal. Moreover, having mirrors provided by the community
>> is way better than relying on one single entity (the PSF) for this
>> (if we think "decentralized").
> Why is having community mirrors better than one managed by the PSF?

Because it's no longer controlled by one single entity. For example,
if something is broken in the system
and needs human intervention, and the sysadmins are not
available, we get downtime.

Lots of mirrors backed by more people in the community greatly reduce
this problem.

> Even with community mirrors, the contents of PyPI are still controlled by the PSF, because they control the master server; there is not much decentralization in that respect.

Once the DNS is set to accept other servers, the PyPI 'main' server is
just the master that gets the content first, which is then replicated.

So, yes, the PSF controls the DNS, but it will no longer control the
downtime/uptime issues.

> AFAIK the goal of this exercise is to improve the uptime of the PyPI download service as used by existing installations. MAL's proposal seems like an easy way to accomplish that with minimal effort.

Again, mirrors already exist out there, and they are getting updated
every day. We are not far from what we want. So after more thought, I
really don't think the cloud thing will
be a minimal effort.

> Ronald

Tarek Ziadé | http://ziade.org
