[Catalog-sig] Proposal: Move PyPI static data to the cloud for better availability

Tue Jun 15 23:24:23 CEST 2010

>> That's not at all accurate: PEP 381 is almost completely implemented
>> in the mirroring tools.
>
> Which parts of PEP 381 are implemented ?

For the mirrors themselves: everything except for the propagation of
download counters.

> It's important not to require changes on the client side.

I disagree. It's the only way to provide reliable protection against
server failures. The client code must initiate the fallback, e.g. after
a timeout.
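
To illustrate what a client-initiated fallback could look like, here is
a minimal sketch. It is not taken from any existing tool: the function
name, the host names in the usage note and the injectable "fetch"
parameter are all made up for the example.

```python
import urllib.request


def fetch_with_fallback(path, hosts, timeout=5.0, fetch=None):
    """Try each host in order; fall back to the next on error or timeout.

    `hosts` is a list of host names chosen by the client (hypothetical
    here). `fetch` can be replaced for testing; by default it performs a
    plain HTTP GET with the given timeout.
    """
    if fetch is None:
        def fetch(url, timeout):
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()

    errors = []
    for host in hosts:
        url = "http://%s%s" % (host, path)
        try:
            return host, fetch(url, timeout)
        except OSError as exc:
            # URLError and socket timeouts are both OSError subclasses,
            # so connection failures and timeouts end up here.
            errors.append((host, exc))
    raise RuntimeError("all servers failed: %r" % (errors,))
```

A caller would pass the master first and one or more copies after it,
e.g. fetch_with_fallback("/simple/", ["master.example.org",
"copy.example.org"]). The point is only that the decision to give up on
one server and try another is made in the client.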

>> For normal operation (i.e. on the master copy), this would be really
>> insufficient. Users expect, in automated build processes, that the
>> packages they upload are available for *immediate* download.
>
> Power users and developers will probably want that, but those
> can hook up to the PyPI server directly if they have such a
> need.

Under your proposal, how precisely would they do that?

>> There is a good chance that, before that proposal is implemented,
>> the PEP 381 implementation is completed.
>
> Including getting all client side package tools updated and
> deployed to the existing users ?

That depends on how long the proposal takes to implement.

However, I don't think it is necessary to have the tools updated
and deployed to all existing users. Instead, it is sufficient that
people who worry about server outages get the tools deployed; for
this, the answer is "yes".

>> Not sure why you wouldn't push every change immediately to the CDN, though.
>
> The proposal wants to do without changing PyPI code where
> possible.

-1000. What's the rationale for not modifying PyPI code?

Are you, by any chance, proposing that this CDN propagation tool does a 
full PyPI traversal every 20 minutes???

> While this would be good to have and provide a better
> user experience, it's not required. The user would just need
> to restart the command and then get a new server IP address
> to try - just like you do in a web browser if a page doesn't
> load. That's still a lot better than not being able to download
> anything at all.

I think this depends a lot on the client setup. For example, on
my machine, I don't get a different IP address for www.google.com
each time, using the DNS server in my Fritzbox router.
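
Anybody can check this for their own setup. The following is a rough
sketch (the function name and the "resolve" parameter are invented for
illustration) that resolves a name several times and records which
address comes first each time:

```python
import socket


def first_addresses(hostname, attempts=5, resolve=socket.getaddrinfo):
    """Resolve `hostname` repeatedly and collect the first address returned.

    If every entry in the result is the same, simply restarting a
    command would most likely contact the same server again.
    """
    firsts = []
    for _ in range(attempts):
        infos = resolve(hostname, 80, proto=socket.IPPROTO_TCP)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address as a string.
        firsts.append(infos[0][4][0])
    return firsts
```

If len(set(first_addresses("www.google.com"))) is 1, the local resolver
keeps handing out the same first address, and a retry will not reach a
different server.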

> The mirror PEP shares this problem with the cloud proposal.

Except that it gives the client the explicit choice of which copy to
get the data from.
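
As a sketch of how a client could enumerate its choices: PEP 381 names
the servers a.pypi.python.org, b.pypi.python.org, ... (continuing with
aa, ab, ... after z) and has last.pypi.python.org point at the last
name in use. Assuming the client has already looked up that last name,
a helper along these lines (the function name is made up) would list
all candidates:

```python
import itertools
import string


def mirror_names(last, domain="pypi.python.org"):
    """List host names from 'a' up to and including the label of `last`.

    `last` is the host name that last.pypi.python.org points at, e.g.
    "d.pypi.python.org". Labels follow the sequence a, b, ..., z, aa, ab, ...
    """
    label = last.split(".")[0]
    names = []
    for length in range(1, len(label) + 1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            current = "".join(letters)
            names.append("%s.%s" % (current, domain))
            if current == label:
                return names
    raise ValueError("unexpected mirror label: %r" % label)
```

The client can then pick any entry from that list, or try them in
whatever order it prefers.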

Regards,
Martin