[Catalog-sig] Proposal: Move PyPI static data to the cloud for better availability

"Martin v. Löwis" martin at v.loewis.de
Tue Jun 15 23:24:23 CEST 2010

>> That's not at all accurate: PEP 381 is almost completely implemented
>> in the mirroring tools.
> Which parts of PEP 381 are implemented ?

For the mirrors themselves: everything except for the propagation of
download counters.

> It's important not to require changes on the client side.

I disagree. It's the only way to provide reliable protection against
server failures. The client code must initiate the fallback, e.g. after
a timeout.
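
To illustrate what I mean by client-initiated fallback, here is a minimal
sketch, not code from any existing tool. The function name, the `opener`
parameter, and the host names in the usage note are hypothetical; the only
assumption is that the client has an ordered list of servers holding the
same files.

```python
import urllib.request


def fetch_with_fallback(path, servers, timeout=5.0, opener=None):
    """Try each server in order; return (server, data) from the first that answers.

    `opener` is a hypothetical hook taking (url, timeout) and returning bytes;
    it defaults to a plain urllib download and exists so the logic can be
    exercised without network access.
    """
    if opener is None:
        def opener(url, t):
            return urllib.request.urlopen(url, timeout=t).read()

    errors = []
    for server in servers:
        url = "http://%s%s" % (server, path)
        try:
            return server, opener(url, timeout)
        except OSError as exc:
            # URLError, connection errors and timeouts are all OSError
            # subclasses; remember the failure and move on to the next copy.
            errors.append((server, exc))
    raise RuntimeError("all servers failed: %r" % (errors,))
```

A caller would pass the master first and the mirrors after it, e.g.
`fetch_with_fallback("/simple/foo/", ["master.example", "mirror.example"])`.
The point is only that the decision to give up on one server and try the
next is made in the client, after its own timeout.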

>> For normal operation (i.e. on the master copy), this would be really
>> insufficient. Users expect, in automated build processes, that the
>> packages they upload are available for *immediate* download.
> Power users and developers will probably want that, but those
> can hook up to the PyPI server directly if they have such a
> need.

Under your proposal, how precisely would they do that?

>> There is a good chance that, before that proposal is implemented,
>> the PEP 381 implementation is completed.
> Including getting all client side package tools updated and
> deployed to the existing users ?

That depends on how long the proposal takes to implement.

However, I don't think it is necessary to have the tools updated
and deployed to all existing users. Instead, it is sufficient that
people who worry about server outages get the tools deployed; for
this, the answer is "yes".

>> Not sure why you wouldn't push every change immediately to the CDN, though.
> The proposal wants to do without changing PyPI code where
> possible.

-1000. What's the rationale for not modifying PyPI code?

Are you, by any chance, proposing that this CDN propagation tool does a 
full PyPI traversal every 20 minutes???

> While this would be good to have and provide a better
> user experience, it's not required. The user would just need
> to restart the command and then get a new server IP address
> to try - just like you do in a web browser if a page doesn't
> load. That's still a lot better than not being able to download
> anything at all.

I think this depends a lot on the client setup. For example, on
my machine, I don't get a different IP address for www.google.com
each time, using the DNS server in my Fritzbox router.
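
For anybody who wants to check what their own resolver hands out, a small
sketch along these lines would do. The function name and the `resolver`
parameter are hypothetical; `socket.getaddrinfo` is the standard library
call. Run it a few times against the same host name and compare the results.

```python
import socket


def resolved_addresses(host, port=80, resolver=socket.getaddrinfo):
    """Return the distinct addresses the resolver reports for host, in order.

    `resolver` is a hypothetical hook with the same signature as
    socket.getaddrinfo, so the function can be tried without a network.
    """
    infos = resolver(host, port, 0, socket.SOCK_STREAM)
    addresses = []
    for family, socktype, proto, canonname, sockaddr in infos:
        addr = sockaddr[0]  # first element is the IP address for IPv4 and IPv6
        if addr not in addresses:
            addresses.append(addr)
    return addresses
```

If the list and its order are identical on every call, restarting the
command will not get the client a different server.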

> The mirror PEP shares this problem with the cloud proposal.

Except that it gives the client the explicit choice of which copy to get
the data from.

