[Catalog-sig] Proposal: Move PyPI static data to the cloud for better availability
exarkun at twistedmatrix.com
Fri Jun 18 23:47:00 CEST 2010
On 09:39 pm, ziade.tarek at gmail.com wrote:
>On Thu, Jun 17, 2010 at 6:30 AM, Ian Bicking <ianb at colorstudy.com>
>>On Wed, Jun 16, 2010 at 1:37 PM, "Martin v. Löwis"
>><martin at v.loewis.de>
>>>>It is likely that some people will set up a mirror and then "forget"
>>>>about it. Like our buildbots really.
>>>The same can happen to any infrastructure, though. Amazon may decide to
>>>change the setup, and then the automated update procedure would
>>>Of course, they would give advance notice - but then somebody would
>>>have to react to that advance notice.
>>That's not very likely, and if something does change it will be
>>well announced and documented. Amazon is providing a commercial service
>>lots of people rely on, their process is formalized and
>>And if Amazon makes mistakes they'll figure out how to avoid them next
>>time, while mirror providers are a rotating crew that is unlikely to easily
>>or reliably learn from past mistakes.
>if a mirror manager doesn't do a good job, he'll just be taken out of
>the ring after a while.
>If we depend 100% on Amazon, and if there's a problem, the mirroring
>will be down for the time being and we won't be able to do anything
>>If we actually understood each time PyPI
>>broke and fixed it none of this would be a problem; I'm not blaming
>>for that, but it's also not going to change and adding lots of mirror
>>systems just adds more systems with exactly the same management
>>that our current system has.
>Yes, but the difference is that you don't put all your eggs in the same
>basket: it's very unlikely that ALL community mirrors will be down at the same
>time, and a fall-back mechanism on the client side will raise the availability.
>About Amazon: what will happen in 5 years with their offer? Will our
>Cloud-PyPI infrastructure still work? What will be the workload
>to maintain it? You can't
>be 100% sure the Python community will be able to dedicate that time.
>PyPI works today because it is not forced by a third party to evolve,
>it can evolve at its own pace.
>On the contrary, once the mirror system is set up, it will be dead easy
>to add/remove a mirror in the ring, and each node won't act as a SPOF
>IMHO it's a bad idea to make this piece of our infrastructure depend
>on one third party commercial entity, where we can provide a community
There are (multiple!) open source implementations of the Amazon API. If
Amazon decides to discontinue their cloud services (something I doubt
should really be one of the top ten concerns here), then anyone else can
set up their own cloud with the same interface.
If I were going to run a PyPI mirroring service, I'd probably want to do
it this way *anyway* because managing virtual machines is far easier
than managing actual hardware.
So there are probably many other much more significant issues to be