On Sat, Jan 25, 2014 at 3:38 PM, Donald Stufft <donald@stufft.io> wrote:
Today (Sat Jan 25, 2014) the Infrastructure team has migrated PyPI to new infrastructure.
The old infrastructure was:
- a single database server managed by OSUOSL
- a pair of load balancers shared by all of the python.org services hosted on OSUOSL
- a single backend VM that served as everything else for PyPI
The VM that was acting as the backend server for PyPI was partially hand configured and partially set up to be managed by Chef. Additionally, it had an issue that caused it to kernel panic every so often, which had been the cause of a number of downtimes in the last few months. Because it was primarily configured and administered by hand, and because of the way it was set up, it was not feasible to have any sort of failover or spare.
The new infrastructure is:
- 2 Web VMs
- 2 Database servers in Master/Slave Configuration
- 2 PgPool Servers pooling connections to the database servers and load balancing reads across them
- 2 GlusterFS servers backed by Cloud Block Storage acting as the file storage for packages and package docs
- 1 metrics server to handle updating the download counts as they come in from Fastly
All of the VMs are hosted on Rackspace’s Public Cloud and have their configuration completely controlled and managed using Salt. Going forward this
Can you say a little about the choice to use Salt instead of Chef? I don't really care either way, but am just curious. Is it because Salt is written in Python, or were there other reasons (functionality, etc.)?
--Chris
will allow us to easily scale out as required or kill malfunctioning servers and spin up new ones. Additionally, the infrastructure has been set up so that, where possible, there are two servers performing the same role, ideally in an Active/Active configuration but at least in a Master/Slave configuration. This should allow PyPI to be far more stable moving forward and make downtimes much easier to recover from.
The services are still fronted by Fastly’s CDN, and in the new infrastructure we’ve removed our load balancer and replaced it by having Fastly handle the load balancing for us. Additionally, we’ve recently set up a static mirror of PyPI that is updated once every minute. This is hosted on Rackspace cloud as well, but in a separate data center from the rest of PyPI. Fastly is configured to fall back to this static mirror in the case that neither of the two web heads is functioning. This should ensure that even in the event of a catastrophic failure of the PyPI service, the bulk of package installations should hopefully remain working.
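To make the fallback behaviour concrete, here is a rough sketch in Python of the selection rule described above. It is only an illustration and not the actual Fastly configuration: the function name, the host names, and the health-check callable are all made up for the example.

```python
from typing import Callable, List, Sequence


def candidate_backends(
    web_heads: Sequence[str],
    static_mirror: str,
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Return the backends that requests may be balanced across.

    Hypothetical helper: if at least one web head passes its health
    check, traffic is spread over the healthy web heads only. If none
    do, the static mirror becomes the sole backend.
    """
    healthy = [head for head in web_heads if is_healthy(head)]
    if healthy:
        return healthy
    return [static_mirror]


if __name__ == "__main__":
    # Placeholder host names, not the real PyPI machines.
    heads = ["web1.example", "web2.example"]
    mirror = "mirror.example"
    down = {"web1.example", "web2.example"}
    print(candidate_backends(heads, mirror, lambda h: h not in down))
```

With both placeholder web heads marked as down, the only candidate left is the mirror, which matches the "neither of the two web heads" case above.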
The bad news (and the “Breakage” from the subject) is that while the new infrastructure was being planned out, built, and migrated to, the “pypissh” package was forgotten. The pypissh package is an alternative way to upload packages to PyPI; however, because of the way it works, it is very difficult to provide HA support for it as we’ve set up for everything else. We don’t have any numbers for how many people are actively using this package, but looking at a roughly 2 week chunk of time in PyPI’s download history, the pypissh package was downloaded 7 times by a browser and 7 times by pip. All other downloads were caused by the mirroring system.
As of right now pypissh is non-functional, and due to the difficulty in HAing and monitoring the current setup, and because it apparently has a very small set of users, we would like to effectively kill off this particular service. Additionally, the benefits of pypissh have been reduced now that PyPI is available over a TLS connection with a well trusted certificate. My question to you is, is this something that distutils-sig is willing to have happen? If we are to re-enable pypissh we’ll need to write a new solution for it that can be properly HA’d, and we’d prefer to put our efforts into improving things for a much larger set of people.
So yea, PyPI should be loads more stable and more reliable now.
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig