[Distutils] PyPI Migrated to New Infrastructure with some Breakage
donald at stufft.io
Sun Jan 26 00:38:53 CET 2014
Today (Sat Jan 25, 2014) the Infrastructure team has migrated PyPI to new
infrastructure.
The old infrastructure was:
- a single database server managed by OSUOSL
- a pair of load balancers shared by all of the python.org services hosted on
- a single backend VM that served as everything else for PyPI
The VM that was acting as the backend server for PyPI was partially hand
configured and partially set up to be managed by Chef. Additionally, it had an
issue that caused it to kernel panic every so often, which had been the cause of
a number of downtimes in the last few months. Because it was primarily
configured and administered by hand, and because of the way it was set up, it
was not feasible to have any sort of failover or spare.
The new infrastructure is:
- 2 Web VMs
- 2 Database servers in Master/Slave Configuration
- 2 PgPool Servers pooling connections to the database servers and load
balancing reads across them.
- 2 GlusterFS servers backed by Cloud Block Storage acting as the file storage
for package and package docs
- 1 metrics server to handle updating the download counts as they come in from
All of the VMs are hosted on Rackspace’s Public Cloud and have their
configuration completely controlled and managed using Salt. Going forward this
will allow us to easily scale out as required, or to kill malfunctioning servers
and spin up new ones. Additionally, things have been set up so that,
where possible, there are two servers performing the same role, ideally in an
Active/Active configuration but at least in a Master/Slave configuration. This
should allow PyPI to be far more stable moving forward and make downtimes much
easier to recover from.
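
To illustrate the idea of load balancing reads across a Master/Slave pair, here
is a minimal sketch in Python. It is not how PgPool is actually implemented or
configured for PyPI; the server names and the route() helper are hypothetical,
and the "starts with SELECT" check is a deliberate simplification.

```python
import random

# Hypothetical server names, for illustration only.
MASTER = "db-master"
READ_POOL = ["db-master", "db-slave"]


def route(sql, rng=random):
    """Pick a database server for one SQL statement.

    Simplifying assumption: a statement that starts with SELECT is a read
    and may go to either server; everything else must go to the master.
    """
    is_read = sql.lstrip().upper().startswith("SELECT")
    return rng.choice(READ_POOL) if is_read else MASTER
```

With this sketch, route("INSERT INTO releases VALUES (1)") always returns the
master, while a SELECT can land on either server.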
The services are still fronted by Fastly’s CDN, and in the new infrastructure
we’ve removed our load balancer and replaced it by having Fastly handle
the load balancing for us. Additionally, we’ve recently set up a static mirror of
PyPI that is updated once every minute. This is hosted on Rackspace cloud as
well but in a separate data center from the rest of PyPI. Fastly is configured
to fall back to this static mirror in the case that neither of the two web
heads is functioning. This should ensure that, even in the event of a
catastrophic failure of the PyPI service, the bulk of package installations
remain working.
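
The fallback behaviour described above can be sketched roughly as follows. This
is only an illustration in Python, not the actual Fastly configuration; the
backend names and the choose_backend() helper are hypothetical, and simple
round-robin is an assumption about how requests are spread.

```python
# Hypothetical backend names, for illustration only.
WEB_HEADS = ["web0", "web1"]
STATIC_MIRROR = "static-mirror"


def choose_backend(healthy, counter):
    """Return the backend that should serve the next request.

    healthy: set of web head names currently passing health checks.
    counter: request counter, used to round-robin across healthy heads.
    """
    candidates = [head for head in WEB_HEADS if head in healthy]
    if not candidates:
        # Neither web head is functioning: serve from the static mirror.
        return STATIC_MIRROR
    return candidates[counter % len(candidates)]
```

As long as at least one web head is healthy, requests never reach the static
mirror; it is used only when both are down.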
The bad news (and the “Breakage” from the subject) is that while the new
infrastructure was being planned out, built, and migrated to, the “pypissh”
package was forgotten. The pypissh package is an alternative way to upload
packages to PyPI; however, because of the way it works, it is very difficult to
provide HA support for it as we’ve set up for everything else. We don’t have
any numbers for how many people are actively using this package but looking
at a roughly 2 week chunk of time in PyPI’s download history, the pypissh
package was downloaded 7 times by a browser, and 7 times by pip. All other
downloads were caused by the mirroring system.
As of right now pypissh is non-functional. Due to the difficulty in HAing
and monitoring the current setup, and because it apparently has a very
small set of users, we would like to effectively kill off this particular
service. Additionally the benefits of pypissh have been reduced now that PyPI
is available over a TLS connection with a well trusted certificate. My question
to you is, is this something that distutils-sig is willing to have happen? If
we are to re-enable pypissh we’ll need to write a new solution for doing it that
can be properly HA’d, and we’d prefer to put our efforts into improving things
for a much larger set of people.
So yea, PyPI should be loads more stable and more reliable now.
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA