[pydotorg-www] [PSF-Members] [Infrastructure] Wiki news?
jnoller at gmail.com
Wed Jan 16 16:30:21 CET 2013
On Wednesday, January 16, 2013 at 10:05 AM, Stephan Deibel wrote:
> M.-A. Lemburg wrote:
> > I've been able to recover the pages from archive.org and have also
> > tried Google cache (which failed due to limits on the number of
> > allowed requests) and Yahoo/Bing cache. The latter worked, but
> > only returns a small fraction of the pages we have had in the wiki -
> > about 300+ pages. They are more recent than the archive.org ones,
> > though, so I'm trying to merge the Yahoo archive ones back into the
> > archive.org recovery.
> > I recovered around 4500 pages from archive.org... in HTML. Reimar
> > has a tool to convert them back into wiki markup, which we'll
> > try to use to prepare an import.
> > Meanwhile I'm also trying to see whether we can still extract some
> > data from the broken VM image. It does show traces of the wiki
> > file contents, so the data still exists on the image in some
> > form. Noah already tried extundelete with no success. I'm going
> > to give some of the other tools a try as well, e.g. ext4magic
> > or PhotoRec.
> Phew, sounds like fun... thanks for everyone's work on this!
> Can someone explain (to PSF members list) how it ended up that there
> were no backups? I'm not trying to put anyone on the spot, just trying
> to (a) understand how this happened, making it so hard to recover, and
> (b) make sure that python.org and other important resources _are_ being
> backed up in a way that prevents this kind of thing from taking down
> services for a long time.
> - Stephan
Noah can expand on this as Infrastructure lead, but the short version is this: last year we got some beefy donations and hosting from OSU/OSL, which allow us to run our own VM infrastructure and isolate/spin up new servers at will (which is great). We've been slowly migrating the old services to the new systems.
Our backups are currently handled via services donated by Tummy.com. In the transition, one of the things which had to be done was to update those backups to point to the new virtual machines. This happened for some of the more "mission critical" virtual machines, but unfortunately one of the machines which fell through the cracks was the wiki machine, which hosts not just one Moin instance but every single wiki the PSF hosts (including the members wiki, etc.).
Due to this, when the server was compromised and the data deleted sometime around the 28th of December via a zero-day exploit in MoinMoin, we lost all data from the move to OSU.
We have coordinated with Noah, Sean at Tummy, etc. to ensure all VMs hosted at the new setup are on a rigorous backup regime (offsite via Tummy). In addition to this, Noah is deploying an on-site backup system and coordinating with OSU to ensure we have secondary, on-site backups of everything.
This ultimately comes down to a miscommunication/miss on our part, and we are examining ways to backfill our volunteer team with paid services and leveraging the services OSU offers to ensure we have good backups, support and other things we may lack today.
Thanks go out to Noah for identifying and triaging the issue as best as possible, and to Marc-Andre and others for working to recover what they can from the compromised virtual machine and web archives.
All of our infrastructure is managed by Chef (https://github.com/coderanger/psf-chef/tree/master/roles) and Ganeti at OSU.
Currently being backed up are:
This also includes "non PSF" assets such as the PyPy assets we are now hosting for free. As I said, this is a combination of communication issues and volunteer load. The board is examining paid backup/leads where needed and/or leveraging OSU's services and administration.
Director, Python Software Foundation
Chair, PyCon 2013 - http://us.pycon.org
jnoller at gmail.com / jnoller at python.org