Easing the burden of upgrades

We recently had a problem on python.org while upgrading to a new release of Mailman, and it got me thinking about a better way to handle upgrades. I'll describe what I think the problem is, sketch out a proposed solution, and throw it out to y'all to discuss. Ken is as much responsible for the good ideas in this message as I am (blame me for the lousy ones); thanks to him for sitting down and thrashing this out first.
The fundamental problem with a system like Mailman is that it is extremely difficult to test. The project is obviously not mature enough to have much of a test suite (if any), and writing one that tests all the interactions between MUA, MTA, Web browser, server, Python, and Mailman will be daunting to say the least. It'll be fantastic when we have even the framework for such a beast, but until then...
So most of our testing involves creating and managing little toy lists, with us flogging the most noticeable features of Mailman to make sure the common stuff hasn't broken. The problem is that for sites using Mailman in an operational system, flag day (i.e. the day the upgrade to a new version occurs) can be pretty traumatic if we missed something crucial but peculiar to a site. So I'm really concerned with how to make life less stressful for the operations folks who are relying on Mailman for their bread and butter. New features, fixes, etc. are a strong incentive for those people to upgrade, but the fear of breakage (resulting in thousands of angry members) probably far outweighs that incentive. Can we make the transition to a new version more controlled?
You could go with a low-tech approach of installing new versions temporarily with a different $prefix, using symbolic links to share list databases and templates, and hacking /etc/aliases as lists are converted to the new version. The one insurmountable problem that I see is the CGI URL. You can't run two different CGI bin dirs side by side without exposing the difference to users through the list URLs. This is, IMO, a showstopper; the most visible aspect of the system should be the most stable.
Ken and I came up with the following architecture, and I'd like to see what y'all think:
Now a new release comes out and Mailman automatically installs into the future parity (more on this below). None of the lists are automatically switched though. The site admin can switch the parity of individual lists. Let's say there's a command called `upgrade'. So
upgrade toylist
would switch `toylist' to the future parity. The site admin could run with this for a while, and get all warm and fuzzy about the new release. He would then
upgrade reallist1
and repeat the process. Let's say he's upgraded three of his thirty lists and now has a lot of confidence in the new version. He then does
upgrade *
This converts all the lists to the future parity. There's one more twist however: the "system" itself is still running on the current parity even though all the lists are running on the future parity. One more command
commit-parity
would now commit to the new release; the future parity becomes the current parity and the current parity becomes the future parity. Maybe this auto-upgrades any lists still on the current parity. Data such as the list databases and templates would live outside the installed parity source code subdirs (more below).
If at any time before the commit-parity is run, the site admin gets cold feet, he can
downgrade reallistx
or
downgrade *
to return the list or lists to the current parity.
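The upgrade / downgrade / commit-parity cycle described above can be sketched as a small state machine. This is only an illustration under stated assumptions, not a proposed implementation: the class name, the slot names "a" and "b", and the method names are all made up here, and the sketch assumes that commit-parity pulls along any lists still on the current parity (the message only says "maybe").

```python
from dataclasses import dataclass, field


@dataclass
class ParityState:
    """Tracks which of two code slots the system and each list run from."""

    current: str = "a"  # slot the system itself runs from (names are made up)
    lists: dict = field(default_factory=dict)  # list name -> slot

    @property
    def future(self):
        # The slot that is not current; a new release would be installed here.
        return "b" if self.current == "a" else "a"

    def add_list(self, name):
        self.lists[name] = self.current

    def parity_of(self, name):
        return "current" if self.lists[name] == self.current else "future"

    def upgrade(self, *names):
        self._move(names, self.future)

    def downgrade(self, *names):
        self._move(names, self.current)

    def commit_parity(self):
        # Swap roles: the future slot becomes current and vice versa.
        self.current = self.future
        # Assumption: lists left behind are pulled along automatically.
        for name in self.lists:
            self.lists[name] = self.current

    def _move(self, names, slot):
        for name in names:
            if name not in self.lists:
                raise KeyError("no such list: " + name)
            self.lists[name] = slot


state = ParityState()
for name in ("toylist", "reallist1", "reallist2"):
    state.add_list(name)
state.upgrade("toylist")      # try the new release on one list
state.downgrade("toylist")    # cold feet: back to the current parity
state.upgrade(*state.lists)   # the equivalent of `upgrade *`
state.commit_parity()         # the future slot is now the current one
```

Tracking a slot per list, rather than the words "current" and "future", means commit-parity only has to flip one value for the system; every list that was already migrated becomes "current" without being touched.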
Now, when Mailman is installed, it always installs to the future parity, BUT ONLY IF ALL THE LIST PARITIES AND THE SYSTEM PARITY ARE CURRENT. If it ever notices that some lists are set to the future parity, but the system is still at the current parity, Mailman refuses to install.
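The install-time check could look something like the following. It is a minimal sketch: the exception and function names are invented for illustration, and parities are represented as the plain strings "current" and "future".

```python
class InstallRefused(Exception):
    """Raised when a new release must not be installed yet."""


def check_install_allowed(system_parity, list_parities):
    """Raise InstallRefused unless the system and every list are 'current'.

    list_parities maps list name -> "current" or "future".
    """
    if system_parity != "current":
        raise InstallRefused("system parity is %s, not current" % system_parity)
    pending = sorted(n for n, p in list_parities.items() if p != "current")
    if pending:
        raise InstallRefused(
            "lists still on the future parity: " + ", ".join(pending)
        )


# Everything is current, so an install into the future parity may proceed.
check_install_allowed("current", {"toylist": "current", "reallist1": "current"})

# A half-finished migration blocks the install.
try:
    check_install_allowed("current", {"toylist": "future", "reallist1": "current"})
except InstallRefused as err:
    print("refusing to install:", err)
```

Presumably the point of refusing is that a fresh install would overwrite the future-parity code that some lists are already running on.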
There would probably be a command to view the parities for lists and the system. I don't think the implementation would be that difficult. A single file containing the parity status for each list and the system would be about the only database you'd need. The installed tree would change a bit. You'd probably have two directories inside $prefix, one that contains the current parity code and one that contains the future parity code. The list databases would live outside these two trees, directly under $prefix. List-specific templates would be moved out of the templates directory, into $prefix/lists.
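To make the single status file and the two code trees concrete, here is a rough sketch. Everything specific in it is an assumption made for illustration: the file name `parity.json`, the JSON format, the directory names `code-a` and `code-b`, and the function names. The message itself only calls for one status file, two code directories inside $prefix, and list data under $prefix/lists.

```python
import json
from pathlib import Path


def save_parities(prefix, system_slot, list_slots):
    """Write the one status file: the system's slot plus a slot per list."""
    path = Path(prefix) / "parity.json"  # hypothetical file name
    data = {"system": system_slot, "lists": list_slots}
    path.write_text(json.dumps(data, indent=2, sort_keys=True))
    return path


def load_parities(prefix):
    data = json.loads((Path(prefix) / "parity.json").read_text())
    return data["system"], data["lists"]


def code_dir(prefix, slot):
    """Directory holding the installed code for one slot (hypothetical names)."""
    return Path(prefix) / ("code-" + slot)


def list_dir(prefix, name):
    """Per-list data and templates live outside both code trees."""
    return Path(prefix) / "lists" / name


def show_parities(prefix):
    """Text for a 'view parities' command: one line per list plus the system."""
    system_slot, list_slots = load_parities(prefix)
    lines = ["system: current (slot %s)" % system_slot]
    for name in sorted(list_slots):
        label = "current" if list_slots[name] == system_slot else "future"
        lines.append("%s: %s (slot %s)" % (name, label, list_slots[name]))
    return "\n".join(lines)
```

With this layout, finding the code that serves a given list is one lookup in the status file followed by `code_dir(prefix, slot)`, and the list's own data never moves during an upgrade.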
I'd love to get some feedback from people. Is this really a problem that needs to be solved? Does this proposal solve the problem in a useful way? Is the abstraction clear? Is this just total overkill?
-Barry

I think barry's description conveys what we discussed real well. The only addition i'd make is a slightly different command arrangement.
On Wed, 24 Jun 1998, Barry A. Warsaw wrote:
I'd prefer to have a single command that could do everything - i'm thinking 'migrate', which defaults to migrating forward (upgrading). Eg:
migrate toylist
migrate -forward mailman-developers [[the "-forward" is unnecessary]]
migrate -all [[migrate everything forward]]
migrate -back -all [[back off!]]
migrate -commit [[make the new version current]]
(Since the command is in mailman's bindir, i guess it could be a general name - but if there's concern, we could distinguish it with mailman's initials - 'mmmigrate' - to convey the mmm mmm delicious experience of changing a production system...-)
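For what it's worth, the single-command interface is easy to sketch with Python's standard argparse module, which accepts single-dash long options such as `-forward`. The function name `parse_migrate`, its return shape, and the error messages are assumptions made for this sketch; only the flag names come from the examples in this message.

```python
import argparse


def parse_migrate(argv, known_lists):
    """Turn a migrate command line into (action, target_lists).

    action is "forward", "back", or "commit". Forward is the default.
    """
    parser = argparse.ArgumentParser(prog="migrate")
    parser.add_argument("-forward", dest="forward", action="store_true",
                        help="migrate to the future parity (the default)")
    parser.add_argument("-back", dest="back", action="store_true",
                        help="return to the current parity")
    parser.add_argument("-all", dest="all_lists", action="store_true",
                        help="apply to every list")
    parser.add_argument("-commit", dest="commit", action="store_true",
                        help="make the new version current")
    parser.add_argument("lists", nargs="*", help="list names to migrate")
    args = parser.parse_args(argv)

    if args.forward and args.back:
        parser.error("-forward and -back are mutually exclusive")
    if args.commit:
        if args.back or args.all_lists or args.lists:
            parser.error("-commit takes no lists and no direction")
        return ("commit", [])

    targets = sorted(known_lists) if args.all_lists else args.lists
    if not targets:
        parser.error("name at least one list or pass -all")
    return ("back" if args.back else "forward", targets)


known = {"toylist", "mailman-developers"}
print(parse_migrate(["toylist"], known))
print(parse_migrate(["-back", "-all"], known))
print(parse_migrate(["-commit"], known))
```

One command with a default direction keeps the common case short (`migrate toylist`), at the cost of the intent being a little less obvious than with separate upgrade and downgrade commands.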

"KM" == Ken Manheimer <klm@cnri.reston.va.us> writes:
KM> I think barry's description conveys what we discussed real
KM> well. The only addition i'd make is a slightly different
KM> command arrangement.
I'm definitely open to suggestions here. I just wanted distinct command names so that it was clearer what the intent was.
-Barry
