I have some rather big changes ready for MM2.1.5 that I wanted to describe and get your feedback on. While I have this stuff working and ready to be checked in, we will definitely need some beta testing before unleashing it on the world. I hope you'll be able to help with that. I think these changes are important enough to put into 2.1 rather than waiting for any future major release.
The first big change is the most externally visible one. I believe the current scheme for bounce processing is unusable in today's world of MyDoom sender forgeries, anti-virus front-ends on remote SMTPs and the like. On python.org we've seen many cases where people are unexpectedly getting bounce disabled, even though they receive all legitimate traffic to a mailing list. What's happening? It's simply that, while we have spam and virus defenses in place on python.org, some crap still gets through. Imagine I'm on a busy list and I forward barry@python.org through my home ISP, which has a virus and spam detector on that address. Now say that list gets 100 msgs/day and 1% of those messages are false negative spams. Each of those gets onto the list, but my ISP catches it and rejects it, which triggers a bounce, and thus my score has just been increased by 1. I only need one sneaky spam per day to get me bounce disabled, even though most of the mail is legit and gets through.
So I've implemented a revised scheme that we've talked about before, based on what I believe ezmlm does. All the bounce parameters are still in effect. However, when a member's bounce score reaches the threshold, we now send a specially prepared probe message containing a VERP'd sender with an unguessable token. When we send the probe, the member's bounce score is reset. If the probe bounces, then we disable the member and do the normal reminders. If the probe doesn't bounce, the member stays enabled and their score starts accumulating from zero again. A benefit of this rewrite is that we can include the last bouncing message in the probe as a sample, so the user can start to get a clue as to why they're getting bounce scored.
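To make the probe flow concrete, here is a minimal sketch in modern Python. It is not the actual Mailman code: the `Member` class, the function names, the threshold value, and the `listname-bounces+token@host` sender layout are all assumptions made up for illustration.

```python
import hmac
import secrets

THRESHOLD = 5.0  # hypothetical value; the real threshold is a list setting


class Member:
    """Hypothetical stand-in for a list member's bounce state."""

    def __init__(self, address):
        self.address = address
        self.score = 0.0
        self.enabled = True
        self.probe_token = None


def probe_sender(listname, host, token):
    # Assumed VERP-style layout: the token rides in the envelope sender,
    # so a bounce of the probe comes back carrying the token.
    return "%s-bounces+%s@%s" % (listname, token, host)


def register_bounce(member, threshold=THRESHOLD):
    """Score one bounce. Return a probe token if a probe should be sent."""
    member.score += 1.0
    if member.score < threshold:
        return None
    # Threshold reached: reset the score and probe instead of disabling.
    member.score = 0.0
    member.probe_token = secrets.token_hex(20)
    return member.probe_token


def probe_bounced(member, token):
    """Disable the member only if the bounce carries the matching token."""
    expected = member.probe_token
    if expected is None or not hmac.compare_digest(expected, token):
        return False
    member.enabled = False
    member.probe_token = None
    return True
```

If the probe never bounces, nothing else happens in this sketch: the member stays enabled and the score simply starts accumulating from zero again.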
This change has prompted an internal rewrite of the pending database. Previously the entire site had a single pending.pck file for all actions requiring confirmation by the user -- held subscription cancellation, subscription, unsub, and change of address confirmations, and bounce re-enable confirmations. This was a problem for several reasons, including that every list had to block on acquiring the lock for this file.
Now, each list has its own pending.pck file and while the list lock must be acquired to update this database, at least this doesn't block other lists from doing things. The upgrade script attempts to migrate the single shared pending.pck file to the individual list files, but the conversion is difficult because the associated list is not stored with most of the records in that file. I do my best, but it's possible that some pending actions may get lost.
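As a rough sketch of the per-list layout (again in modern Python, with hypothetical function names and record shape, and with locking left out on the assumption that the caller already holds the list lock), each list directory carries its own pending.pck that maps a token to a pending action:

```python
import os
import pickle
import secrets


def _path(list_dir):
    # One pending.pck per list directory instead of one for the whole site.
    return os.path.join(list_dir, "pending.pck")


def _load(list_dir):
    try:
        with open(_path(list_dir), "rb") as fp:
            return pickle.load(fp)
    except FileNotFoundError:
        return {}


def _save(list_dir, db):
    # Write to a temp file and rename, so readers never see a partial file.
    tmp = _path(list_dir) + ".tmp"
    with open(tmp, "wb") as fp:
        pickle.dump(db, fp)
    os.replace(tmp, _path(list_dir))


def pend_new(list_dir, op, data):
    """Record a pending action for this list; return its confirmation token."""
    db = _load(list_dir)
    token = secrets.token_hex(20)
    db[token] = (op, data)
    _save(list_dir, db)
    return token


def confirm(list_dir, token):
    """Remove and return the pending record, or None if the token is unknown."""
    db = _load(list_dir)
    record = db.pop(token, None)
    if record is not None:
        _save(list_dir, db)
    return record
```

Because the token is only looked up in one list's file, a record that does not say which list it belongs to cannot be placed without guessing, which is the migration difficulty described above.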
The other big change is a purely internal one, but it may affect the work flow for some admins. I've changed the qfiles file format so that only one file is used per message. Previously we had one file for the message and one file for the metadata. Now, a single pickle file is used, with the first object in the pickle being the message object and the second being the metadata dictionary. This approach has several advantages. The code is simpler, there are no opportunities for race conditions, we can't possibly have orphaned data files, and probably most importantly we now only need half the inodes we did before. In addition, I've decided to turn on fsync'ing for this new qfile all the time, so storage should be more reliable too.
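A minimal sketch of the single-file idea, in modern Python rather than the actual Mailman code: two objects are pickled back to back into one file, which is fsync'd before being moved into place. The `enqueue` and `dequeue` names and the temp-file-then-rename step are assumptions for illustration.

```python
import os
import pickle


def enqueue(path, msg, metadata):
    """Write the message and its metadata into a single queue file."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as fp:
        # First object: the message. Second object: the metadata dict.
        pickle.dump(msg, fp, pickle.HIGHEST_PROTOCOL)
        pickle.dump(metadata, fp, pickle.HIGHEST_PROTOCOL)
        # Push the data out of Python's buffer, then out of the OS cache.
        fp.flush()
        os.fsync(fp.fileno())
    # Assumed detail: rename into place so the file appears all at once.
    os.replace(tmp, path)


def dequeue(path):
    """Read both objects back in the order they were written."""
    with open(path, "rb") as fp:
        msg = pickle.load(fp)
        metadata = pickle.load(fp)
    return msg, metadata
```

With both objects in one file, there is no window in which a message exists without its metadata or the other way around.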
The downside is that I've removed the ability to set a METADATA_FORMAT. We use Python pickles and that's it. I doubt many people have been using (or were even aware of) the alternatives, although I've had the occasional bug report on them so I know that number is non-zero. The other downside for some people is that the behavior of SAVE_MSGS_AS_PICKLES=False will change. When that non-standard setting is used, we'll still write everything to a pickle file, but we'll use text pickles instead of the more efficient (but not human readable) binary pickles. Also, we'll write the message object as a pickled string instead of a pickled object. Again, this will be less efficient because we'll have to parse the message every time it's dequeued, but this option will still allow people to edit queued messages with a normal text editor, albeit less conveniently.
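To illustrate the SAVE_MSGS_AS_PICKLES=False variant, here is a hedged sketch in modern Python (hypothetical function names, not the actual Mailman code). It uses pickle protocol 0, the text protocol, and stores the message as a flat string that has to be re-parsed on the way out:

```python
import email
import io
import pickle


def dumps_editable(msg, metadata):
    """Serialize with text pickles: message as a string, then the metadata."""
    # Protocol 0 is the text protocol; larger than binary but editable.
    return pickle.dumps(msg.as_string(), 0) + pickle.dumps(metadata, 0)


def loads_editable(data):
    """Load both pickles, then parse the message text back into an object."""
    fp = io.BytesIO(data)
    text = pickle.load(fp)
    metadata = pickle.load(fp)
    # This parse happens on every dequeue, which is the efficiency cost.
    return email.message_from_string(text), metadata
```

The header text stays visible in the serialized bytes, but it is wrapped in pickle opcodes and escapes, which is why editing it by hand is possible but less convenient than editing a plain message file.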
I think this trade-off is worth it. The upgrade script will combine any existing qfiles so you won't have to clear your queue when upgrading. To be safe, you /will/ have to stop Mailman, your MTA, and your web server before upgrading (but this was always recommended practice).
I intend to commit these changes to CVS within a week and will probably release a 2.1.5 alpha. This will touch a lot of files, but it will hopefully make the system more efficient and usable. Once this is done I hope to have more time to start addressing other bugs and issues in the 2.1 branch.
Again, when everything's checked in, please test things out as much as possible, especially if you are using older Python versions. I've tested primarily with Python 2.3.3 but I was careful not to use any feature that isn't supported in Python 2.1.3. I might have missed something though.
-Barry