[Mailman-Users] Zest as an Archiver
r.barrett at openinfo.co.uk
Wed Oct 15 13:54:59 CEST 2003
On Wednesday, October 15, 2003, at 10:56 am, Iain Bapty wrote:
> I'm a 3rd year Computer Science student at UMIST in Manchester, UK
> just starting my 3rd year project. My project is to create a new
> archiver component for Mailman based on Zest.
Sounds like a project that could be of interest to the Mailman
community and certainly to me.
That said, I could not find out much about Zest from the sourceforge
project page but presumably you have better access.
> I'm posting this message to both User and Developer lists as I would
> appreciate feedback from as many people as possible (for my
> requirements capture stage).
> I have a number of questions
> * What problems exist with the Pipermail Archiver?
Having contributed patches to tightly integrate HTdig with
MM/pipermail for archive search and MHonArc with MM/pipermail for HTML
archive index and message page generation you can get some idea of what
some of my interests are. These patches are no more than stop-gaps to
provide a better Mailman-based solution pending a replacement archiver.
But that said, you may find the installation notes associated with
sourceforge patches #444879, #444884 and #820723 identify some of the
deficiencies in Mailman's pipermail archiver. You can find these
patches on sourceforge or on my own site at:
The archiver class structure and code are not that bad, but they are not
that good either. The whole thing is a series of interlocked classes calling
back and forth which makes code comprehension, maintenance and
enhancement a real problem for me. I regularly trip over aspects of the
class partitioning and if it were not written in Python (which helps my
comprehension tremendously compared with Perl, C, C++, Java, etc) I
would have given up a long time ago.
The interfaces to allow/facilitate integration of third party elements,
such as a search engine or an alternative HTML page generator to
Mailman's builtin archiver, are limited to non-existent; these elements
must be either very loosely associated or the core pipermail/Mailman
code has to be hacked, fairly brutally in my case because I am a clumsy
I think my primary criticism is the lack of decoupling of the elements
that comprise the archiving facility as a whole; top level archive
organisation and management versus archive HTML page generation for
The per-list options for archive organisation into
yearly/monthly/weekly/daily periods appropriate for each list is a good
feature of pipermail and should be a design objective for its
replacement. This should be related to my comment below on archive
aging and related
> * What features would you like to see in a new Archiver?
A properly decoupled design based on a sound class structure which is
specifically designed to allow extensibility by third party code. I am
thinking here of being able to use sub-classing and/or registration of
call back functions to a well defined framework to add extensions to
the base capability. While it has its own issues, the model of
registered callbacks handling different aspects of the transaction
lifecycle used by Apache modules is interesting and demonstrably
effective in allowing an open-ended and extensible solution; I am
commenting on code organisation not implementation language: stick with
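A minimal sketch of the callback-registration idea, in Python. The class
name ArchiverFramework, the stage names and the two example extensions are
hypothetical illustrations; none of them exist in Mailman or pipermail.

```python
import html
from collections import defaultdict


class ArchiverFramework:
    """Hypothetical framework: extensions register callbacks per lifecycle stage."""

    # Illustrative stage names only; a real design would choose its own.
    STAGES = ("receive", "index", "render")

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, stage, callback):
        """Attach a callback to one stage of the archiving lifecycle."""
        if stage not in self.STAGES:
            raise ValueError("unknown stage: %s" % stage)
        self._hooks[stage].append(callback)

    def archive(self, message):
        """Run the callbacks stage by stage, in registration order."""
        for stage in self.STAGES:
            for callback in self._hooks[stage]:
                message = callback(message)
        return message


def add_search_terms(message):
    # Stand-in for a third-party search engine extension.
    message = dict(message)
    message["terms"] = sorted(set(message["body"].lower().split()))
    return message


def render_html(message):
    # Stand-in for an alternative HTML page generator.
    message = dict(message)
    message["html"] = "<pre>%s</pre>" % html.escape(message["body"])
    return message


framework = ArchiverFramework()
framework.register("index", add_search_terms)
framework.register("render", render_html)
result = framework.archive({"subject": "Zest", "body": "Hello archive hello"})
```

The point of the sketch is that neither extension touches the framework
code; each one only registers against a published stage name.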
Full top level management of the archives, through extensions/additions
to MM's list admin web GUI would be a win. A significant number of list
owners are using Mailman through things like cPanel, in hosted
environments, which deny them access to the command line. Some would
say that migrating/making available all of MM's command line options
through the web admin GUI would be a good thing.
A frequently requested feature which is currently unavailable is the ability
to "edit" the archives to remove general cruft and/or inappropriate
postings via the admin web GUI which, at present, can only be done
using the command line and external editors. That said, I would want to
see such a feature controlled on a per-list basis; some of the lists on
a site I manage are effectively used for archiving email for legal
purposes and such editing is thus prohibited.
A coherent strategy for handling aging of archive content and deletion
of material based on per-list criteria would definitely be on my wish
list. The overall management of a list's archive is as important as
the minutiae of archive page generation.
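As a rough illustration of per-list aging criteria, the sketch below
selects the postings that fall outside a list's retention period. The
function name expired_posts and the max_age_days setting are assumptions
made for the example, not existing Mailman options.

```python
from datetime import datetime, timedelta


def expired_posts(posts, max_age_days, now):
    """Return the ids of postings older than the list's retention period.

    posts        -- iterable of (post_id, posted_datetime) pairs
    max_age_days -- hypothetical per-list setting; None means keep everything
    now          -- reference time, passed in so the check is repeatable
    """
    if max_age_days is None:
        return []
    cutoff = now - timedelta(days=max_age_days)
    return [post_id for post_id, posted in posts if posted < cutoff]


posts = [
    ("a1", datetime(2002, 1, 10)),
    ("b2", datetime(2003, 9, 30)),
    ("c3", datetime(2003, 10, 14)),
]
now = datetime(2003, 10, 15)
to_delete = expired_posts(posts, 365, now)
keep_all = expired_posts(posts, None, now)
```

A list used for legal archiving would simply set the retention period to
None, so nothing is ever selected for deletion.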
The handling of multipart MIME messages in the HTML archive needs to be
improved; MHonArc has a reputation of being better than MM/pipermail in
Any new archiver must aim to produce a unique identifier, invariant
over HTML archive rebuilds, for each posting so that externally held
references to the mail archives are undisturbed by rebuilding, except
where material has been expunged.
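One possible way to get an identifier that survives rebuilds is to derive
it from the posting itself rather than from its position in the archive.
The sketch below hashes the list name and the Message-ID header; the
function name stable_post_id, the 16-character length and the fallback for
a missing header are all assumptions, and it relies on Message-ID values
being unique within a list.

```python
import hashlib
from email import message_from_string


def stable_post_id(raw_message, list_name):
    """Derive an identifier that depends only on the posting's content."""
    msg = message_from_string(raw_message)
    message_id = (msg.get("Message-ID") or "").strip()
    # Assumption: fall back to the whole raw text when the header is missing.
    basis = message_id if message_id else raw_message
    digest = hashlib.sha1((list_name + "\n" + basis).encode("utf-8"))
    return digest.hexdigest()[:16]


raw_one = "Message-ID: <one@example.invalid>\nSubject: Zest\n\nFirst body\n"
raw_two = "Message-ID: <two@example.invalid>\nSubject: Zest\n\nSecond body\n"

first_build = stable_post_id(raw_one, "mailman-users")
rebuild = stable_post_id(raw_one, "mailman-users")
other = stable_post_id(raw_two, "mailman-users")
```

Because the sequence number assigned during a rebuild never enters the
calculation, expunging one posting does not change the identifier of any
other.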
I consider the private/public archive facility of MM/pipermail to be a
'must have' feature, which must be preserved over archive search. The
ability to change a list from private to public and vice versa without
having to rebuild the archives is important.
Archiving must be fast and have sensible performance characteristics
when dealing with very large archives. pipermail/MM is weak in this
If archives are structured by period, the threading should extend
across the period boundaries.
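A sketch of threading that ignores period boundaries: threads are built
from parent references across the whole archive, and the period is kept
only as an attribute of each posting. The dictionary layout and the
function name build_threads are hypothetical.

```python
def build_threads(posts):
    """Group posting ids by thread root, regardless of archive period.

    posts -- list of dicts with 'id', 'parent' (id or None) and 'period'
    """
    by_id = {post["id"]: post for post in posts}

    def find_root(post_id):
        seen = set()
        while True:
            parent = by_id[post_id]["parent"]
            # Stop at a true root, a missing parent, or a reference loop.
            if parent is None or parent not in by_id or parent in seen:
                return post_id
            seen.add(post_id)
            post_id = parent

    threads = {}
    for post in posts:
        threads.setdefault(find_root(post["id"]), []).append(post["id"])
    return threads


posts = [
    {"id": "m1", "parent": None, "period": "2003-September"},
    {"id": "m2", "parent": "m1", "period": "2003-October"},
    {"id": "m3", "parent": "m2", "period": "2003-October"},
    {"id": "m4", "parent": None, "period": "2003-October"},
]
threads = build_threads(posts)
```

Here the October replies m2 and m3 stay attached to the September posting
m1, while the per-period index pages can still be generated by filtering
on 'period'.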
A number of people have asked for the ability to ask for an archived
posting to be mailed out to them in the same manner as when it was
originally distributed to subscribers.
Must be a lot more things I want but I'll let you prompt for
further input if you want it.
You could do worse than take a look through the mailman-users archives
for the last 12 months to find a fair number of criticisms/requests
regarding MM archiving capability but I guess you already have that in
> * Would you be willing for me to email you questions in the future
> (not on the groups)?
Fine by me.
> Any replies are very much appreciated.
> Thanks a lot
Best of luck. Keep us posted on your progress.
> Iain Bapty
Richard Barrett http://www.openinfo.co.uk