[Mailman-Users] Zest as an Archiver

Richard Barrett r.barrett at openinfo.co.uk
Wed Oct 15 13:54:59 CEST 2003


Hi Iain

On Wednesday, October 15, 2003, at 10:56  am, Iain Bapty wrote:

> Hey,
>
> I'm a 3rd year Computer Science student at UMIST in Manchester, UK 
> just starting my 3rd year project. My project is to create a new 
> archiver component for Mailman based on Zest.

Sounds like a project that could be of interest to the Mailman 
community and certainly to me.

That said, I could not find out much about Zest from the sourceforge 
project page but presumably you have better access.

> I'm posting this message to both User and Developer lists as I would 
> appreciate feedback from as many people as possible (for my 
> requirements capture stage).
>
> I have a number of questions
>
>    * What problems exist with the Pipermail Archiver?

I have contributed patches to tightly integrate HTdig with 
MM/pipermail for archive search, and MHonArc with MM/pipermail for HTML 
archive index and message page generation, so you can get some idea of 
what my interests are. These patches are no more than stop-gaps to 
provide a better Mailman-based solution pending a replacement archiver. 
But that said, you may find the installation notes associated with 
sourceforge patches #444879, #444884 and #820723 identify some of the 
deficiencies in Mailman's pipermail archiver. You can find these 
patches on sourceforge or on my own site at:

http://www.openinfo.co.uk/mailman/index.html

The archiver class structure and code are not that bad, but they are 
not that good either. The whole thing is a series of interlocked 
classes calling 
back and forth which makes code comprehension, maintenance and 
enhancement a real problem for me. I regularly trip over aspects of the 
class partitioning and if it were not written in Python (which helps my 
comprehension tremendously compared with Perl, C, C++, Java, etc) I 
would have given up a long time ago.

The interfaces to allow/facilitate integration of third party elements, 
such as a search engine or an alternative HTML page generator to 
Mailman's builtin archiver, are limited to non-existent; these elements 
must be either very loosely associated or the core pipermail/Mailman 
code has to be hacked, fairly brutally in my case because I am a clumsy 
person.

I think my primary criticism is the lack of decoupling of the elements 
that comprise the archiving facility as a whole; top level archive 
organisation and management versus archive HTML page generation for 
instance.

The per-list options for archive organisation into 
yearly/monthly/weekly/daily periods appropriate for each list is a good 
feature of pipermail and should be a design objective for its 
replacement. This should be related to my comment below on archive 
aging and related maintenance.

>    * What features would you like to see in a new Archiver?

A properly decoupled design based on a sound class structure which is 
specifically designed to allow extensibility by third party code. I am 
thinking here of being able to use sub-classing and/or registration of 
call back functions to a well defined framework to add extensions to 
the base capability. While it has its own issues, the model of 
registered callbacks handling different aspects of the transaction 
lifecycle used by Apache modules is interesting and demonstrably 
effective in allowing an open-ended and extensible solution; I am 
commenting on code organisation not implementation language: stick with 
Python.
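
As a rough illustration of the registered-callback idea, here is a
minimal sketch. It is not Mailman or pipermail code: the ArchiverHooks
class, the phase names and the render_plain_html extension are all
hypothetical, chosen only to show how third party code could plug into
a defined framework without the core being hacked.

```python
import html


class ArchiverHooks:
    """Hypothetical registry of callbacks, keyed by archiving phase."""

    # Illustrative phase names only; not taken from Mailman or pipermail.
    PHASES = ("store", "render", "index")

    def __init__(self):
        self._hooks = {phase: [] for phase in self.PHASES}

    def register(self, phase, callback):
        """Add a callback to run, in registration order, for a phase."""
        if phase not in self._hooks:
            raise ValueError("unknown phase: %r" % (phase,))
        self._hooks[phase].append(callback)

    def run(self, phase, posting):
        """Pass the posting through every callback registered for a phase.

        A callback may return a replacement posting, or None to leave
        the posting unchanged.
        """
        for callback in self._hooks.get(phase, ()):
            result = callback(posting)
            if result is not None:
                posting = result
        return posting


def render_plain_html(posting):
    """Example third party extension: a trivial HTML page generator."""
    page = "<pre>%s</pre>" % html.escape(posting["body"])
    return dict(posting, html=page)


hooks = ArchiverHooks()
hooks.register("render", render_plain_html)
archived = hooks.run("render", {"subject": "Zest", "body": "a < b"})
```

A search engine integration would register against a different phase
(the "index" phase in this sketch) and never need to know how pages
are generated.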

Full top level management of the archives, through extensions/additions 
to MM's list admin web GUI would be a win. A significant number of list 
owners are using Mailman through things like cPanel, in hosted 
environments, which deny them access to the command line. Some would 
say that migrating/making available all of MM's command line options 
through the web admin GUI would be a good thing.

A frequently requested feature which is not currently available is the ability 
to "edit" the archives to remove general cruft and/or inappropriate 
postings via the admin web GUI which, at present, can only be done 
using the command line and external editors. That said, I would want to 
see such a feature controlled on a per-list basis; some of the lists on 
a site I manage are effectively used for archiving email for legal 
purposes and such editing is thus prohibited.

A coherent strategy for handling aging of archive content and deletion 
of material based on per-list criteria would definitely be on my wish 
list.  The overall management of a list's archive is as important as 
the minutiae of archive page generation.
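
One way a per-list aging rule might be expressed, as a sketch only:
the function name, the max_age_days setting and the (id, date) pairs
are assumptions made for illustration, and nothing like this exists in
pipermail today. A setting of None stands for a list whose archive must
never be pruned.

```python
from datetime import datetime, timedelta


def postings_to_expire(postings, max_age_days, now=None):
    """Return the ids of postings older than a list's age limit.

    postings     -- iterable of (post_id, posted_at) pairs, where
                    posted_at is a naive datetime
    max_age_days -- hypothetical per-list setting; None means the
                    archive is kept forever
    now          -- reference time, defaulting to the current time
    """
    if max_age_days is None:
        return []
    if now is None:
        now = datetime.now()
    cutoff = now - timedelta(days=max_age_days)
    return [post_id for post_id, posted_at in postings
            if posted_at < cutoff]


example = [
    ("old-post", datetime(2002, 1, 1)),
    ("new-post", datetime(2003, 10, 1)),
]
expired = postings_to_expire(example, 365, now=datetime(2003, 10, 15))
```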

The handling of multipart MIME messages in the HTML archive needs to be 
improved; MHonArc has a reputation of being better than MM/pipermail in 
this respect.

Any new archiver must aim to produce a unique identifier, invariant 
over HTML archive rebuilds, for each posting so that externally held 
references to the mail archives are undisturbed by rebuilding, except 
where material has been expunged.
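
A sketch of one possible approach, assuming the identifier is derived
from the posting itself rather than from its position in the archive:
hash the list name together with the Message-ID header. The function
name, the fallback to Date/From/Subject when Message-ID is missing, and
the 16 character length are all assumptions for illustration, not
anything pipermail does.

```python
import hashlib
from email import message_from_string


def stable_post_id(raw_message, list_name):
    """Derive an identifier that does not depend on archive order.

    The same posting on the same list always yields the same id, so
    rebuilding the HTML archive cannot renumber it.
    """
    msg = message_from_string(raw_message)
    key = (msg.get("Message-ID") or "").strip()
    if not key:
        # Assumed fallback for postings lacking a Message-ID header.
        key = "|".join(
            [msg.get("Date", ""), msg.get("From", ""), msg.get("Subject", "")]
        )
    material = (list_name + "\n" + key).encode("utf-8", "replace")
    return hashlib.sha1(material).hexdigest()[:16]


raw = "Message-ID: <abc@example.org>\nSubject: hi\n\nbody\n"
post_id = stable_post_id(raw, "mailman-users")
```

A URL built from such an id would stay valid across rebuilds, and
expunging a posting would simply leave its id unused.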

I consider the private/public archive facility of MM/pipermail to be a 
'must have' feature, which must be preserved over archive search. The 
ability to change a list from private to public and vice versa without 
having to rebuild the archives is important.

Archiving must be fast and have sensible performance characteristics 
when dealing with very large archives. pipermail/MM is weak in this 
respect.

If archives are structured by period, the threading should extend 
across the period boundaries.
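
To illustrate, a minimal sketch in which threads are built over the
whole archive from reply references, and the period is treated only as
a display attribute. The dictionary keys ("id", "in_reply_to",
"period") and the build_threads function are hypothetical.

```python
def build_threads(postings):
    """Link replies to their parents regardless of archive period.

    postings -- list of dicts with an "id", an optional "in_reply_to"
                and a "period" label such as "2003-October"
    Returns (roots, children): ids that start a thread, and a mapping
    from a parent id to the ids of its direct replies.
    """
    known = {p["id"] for p in postings}
    roots = []
    children = {}
    for p in postings:
        parent = p.get("in_reply_to")
        if parent in known:
            children.setdefault(parent, []).append(p["id"])
        else:
            # No parent in the archive: this posting starts a thread.
            roots.append(p["id"])
    return roots, children


example = [
    {"id": "q1", "in_reply_to": None, "period": "2003-September"},
    {"id": "a1", "in_reply_to": "q1", "period": "2003-October"},
]
roots, children = build_threads(example)
```

Here the October reply stays attached to the September question; a
per-period index page would then link across to the parent rather than
starting a new thread.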

A number of people have asked for the ability to ask for an archived 
posting to be mailed out to them in the same manner as when it was 
originally distributed to subscribers.

There must be a lot more things I want, but I'll let you prompt for 
further input if you want it.

You could do worse than take a look through the mailman-users archives 
for the last 12 months to find a fair number of criticisms/requests 
regarding MM archiving capability but I guess you already have that in 
hand.

>    * Would you be willing for me to email you questions in the future
>      (not on the groups)?
>

Fine by me.

> Any replies are very much appreciated.
>
> Thanks a lot
>

Best of luck. Keep us posted on your progress.

> Iain Bapty
-----------------------------------------------------------------------
Richard Barrett                               http://www.openinfo.co.uk




