[Mailman-Developers] MailMan modifications to support WebDav & Attachments

Bill Bumgarner bbum@codefab.com
Sat, 22 Jan 2000 01:39:46 -0500


In short, it works and I'm a happy coder... it has been 2.5 years since I
have written any Python and it is a joy to dive back in!  If anyone wants the
code, let me know and I'll drop a tarball on you-- I'm going to be tweaking it
over the next few days.  This was developed for a client project, but we
[CodeFab] convinced them that (a) using MailMan would give us a huge leg up
and (b) donating the changes back to the community would be the best possible
route to take.

As such, the implementation was geared towards solving a very specific set of
requirements for the client.  However, it was done as a relative black box:
the work was driven by the question "How would one modify MailMan to
intelligently support attachments archived to web accessible servers?" so as
to maximize the reusability and value to the community [while minimizing the
exposure of any proprietary requirements or technologies of the client].

Specifically:  MailMan now has the full set of modifications necessary to
allow a user to post a message with attachments to a MailMan controlled
mailing list.  MailMan will detect the attachments that should be decoded,
decode said attachments, rewrite the message to indicate that various
attachments have been decoded, file the decoded message and attachments to a
WebDAV server, and send the rewritten/urlified resulting message out to
Usenet/mailing-list/digest/HTTP.
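
For illustration only, the flow described above can be pictured as a chain of
small stages applied in order.  This is a minimal sketch and not the code
from the tarball: the stage names, the dictionary layout, and run_pipeline()
are all hypothetical and do not reflect MailMan's actual handler API.

```python
# Hypothetical sketch of the described flow: decode, rewrite, archive.
# None of these names come from MailMan; the "state" dict is an assumption.

def decode_attachments(state):
    """Split attachments into those left inline and those to be decoded."""
    keep, decoded = [], []
    for att in state["attachments"]:
        major = att["type"].split("/")[0]          # e.g. "image" from "image/png"
        (decoded if major in state["decode_types"] else keep).append(att)
    return dict(state, attachments=keep, decoded=decoded)

def rewrite_body(state):
    """Append one note per decoded attachment, pointing at its archive URL."""
    notes = ["[decoded attachment: %s/%s]" % (state["archive_url"], att["name"])
             for att in state["decoded"]]
    return dict(state, body="\n".join([state["body"]] + notes))

def archive(state):
    """Stand-in for filing to the archive: just record what would be filed."""
    filed = ["message"] + [att["name"] for att in state["decoded"]]
    return dict(state, filed=filed)

def run_pipeline(state, handlers):
    """Pass the state through each stage in order and return the result."""
    for handler in handlers:
        state = handler(state)
    return state

example = {
    "body": "hi",
    "attachments": [
        {"name": "pic.png", "type": "image/png"},
        {"name": "fix.patch", "type": "text/x-patch"},
    ],
    "decode_types": {"image"},                     # major types to decode
    "archive_url": "http://example.invalid/a",     # placeholder URL
}
result = run_pipeline(example, [decode_attachments, rewrite_body, archive])
```

Delivery to Usenet, the list, the digest and HTTP would follow as further
stages operating on the rewritten message.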

The end result is not pretty-- it works, it is relatively well architected,  
but there is quite a bit of room for optimization/improvement.

The one feature that is truly lacking is that the DAV'd archives are not  
nearly as friendly in structure as are the PiperMail archives.....

Generic feature list:

- ability to decode binary attachments from incoming messages;  configuration
options determine which major types should be decoded, where they should be
filed, how they should be accessed, etc.

- ability to file attachments [via WebDAV] into a repository accessible to the  
end user via HTTP

- ability to rewrite messages to reflect availability of attachments via HTTP

- use of an open protocol [WebDAV] to file/manage attachments

- ability to encapsulate message into an HTTP request made against an  
arbitrary web server / web application
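
As a rough illustration of the last item, a message can be wrapped in an HTTP
POST using nothing but the Python standard library.  This is a sketch under
stated assumptions, not the ToHTTP handler itself: the function name, the use
of the message/rfc822 content type, and the X-Mailman-List header are all
hypothetical choices made for the example.

```python
import urllib.request

def build_gateway_request(url, raw_message, list_name):
    """Wrap a raw RFC 822 message (bytes) in an HTTP POST request object.

    The header names below are illustrative assumptions; a real gateway
    would use whatever the receiving web application expects.
    """
    return urllib.request.Request(
        url,
        data=raw_message,
        method="POST",
        headers={
            "Content-Type": "message/rfc822",
            "X-Mailman-List": list_name,   # hypothetical header
        },
    )

raw = b"From: a@example.invalid\r\nSubject: demo\r\n\r\nbody\r\n"
req = build_gateway_request("http://example.invalid/gateway", raw, "demo-list")
# Sending is a separate step, e.g. urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the encapsulation step
easy to inspect without a live web server.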

The primary goal of implementation was to render a working solution as quickly  
as possible.  However, a secondary goal of great importance was to render a  
solution that would be palatable to the MailMan / Open Source development  
community and, as such, would be picked up and maintained by them ASAP--  
licensing requires that this piece of the ...

The archive doesn't have nearly the friendliness of a pipermail archive-- but
that shouldn't be *that* hard to add considering how it was originally
implemented.

---- Summary of implementation ----

Components:

	- Apache web server [version 1.3.9 -- latest stable release]

	- mod_dav WebDAV module from Greg Stein  
[http://www.webdav.org/mod_dav/ -- using latest development version from Greg's  
CVS repository.  Can likely move BACK to last stable release-- though a new  
stable release is pending shortly]

	- Python [1.5.2 -- latest stable release]

	- pyexpat module [ftp://ftp.cwi.nl/pub/jack/python/ -- version 1.3 --  
includes expat library] -- provides xml parsing toolkit used by both mod_dav  
[expat] and qp_xml/davlib [pyexpat]

	- MailMan [using latest cut from MailMan development CVS server.
Version 1.2 experimental.  Very close to 1.2 release, so it is stable.  As the
MailMan community is likely to suck in all the changes that I have made, this
will be the path of least resistance.]

	- qp_xml / davlib / httplib -- various python modules from Greg  
Stein's CVS server.  davlib had to be modified (see below)

	- mimecntl -- [version 1.3 -- Michael P Reilly --  
http://starship.python.net/~arcege/modules/] Slightly modified to deal with  
file handles more cleanly.

	- sendmail -- generic, free, totally unmodified sendmail.  Latest  
version.  Basic configuration;  no spam filtering, but configured to not allow  
relaying.  Configuration will change slightly once in production.

Assembly:

APACHE / mod_dav

	Apache is configured with mod_dav installed and enabled as a  
dynamically loadable module.   Configuration is quite straightforward; the  
directory identified as the root of all WebDAV MailMan operations has DAV  
enabled and requires password authentication to perform any operation.

PYTHON

	Straightforward python 1.5.2 installation.  Dynamic loading of modules  
is enabled [required for loading of expat module]

PYEXPAT

	Straightforward dynamic loading build of pyexpat.   For production, we
might consider moving to a statically linked version of pyexpat-- this will be
both more efficient and less prone to failure during upgrades and the like--
but will require slightly more maintenance during upgrades.

SENDMAIL

	No assembly required.   Configuration details will be documented  
elsewhere [no changes to document now-- it'll all be performance/feature tweaks  
reflective of the production environment].

MAILMAN

	The internal architecture of MailMan was relatively consistent. As  
well, it is clear that the original architects gave a great deal of thought as  
to how to create a very flexible and extendable body of code.   This proved to  
be a huge asset-- while the features required were relatively alien to the  
existing body of code, it was fairly clear how to add functionality while  
following the existing architecture.   The end result is that the architecture  
of MailMan itself is largely untouched.

	The integration of certain required features, however, was not so
clean or straightforward.  The following specific implementation requirements
caused a bit of frustration:

	- dynamically manipulating MIME messages;  it appears that very few
people have ever given thought to the value of being able to rewrite a MIME
message by replacing various parts of a message with pieces encoded
differently.   Thankfully, Michael Reilly had already thought through this
problem-- the mimecntl.py package proved to be a huge timesaver.

	- WebDAV:  DAV is an immature technology and, as such, the packages
available to the developer for manipulating DAV accessible repositories tend to
be extremely primitive.  Largely, this is due to the immature state of the XML
marketplace combined with the inherently complex nature of a typical XML
document.  Similarly, DAV requires a client and server that both implement
HTTP/1.1 -- a version of the protocol that is relatively new.  Greg Stein's
qp_xml (in conjunction with pyexpat), davlib, and httplib saved a huge amount
of time.  However, it still required a lot of work to generalize
access/control of a WebDAV server from MailMan.
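
To make the MIME rewriting problem concrete, here is a minimal sketch of the
idea: walk a multipart message, pull out the attachments whose major type is
configured for decoding, and replace each with a short text part that points
at a URL.  It deliberately uses the `email` package from the present-day
Python standard library rather than mimecntl (which is what the actual
changes use), so it should be read as an illustration of the technique, not
as the implementation.  The function name, the base URL, and the default
list of major types are assumptions.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from urllib.parse import quote

def detach_attachments(raw_bytes, base_url, decode_maintypes=("image", "application")):
    """Return (rewritten_message, {filename: decoded_bytes}).

    Leaf parts that carry a filename and whose major type is listed in
    decode_maintypes are decoded and replaced, in place, by a text/plain
    note containing the URL where the attachment would be published.
    """
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    saved = {}
    for part in msg.walk():
        if part.is_multipart():
            continue
        filename = part.get_filename()
        if filename is None or part.get_content_maintype() not in decode_maintypes:
            continue
        # get_payload(decode=True) undoes base64 / quoted-printable encoding.
        saved[filename] = part.get_payload(decode=True)
        # Drop the old Content-* headers and payload, then write the note.
        part.clear_content()
        part.set_content(
            "Attachment %s was decoded and filed at:\n%s/%s\n"
            % (filename, base_url, quote(filename))
        )
    return msg, saved

# Build a small sample message to exercise the function.
sample = EmailMessage()
sample["Subject"] = "demo"
sample.set_content("hello")
sample.add_attachment(b"\x89PNG fake", maintype="image", subtype="png",
                      filename="pic.png")
rewritten, files = detach_attachments(sample.as_bytes(),
                                      "http://example.invalid/archive")
```

The decoded bytes in `files` are what would then be filed to the archive
server, while `rewritten` is what would go out to the list.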

	Synopsis of changes/additions:

	- added a number of configuration options to various admin pages
throughout MailMan.  These allow for setting the target DAV server, advertised
DAV'd resource retrieval, whether or not attachments are decoded, which types
to decode, disabling pre-existing archival functionality, where MailMan->HTTP
gateway'd messages should be sent, etc.

	- added two handlers to the message handling pipeline (see
HandlerAPI.py in Handlers in MailMan):  Rewrite, which rewrites a MIME message,
decoding attachments as needed, and ToHTTP, which encapsulates a message and
sends it to the configured web server

	- modified existing post script such that it will instantiate a  
MimeMessage when a particular list is configured to do so.   MimeMessage is a  
new subclass of Message that encapsulates an instance of MIME_recoder from  
mimecntl package.  MIME_recoder allows for random access and recomposition of  
MIME based multipart messages.

	- modified ToArchive handler to be aware of DAV.   Added functionality
for enumerating messages received by a list and archiving each message into a
DAV enabled web server.  ToArchive will also automatically validate and create
any collections [directories] needed on the archive server to service a given
list or message.

	- added a custom subclass of davlib called davxmllib which performs
various common DAV operations through the use of well defined XML structured
DAV requests.   davxmllib also presents an error/exception interface that is
much better suited to high-level manipulation of a DAV based resource server

	- added ability to turn off pipermail [legacy archival system  
previously used in MailMan]
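
As an illustration of the "validate and create any collections" step, the
sketch below issues one WebDAV MKCOL request per path component, treating 201
(Created) as success and 405 (Method Not Allowed, which a DAV server returns
when the collection already exists) as "nothing to do".  It is written
against the present-day `http.client` module and is not davxmllib; the
function names and the example paths are assumptions made for this sketch.

```python
import base64
import http.client
from urllib.parse import quote

def basic_auth_header(user, password):
    """Build an HTTP Basic Authorization header for the DAV server."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}

def collection_paths(path):
    """List every collection from the root down to `path`, in creation order."""
    parts = [p for p in path.split("/") if p]
    return ["/" + "/".join(parts[:i]) + "/" for i in range(1, len(parts) + 1)]

def ensure_collections(conn, path, headers=None):
    """Issue MKCOL for each level of `path`; return the collections created.

    `conn` is an http.client.HTTPConnection (or anything with the same
    request()/getresponse() interface).
    """
    created = []
    for coll in collection_paths(path):
        conn.request("MKCOL", quote(coll), headers=headers or {})
        resp = conn.getresponse()
        resp.read()                       # drain the body so the connection can be reused
        if resp.status == 201:            # Created
            created.append(coll)
        elif resp.status != 405:          # 405 means the collection already exists
            raise RuntimeError("MKCOL %s failed: %s %s"
                               % (coll, resp.status, resp.reason))
    return created

# Example use (requires a reachable DAV server, so it is left commented out):
# conn = http.client.HTTPConnection("dav.example.invalid")
# ensure_collections(conn, "/lists/demo/2000-01",
#                    basic_auth_header("mailman", "secret"))
```

Creating the levels from the root downward avoids the 409 (Conflict) that a
DAV server returns when an intermediate collection is missing.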