[Mailman-Developers] Stripping HTML

Bill Bumgarner bbum@codefab.com
Sat, 12 Feb 2000 12:42:11 -0500


Ricardo,

    While you are at it, add the ability to strip out other formatting-- like the
non-HTML <param>font</param> stuff you find from certain clients [it isn't HTML, is
it?].

    This is not quite as straightforward as one might expect.  In particular, you
need to be able to actually rewrite attachments in the MIME body of the message--
something that the "out of the box" Mailman cannot do.

    However, I had to do exactly that-- including re-encoding the rewritten message
back into a MIME-compliant beast-- to make Mailman rip apart messages with
attachments, file the attachments to a WebDAV server, and rewrite each attachment
such that it is text/plain with a body that has a URL pointing to the filed
attachment.
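
    As a rough illustration of the rewrite described above (not the actual Rewrite
module, which is not shown in this message), the following sketch uses the modern
Python standard-library email package. The store_attachment function, its URL, and
the demo message are hypothetical placeholders; a real version would upload the
bytes to a WebDAV server and return the resulting location.

```python
from email import message_from_string
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from urllib.parse import quote


def store_attachment(filename, data):
    """Hypothetical placeholder: pretend to file `data` and return its URL.

    A real implementation would PUT the bytes to a WebDAV server.
    """
    return "http://dav.example.invalid/attachments/" + quote(filename)


def rewrite_attachments(msg, store=store_attachment):
    """Replace every named attachment with a text/plain part holding a URL."""
    if not msg.is_multipart():
        return msg
    new_parts = []
    for part in msg.get_payload():
        if part.is_multipart():
            # Recurse into nested multipart containers.
            new_parts.append(rewrite_attachments(part, store))
            continue
        filename = part.get_filename()
        if filename is None:
            # Not an attachment with a filename: keep the part unchanged.
            new_parts.append(part)
            continue
        url = store(filename, part.get_payload(decode=True))
        new_parts.append(MIMEText("Attachment filed at: %s\n" % url, "plain"))
    msg.set_payload(new_parts)
    return msg


# Demo: a two-part message with one binary attachment.
original = MIMEMultipart()
original["Subject"] = "report"
original.attach(MIMEText("See attached.\n", "plain"))
blob = MIMEApplication(b"\x00\x01binary")
blob.add_header("Content-Disposition", "attachment", filename="report.bin")
original.attach(blob)

# Re-encode to text and parse it back to check the result is still valid MIME.
rewritten = message_from_string(rewrite_attachments(original).as_string())
```

    Passing the storage step in as a function keeps the MIME handling separate from
whatever server ends up holding the files.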

    It would be very useful to the project that started all of this if Mailman
could "clean" the messages of formatting information.  If you send me a
function/class/code snippet that does the stripping in (as someone else said)
"just the right way", I would be happy to integrate it as a configurable option in
my Rewrite module... and, of course, make the whole ball of wax available to the
community at large (including helping Barry integrate all of this stuff into the
core distribution).
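
    For illustration only, a stripping function of the kind requested here might
look like the sketch below. It is an assumption, not code from this thread: it uses
the modern Python standard-library html.parser module, the names TagStripper and
strip_markup are made up, and it makes no claim to being "just the right way". It
drops all tags and also discards the contents of <param> elements, since in
enriched-text mail those hold formatting arguments rather than message text.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect text content, dropping tags and anything inside <param>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.param_depth = 0  # > 0 while inside a <param> element

    def handle_starttag(self, tag, attrs):
        if tag == "param":
            self.param_depth += 1

    def handle_endtag(self, tag):
        if tag == "param" and self.param_depth > 0:
            self.param_depth -= 1

    def handle_data(self, data):
        if self.param_depth == 0:
            self.chunks.append(data)


def strip_markup(text):
    """Return `text` with markup tags removed (hypothetical helper)."""
    parser = TagStripper()
    parser.feed(text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.chunks)
```

    A call such as strip_markup("<b>Hello</b> world") returns the plain text with
the tags removed. Other enriched-text conventions, such as "<<" standing for a
literal "<", are not handled in this sketch.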

    thanks!
    b.bum