[Mailman-Developers] Please Allow Me To Introduce Myself...

Barry A. Warsaw barry@zope.com
Thu, 7 Mar 2002 00:20:26 -0500


>>>>> "JJB" == James J Besemer <jb@cascade-sys.com> writes:

    >> You'd think!  I've had a couple of patches contributed that
    >> filter out HTML, but I've not been able to whip them into shape
    >> for inclusion.  I've basically given up hope for MM2.1, but
    >> will look at it again for the next release.  The problem is
    >> that the naive approach isn't difficult, but for it to be
    >> robust is much more difficult.

    JJB> When you find more time I'd appreciate some more background
    JJB> on this.

I really owe Les feedback on his patch; it's the one I've done the
most work on.  It's on my laptop, and if I can sync it up to CVS, I'll
try to restart my efforts tomorrow.

    JJB> Wanting to filter out HTML (nb. from AOL accounts) is the #1
    JJB> gripe from my users.

    JJB> The Python library has an HTML parser that I've used before
    JJB> and it works pretty well.  I used it to translate HTML to
    JJB> HTML, inserting data in various named fields.  But removal of
    JJB> the HTML is the default action of the code.  Of course you
    JJB> don't really want simply to remove it.  E.g., you'd want to
    JJB> include HREF's somehow, substitute the description for
    JJB> images, etc.
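
A minimal sketch of the naive approach JJB describes, for illustration
only.  It assumes the modern Python 3 html.parser module, not whatever
parser was in use at the time, and the names HTMLToText and html_to_text
are hypothetical; none of this is Mailman code or either contributed
patch.  It drops tags, appends each link's HREF after the link text, and
substitutes the alt text for images:

```python
from html.parser import HTMLParser


class HTMLToText(HTMLParser):
    """Hypothetical helper: strip tags, keep link targets and image alt text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []        # text fragments collected so far
        self.href_stack = []   # hrefs of the <a> elements currently open
        self.skip_depth = 0    # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif tag == "a":
            self.href_stack.append(attrs.get("href"))
        elif tag == "img":
            # Substitute the description for the image, if there is one.
            alt = attrs.get("alt")
            if alt:
                self.parts.append("[image: %s]" % alt)
        elif tag in ("br", "p", "div"):
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "a" and self.href_stack:
            # Include the HREF after the link text.
            href = self.href_stack.pop()
            if href:
                self.parts.append(" <%s>" % href)

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

    def text(self):
        return "".join(self.parts).strip()


def html_to_text(html):
    parser = HTMLToText()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = ('<p>See <a href="http://example.com/">the site</a> '
          '<img src="x.gif" alt="logo"></p>')
print(html_to_text(sample))
```

This handles only the easy cases.  It does nothing about MIME structure,
multipart/alternative messages, character sets, tables, or badly broken
markup, which is where most of the robustness work would lie.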

Here's the basic problem: there are lots of different use cases that
fall under the rubric "filtering HTML".  Some people want it stripped,
some want it transformed, some want links preserved, etc.  It's hard
to support everything everyone wants to do with HTML messages, /and/
do it in a way that's intuitive and easy to configure through the web.
I'm not saying it's impossible, but it's a lot of work, and MM2.1 has
to get to beta RSN.  Plus, I think there are viable short-term options
that don't require having this functionality in Mailman proper,
e.g. demime.

But I'll give it another look, and maybe something simple can satisfy
80% of the people out there.

-Barry