[Mailman-Developers] Re: [Mailman-Users] Re: MIME messages

Les Niles les@2pi.org
Wed, 6 Jun 2001 22:55:54 -0700


On Wed, 6 Jun 2001 18:29:23 -0400 barry@digicool.com (Barry A. Warsaw) wrote:
>
>[I've moved this discussion over to mailman-developers.  -BAW]
>
>>>>>> "JCL" == J C Lawrence <claw@2wire.com> writes:
>
>    JCL>   Ability to unroll quoted printable.
>
>    JCL>   Ability to unroll base64 encoded plain text.
>
>    JCL>   Ability to strip blocks from message parts that match
>    JCL> stated patterns (eg Yahoo/MSN/Hotmail ads, corporate CYA
>    JCL> statements, etc).
>
>    JCL>   Ability to filter on line length (eg hold for moderation or
>    JCL>   auto-discard/reject).  
>
>Some of this will be added to mimelib when I get a chance.  On the
>...
>
>Also, just some quick thoughts on de-mime-ing, which also address some
>things that Chuq has brought up, re: regexp filtering,
>auto-discarding, etc.
>
>I'm of the opinion that regular expression predicates alone aren't
>going to cut it, and that anything more complex is just way way too
>complicated to attempt to expose through an email or web interface.
>Complexity is already our enemy, IMHO.
>
>So what I'm envisioning is an extensible architecture, a la the
>message pipelines, where each filter is implemented in a separate
>Python module, conforming to a particular, yet-to-be-defined API.
>Mailman will provide a bunch of canned defaults, like
>"strip-mime-leaving-only-text/plain" or "match-vbs-attachments".
>There will probably be some kind of mix-in model for describing the
>action to take when a filter module matches.
>...

While a general and powerful mime handler would be nice, and is
probably the right thing to do in terms of the long-term
development, I think that one can get most of the benefit from a
much simpler solution.  A few months ago I hacked together a mime
handler with the goal of making the stuff that comes from Outhouse,
AOL, etc. look like plain-text mail, as well as enforcing
prohibitions on posting images and other binaries.  The handler is
based on the mimetools library; it discards sections with certain
mime types specified by per-list regexps, and removes multipart/*
wrappers that become redundant after the stripping.  Nothing fancy,
but it has cleaned up 95% of the crap on our lists -- mostly
text/html but also the occasional image/* or application/*.
(Ripping out text/html works because so far it's always accompanied
by corresponding text/plain, except for contributions from
spammers, the deletion of which is a feature.)
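
As a rough illustration of the approach described above (not the
patch itself, which isn't shown here), the following is a minimal
sketch.  It assumes the modern standard-library "email" package
rather than mimetools, and the names STRIP_PATTERNS, strip_parts and
promote are hypothetical, made up for this example.  It drops parts
whose MIME type matches a per-list regexp and then collapses any
multipart/* wrapper left with a single child.

```python
import re
from email import message_from_string

# Hypothetical per-list setting: regexps for MIME types to discard.
STRIP_PATTERNS = [r"text/html", r"image/.*", r"application/.*"]


def promote(wrapper, child):
    """Replace a redundant multipart wrapper's content with its only child.

    Non-MIME headers on the wrapper (From, Subject, ...) are kept; the
    Content-* headers and the payload are taken from the child.
    """
    for name in list(wrapper.keys()):
        if name.lower().startswith("content-"):
            del wrapper[name]
    for name, value in child.items():
        if name.lower().startswith("content-"):
            wrapper[name] = value
    wrapper.set_payload(child.get_payload())
    return wrapper


def strip_parts(msg, patterns=STRIP_PATTERNS):
    """Return msg with unwanted parts removed, or None if nothing is left."""
    ctype = msg.get_content_type()
    # Anything that is not a real multipart container is treated as a leaf.
    if msg.get_content_maintype() != "multipart" or not msg.is_multipart():
        if any(re.fullmatch(p, ctype) for p in patterns):
            return None
        return msg
    kept = []
    for part in msg.get_payload():
        part = strip_parts(part, patterns)
        if part is not None:
            kept.append(part)
    if not kept:
        return None
    if len(kept) == 1:
        # The wrapper became redundant after stripping.
        return promote(msg, kept[0])
    msg.set_payload(kept)
    return msg


RAW = """\
From: someone@example.com
Subject: hello
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XX"

--XX
Content-Type: text/plain; charset="us-ascii"

Hi there.
--XX
Content-Type: text/html

<html><body>Hi there.</body></html>
--XX--
"""

cleaned = strip_parts(message_from_string(RAW))
if cleaned is not None:
    print(cleaned.get_content_type())
```

A return value of None means every part matched a strip pattern
(the HTML-only case mentioned above); what to do then, e.g. discard
or hold the posting, is left to the caller in this sketch.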

I was going to post the patch, but haven't gotten around to
upgrading from 2.0beta6 and porting the code.

I agree that the UI for list admins to define what to do with what
is likely to be the most challenging part of a good general-purpose
solution.  The simple mime handler is useful, I think, because
although there's an awful lot that can be done with mime encoding,
the vast majority of email traffic these days comes from just a few
MUAs and is very pedestrian, mime-wise.  Having a good set of
canned defaults corresponding to these common cases should work
pretty well.

  -les  les@2pi.org