[Mailman-Developers] Re: [Mailman-Users] Re: MIME messages
Wed, 6 Jun 2001 22:55:54 -0700
On Wed, 6 Jun 2001 18:29:23 -0400 email@example.com (Barry A. Warsaw) wrote:
>[I've moved this discussion over to mailman-developers. -BAW]
>>>>>> "JCL" == J C Lawrence <firstname.lastname@example.org> writes:
> JCL> Ability to unroll quoted printable.
> JCL> Ability to unroll base64 encoded plain text.
> JCL> Ability to strip blocks from message parts that match
> JCL> stated patterns (eg Yahoo/MSN/Hotmail ads, corporate CYA
> JCL> statements, etc).
> JCL> Ability to filter on line length (eg hold for moderation or
> JCL> auto-discard/reject).
>Some of this will be added to mimelib when I get a chance. On the
>Also, just some quick thoughts on de-mime-ing, which also address some
>things that Chuq has brought up, re: regexp filtering,
>I'm of the opinion that regular expression predicates alone aren't
>going to cut it, and that anything more complex is just way way too
>complicated to attempt to expose through an email or web interface.
>Complexity is already our enemy, IMHO.
>So what I'm envisioning is an extensible architecture, a la the
>message pipelines, where each filter is implemented in a separate
>Python module, conforming to a particular, yet-to-be-defined API.
>Mailman will provide a bunch of canned defaults, like
>"strip-mime-leaving-only-text/plain" or "match-vbs-attachments".
>There will probably be some kind of mix-in model for describing the
>action to take when a filter module matches.
While a general and powerful mime handler would be nice, and is
probably the right thing to do in terms of the long-term
development, I think that one can get most of the benefit from a
much simpler solution. A few months ago I hacked together a mime
handler with the goal of making the stuff that comes from Outhouse,
AOL, etc. look like plain-text mail, as well as enforcing
prohibitions on posting images and other binaries. The handler is
based on the mimetools library; it discards sections with certain
mime types specified by per-list regexps, and removes multipart/*
wrappers that become redundant after the stripping. Nothing fancy,
but it has cleaned up 95% of the crap on our lists -- mostly
text/html but also the occasional image/* or application/*.
(Ripping out text/html works because so far it's always accompanied
by corresponding text/plain, except for contributions from
spammers, the deletion of which is a feature.)
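A minimal sketch of the strip-and-unwrap idea described above, for illustration only; it is not the actual patch. It assumes Python 3's standard email package in place of mimetools, and the names BANNED, strip_parts and _unwrap are made up here. A single regexp stands in for the per-list setting.

```python
import re
from email import message_from_string

# Hypothetical per-list setting: content types to discard.
BANNED = re.compile(r"text/html|image/.*|application/.*")


def _unwrap(wrapper, child):
    """Replace a one-part multipart wrapper with its only child, in place.

    The wrapper keeps its own non-MIME headers (From, Subject, ...) and
    takes over the child's Content-* headers and payload.
    """
    for name in list(wrapper.keys()):
        if name.lower().startswith("content-"):
            del wrapper[name]
    for name, value in child.items():
        if name.lower().startswith("content-"):
            wrapper[name] = value
    wrapper.set_payload(child.get_payload())


def strip_parts(msg, banned=BANNED):
    """Return msg with banned parts removed, or None if nothing is left."""
    if banned.fullmatch(msg.get_content_type()):
        return None
    if msg.get_content_maintype() != "multipart" or not msg.is_multipart():
        return msg  # a leaf part that is allowed through unchanged
    kept = [p for p in (strip_parts(part, banned) for part in msg.get_payload())
            if p is not None]
    if not kept:
        return None  # every sub-part was banned
    if len(kept) == 1:
        _unwrap(msg, kept[0])  # the multipart/* wrapper is now redundant
    else:
        msg.set_payload(kept)
    return msg


if __name__ == "__main__":
    raw = (
        "From: someone@example.com\n"
        "Subject: hi\n"
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/alternative; boundary="XX"\n'
        "\n"
        "--XX\n"
        "Content-Type: text/plain\n"
        "\n"
        "Hello, list.\n"
        "--XX\n"
        "Content-Type: text/html\n"
        "\n"
        "<html><body>Hello, list.</body></html>\n"
        "--XX--\n"
    )
    cleaned = strip_parts(message_from_string(raw))
    print("discarded" if cleaned is None else cleaned.as_string())
```

A None result means the whole message was banned (the html-only spam case), and what to do with it, discard or hold, is left to the caller.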
I was going to post the patch, but haven't gotten around to
upgrading from 2.0beta6 and porting the code.
I agree that the UI for list admins to define what to do with what
is likely to be the most challenging part of a good general-purpose
solution. The simple mime handler is useful, I think, because
although there's an awful lot that can be done with mime encoding,
the vast majority of email traffic these days comes from just a few
MUAs and is very pedestrian, mime-wise. Having a good set of
canned defaults corresponding to these common cases should work