[I've moved this discussion over to mailman-developers. -BAW]
"JCL" == J C Lawrence <claw@2wire.com> writes:
JCL> Ability to unroll quoted printable.
JCL> Ability to unroll base64 encoded plain text.
JCL> Ability to strip blocks from message parts that match
JCL> stated patterns (eg Yahoo/MSN/Hotmail ads, corporate CYA
JCL> statements, etc).
JCL> Ability to filter on line length (eg hold for moderation or
JCL> auto-discard/reject).
Some of this will be added to mimelib when I get a chance. On the plus side, Python 2.1 and the future 2.2 have some nice support for all this via their unicode codecs. I.e. in the Python CVS there's now a codec for quoted printable so you can essentially say something like:
subject = msg['subject'].decode('quopri')
if your subject header is quoted as per RFC 2047. I've only played around with this stuff a little bit, to get a feel for what you can do, so I haven't thought about APIs or doing the actual coding yet.
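For comparison, here is a rough sketch of the same unrolling using modules from today's Python 3 standard library (quopri, base64, email.header) rather than the Python 2.1 codec spelling above. The function names decode_body and decode_subject are made up for illustration; they are not mimelib or Mailman APIs.

```python
import base64
import quopri
from email.header import decode_header


def decode_body(payload: bytes, encoding: str) -> bytes:
    """Undo a Content-Transfer-Encoding on a raw body (hypothetical helper)."""
    encoding = encoding.lower()
    if encoding == "quoted-printable":
        return quopri.decodestring(payload)
    if encoding == "base64":
        return base64.b64decode(payload)
    # 7bit, 8bit, binary: nothing to unroll.
    return payload


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded words in a header value (hypothetical helper)."""
    parts = []
    for chunk, charset in decode_header(raw):
        # decode_header yields bytes for encoded words, str for plain text.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)
```

Note that a header quoted per RFC 2047 carries its own charset and encoding markers, so it needs decode_header rather than a bare quoted-printable pass over the whole value.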
The downside is that I'm still targeting Mailman 2.1 at Python 2.0, so some of these features may not be available.
Also, just some quick thoughts on de-mime-ing, which also address some things that Chuq has brought up, re: regexp filtering, auto-discarding, etc.
I'm of the opinion that regular expression predicates alone aren't going to cut it, and that anything more complex is just way way too complicated to attempt to expose through an email or web interface. Complexity is already our enemy, IMHO.
So what I'm envisioning is an extensible architecture, a la the message pipelines, where each filter is implemented in a separate Python module, conforming to a particular, yet-to-be-defined API. Mailman will provide a bunch of canned defaults, like "strip-mime-leaving-only-text/plain" or "match-vbs-attachments". There will probably be some kind of mix-in model for describing the action to take when a filter module matches.
Then the admins can choose which filters they want, and what order they want to run them in. I'd actually envisioned something like this for the delivery pipeline, but that (giving admins control) has turned out to be not as necessary.
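To make the idea a bit more concrete, here's a minimal sketch of what such a filter pipeline might look like. Since the API is yet to be defined, everything here is an assumption: the action constants, the matches() method, the mix-in classes, the 200-character limit, and run_pipeline() are all hypothetical names, and it uses the modern email.message.EmailMessage class rather than mimelib.

```python
from email.message import EmailMessage

# Hypothetical action names; the real API is not defined yet.
ACCEPT, HOLD, DISCARD = "accept", "hold", "discard"


class HoldOnMatch:
    """Mix-in: hold the message for moderation when the filter matches."""
    action = HOLD


class DiscardOnMatch:
    """Mix-in: silently discard the message when the filter matches."""
    action = DISCARD


class LongLineFilter:
    """Matches when any line of the text/plain body exceeds `limit`."""
    limit = 200  # arbitrary illustrative threshold

    def matches(self, msg):
        body = msg.get_body(preferencelist=("plain",))
        if body is None:
            return False
        return any(len(line) > self.limit
                   for line in body.get_content().splitlines())


class VbsAttachmentFilter:
    """Matches when any part carries a filename ending in .vbs."""

    def matches(self, msg):
        return any((part.get_filename() or "").lower().endswith(".vbs")
                   for part in msg.walk())


# A concrete filter is a predicate combined with an action mix-in.
class HoldLongLines(HoldOnMatch, LongLineFilter):
    pass


class DiscardVbs(DiscardOnMatch, VbsAttachmentFilter):
    pass


def run_pipeline(msg, filters):
    """Run filters in the admin's chosen order; the first match decides."""
    for f in filters:
        if f.matches(msg):
            return f.action
    return ACCEPT
```

An admin's configuration would then boil down to an ordered list such as [DiscardVbs(), HoldLongLines()], passed to run_pipeline() along with each incoming message.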
Even this may turn out to be too complex, so there may be yet another level of abstraction that a site admin can glom together, so that a list admin would be presented with a limited set of `rhythms', `themes', or `styles' that they can pick and choose from. I've thought about a similar mechanism for list themes, like "announce-only list" or "read-only mirror list".
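As a rough illustration of that extra layer, a style could be nothing more than a name the site admin binds to an ordered list of filter names, so the list admin only ever picks the style. This is a sketch under assumptions: the registry layout, the style names "plain-text-only" and "permissive", and the toy string predicates are all invented here and stand in for real filter modules.

```python
from typing import Callable, Dict, List, Optional

# A toy filter is just a predicate over the message body text.
Filter = Callable[[str], bool]


def has_long_lines(body: str) -> bool:
    # 200 is an arbitrary illustrative limit.
    return any(len(line) > 200 for line in body.splitlines())


def mentions_vbs(body: str) -> bool:
    # Crude stand-in for a real attachment check.
    return ".vbs" in body.lower()


# Registry of available filters, keyed by name (hypothetical).
FILTERS: Dict[str, Filter] = {
    "long-lines": has_long_lines,
    "match-vbs-attachments": mentions_vbs,
}

# Styles the site admin gloms together from those filters (hypothetical).
STYLES: Dict[str, List[str]] = {
    "plain-text-only": ["match-vbs-attachments", "long-lines"],
    "permissive": ["match-vbs-attachments"],
}


def first_match(style: str, body: str) -> Optional[str]:
    """Return the name of the first filter in `style` that matches, if any."""
    for name in STYLES[style]:
        if FILTERS[name](body):
            return name
    return None
```

The list admin's whole interface is then a single choice among the keys of STYLES, while the Python hacker keeps full control over what goes into FILTERS.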
The point is that there's a lot we can do very cleanly at the Python level, but that the more configurability we expose at the web/email interface, the less usable it becomes, IMHO. So, I'm thinking seriously about ways to preserve the power Mailman can provide to the Python hacker while employing abstractions to reduce the cognitive load on the list administrators and users.
-Barry