On Mar 16, 2012, at 10:11 AM, Mark Sapiro wrote:
> There are two things going on. There is content filtering, i.e., removal from the message of parts with unwanted MIME types or filename extensions. These parts are simply removed by pipeline/mime_delete.py (which probably needs some changes ported from 2.1, aargh...).
Yeah, that's embarrassing ;). I've started down the road of adding unittests for the code in that module. You'll see the start of that land momentarily.
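For readers who have not looked at content filtering, here is a minimal sketch of the idea using only the standard library `email` package. It is not the code in pipeline/mime_delete.py; the names `BLOCKED_TYPES`, `BLOCKED_EXTENSIONS`, `is_unwanted`, `filter_parts`, and `build_example` are hypothetical and exist only for this illustration.

```python
from email.message import EmailMessage

# Hypothetical filter settings; these are not Mailman's configuration names.
BLOCKED_TYPES = {"application/x-msdownload"}
BLOCKED_EXTENSIONS = {".exe", ".scr"}


def is_unwanted(part):
    """Return True if the part has a blocked MIME type or filename extension."""
    if part.get_content_type() in BLOCKED_TYPES:
        return True
    filename = (part.get_filename() or "").lower()
    return any(filename.endswith(ext) for ext in BLOCKED_EXTENSIONS)


def filter_parts(msg):
    """Recursively drop unwanted subparts from a multipart message, in place."""
    if not msg.is_multipart():
        return
    kept = [part for part in msg.get_payload() if not is_unwanted(part)]
    msg.set_payload(kept)
    for part in kept:
        filter_parts(part)


def build_example():
    """Build a small message with one blocked and one allowed attachment."""
    msg = EmailMessage()
    msg.set_content("Hello list")
    msg.add_attachment(b"MZ", maintype="application",
                       subtype="octet-stream", filename="tool.exe")
    msg.add_attachment(b"%PDF", maintype="application",
                       subtype="pdf", filename="notes.pdf")
    return msg
```

Calling `filter_parts(build_example())` leaves the text body and the PDF in place and drops `tool.exe`. A real filter would also have to decide what to do when a multipart container ends up empty, which this sketch ignores.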
> Then there is what pipeline/scrubber.py does with the remaining message: it removes those message parts which can't be rendered well in a flat, text/plain message, stores them aside, and replaces them with links in the message. The part we can't do in MM 3 is calculate a URL to display/download them.
Yep.
>> The easiest thing to do, and what I will probably do in my 'death-to-pipermail' branch, is to simply scrub out the unwanted parts *after* a copy of the message is sent to the archive queue, but *before* the message is sent to the digest, usenet, and outgoing queues.
> I'm not sure about the *before* with respect to usenet and digest, and certainly outgoing. Currently in 2.1, we don't scrub (as opposed to content filter) non-digest deliveries unless scrub_nondigest is Yes. Maybe we should just drop that option.
> We also don't scrub messages for the MIME digest.
> I also don't think we scrub messages destined for usenet. I think we let usenet worry about that in the same way we propose to let whatever archiver is configured worry about it.
> I don't see a need to handle these differently in MM 3.
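The per-destination behavior described above can be summarized as a small decision function. This is only a sketch of one reading of the thread: the destination labels and the `should_scrub` name are hypothetical, the plain-text digest being scrubbed is an assumption inferred from the MIME digest being singled out as unscrubbed, and the archive entry reflects the MM 3 proposal rather than 2.1 behavior.

```python
# Destination names are illustrative labels, not Mailman queue identifiers.
ALWAYS_UNSCRUBBED = {"archive", "usenet", "mime_digest"}


def should_scrub(destination, scrub_nondigest=False):
    """Decide whether a copy bound for `destination` gets scrubbed.

    Content filtering is a separate, earlier step and is not modeled here.
    """
    if destination in ALWAYS_UNSCRUBBED:
        # The archiver, usenet, and MIME digest readers handle parts themselves.
        return False
    if destination == "plain_digest":
        # Assumption: a flat text digest needs flat text messages.
        return True
    if destination == "outgoing":
        # Regular deliveries are scrubbed only when the list asks for it.
        return scrub_nondigest
    raise ValueError(f"unknown destination: {destination!r}")
```

Dropping the scrub_nondigest option, as suggested above, would amount to making the "outgoing" branch return a constant.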
ISTM that the essence of the scrubber is to turn any remaining text/html parts into plain text, by various means. I think the MM2 scrubber.py module is essentially hopeless, but the basic functionality is useful. I've decided to remove the scrubber in the Pipermail-eradication branch, which will also land momentarily. I think it would be useful though to rewrite the scrubber, boil it down to its essential functionality, and add that to the appropriate spot in the pipeline.
How would you like to take a crack at that?
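As a rough illustration of that essential functionality, here is one possible way to reduce HTML to plain text with the standard library's `html.parser`. It is a sketch under simple assumptions (drop script and style content, treat a few block tags as line breaks), not a proposal for what the rewritten scrubber should do; `_TextExtractor` and `html_to_text` are hypothetical names.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()  # character references are converted by default
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in ("br", "p", "div"):
            # Treat a few common block-level tags as line breaks.
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)
```

For example, `html_to_text("<p>Hello <b>world</b></p>")` gives `"Hello world"`. Links, tables, and nested lists are flattened without any special handling, which is where a real implementation would need more thought.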
>> For now, I'm going to try to implement sending an unscrubbed copy of the message to the archivers and just throwing up our hands for the copy of the message sent to the list members. The nice side-effect of this is that it makes the scrubber *way* simpler!
> Perhaps we could keep the scrubber as is, except for modifying it to not store scrubbed parts and to put some kind of apology in the message rather than the link to the no-longer-stored content.
> Then my lp:~msapiro/mailman/scrubber-fix branch would still be relevant ;)
Yeah, sorry about that. ;) I think scrubber.py was just too nasty to salvage. Something much simpler would still be useful though.
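To make "something much simpler" concrete, here is a minimal sketch of a scrubber that stores nothing and leaves a short notice where each non-text part used to be. It uses only the standard library `email` package; the `NOTICE` wording and the `scrub_to_plain_text` name are hypothetical, and this is not the code from either branch mentioned above.

```python
from email.message import EmailMessage

# Hypothetical notice text; the real wording would be up to the list owner.
NOTICE = "[Removed attachment: {ctype}{name}]"


def scrub_to_plain_text(msg):
    """Flatten a message to text, replacing non-text/plain parts with a notice.

    Nothing is stored aside, so no URL has to be calculated.
    """
    pieces = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.get_content_type() == "text/plain":
            pieces.append(part.get_content().rstrip("\n"))
        else:
            filename = part.get_filename()
            name = f" ({filename})" if filename else ""
            pieces.append(NOTICE.format(ctype=part.get_content_type(), name=name))
    return "\n\n".join(pieces) + "\n"


def build_example():
    """Build a message with a text body and one image attachment."""
    msg = EmailMessage()
    msg.set_content("See attached.")
    msg.add_attachment(b"\x89PNG", maintype="image",
                       subtype="png", filename="shot.png")
    return msg
```

With the example message, the result is the original body followed by a line naming the removed image/png part. Note that text/plain attachments are kept inline here, which may or may not be what a list wants.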
Cheers,
-Barry