There's an aspect of the scrubber that isn't going to work in a Mailman 3 world where we have multiple, possibly external, archivers and especially where we don't have such tight integration with Pipermail (or Pipermail at all <wink>).
We can still scrub messages of unwanted content types, but we can't save those parts on the file system and calculate a URL into Pipermail to display them.
I can think of a few ways to handle this.
The easiest thing to do, and what I will probably do in my 'death-to-pipermail' branch, is to simply scrub out the unwanted parts *after* a copy of the message is sent to the archive queue, but *before* the message is sent to the digest, usenet, and outgoing queues.
This makes sense because with a model of external archiving, those archivers may make different decisions about what should be removed from or displayed in the original message. We can still include a little blurb saying that a part was scrubbed out, and since the messages can have the pre-calculated URL to the message in one or more archivers, the user is always free to just click on the URL to see the full message, displayed with whatever policy the archiver is configured with.
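Here is a rough sketch of that ordering, using only the stdlib email package. Everything in it is an assumption for illustration: the `UNWANTED_TYPES` policy, the `scrub()` and `route()` helpers, the queue names as dictionary keys, and the `archive_urls` argument standing in for the pre-calculated archiver URLs. It is not the actual Mailman pipeline code.

```python
import copy
from email.message import EmailMessage

# Hypothetical policy: content types the list does not want to deliver.
UNWANTED_TYPES = {"text/html"}


def scrub(msg, archive_urls):
    """Return a scrubbed copy of msg; the original is left untouched."""
    scrubbed = copy.deepcopy(msg)
    for part in scrubbed.walk():
        if part.is_multipart():
            continue
        ctype = part.get_content_type()
        if ctype in UNWANTED_TYPES:
            # Replace the part with a text/plain blurb pointing at the
            # archived copy of the whole message.
            blurb = f"A {ctype} part was scrubbed; the full message is at:\n"
            blurb += "\n".join(archive_urls) + "\n"
            part.set_content(blurb)
    return scrubbed


def route(msg, archive_urls):
    """Map hypothetical queue names to the copy each queue would get."""
    # The archive queue gets the unscrubbed message first...
    queues = {"archive": msg}
    # ...and only then is the message scrubbed for everyone else.
    scrubbed = scrub(msg, archive_urls)
    for name in ("digest", "usenet", "outgoing"):
        queues[name] = scrubbed
    return queues
```

The only point that matters here is the ordering inside `route()`: the archivers see the original, and the scrubber never has to save anything or compute a per-part URL.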
Another possibility is to save the scrubbed part inside the core and provide a URL to the REST API for accessing this attachment. This can't be inserted into the scrubbed message directly though, because this would be a non-public URL to the resource, and it would have to be proxied by the web ui. We need better configuration for integrating the web ui with the core anyway (e.g. to calculate the URL to the user's options page), so this could be part of that. The interactions are trickier though, because you would then have to inform the web ui that there's a new attachment it should proxy.
The other, more elaborate option is to define an IScrubber interface, or alternatively a "primary" IArchiver, that the message can pass through, which would give it an opportunity to provide URLs for each of the parts that will be scrubbed out. This is trickier because there can really be only one such thing defined in the system. I think it would be confusing if you received a message that had something like this:

    text/html part scrubbed, view it at one of the following:
    http://example.com/attachments/foo.html
    http://example.org/some/extra/path/bar.html
    http://another.archive.example.net/whatever/baz.html

Besides, this may be nearly impossible to do without in-band communication with that external archiver, which is exactly what the RFC 5064 + message-id-hash was supposed to avoid. I think we definitely don't want to have to force such in-band communications to occur in order to scrub messages of unwanted parts.
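For the sake of discussion, such an interface might look roughly like the sketch below. The `IScrubber` name comes from the option above, but the `url_for_part()` method, its signature, and the `StaticScrubber` toy are all made up here. It uses a plain abstract base class only to stay self-contained.

```python
from abc import ABC, abstractmethod
from email.message import EmailMessage


class IScrubber(ABC):
    """Hypothetical interface: map a part about to be scrubbed to a URL."""

    @abstractmethod
    def url_for_part(self, msg, part):
        """Return a URL where `part` can be viewed, or None if unknown."""


class StaticScrubber(IScrubber):
    """Toy implementation: builds URLs from a fixed base, no archiver I/O."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def url_for_part(self, msg, part):
        filename = part.get_filename()
        if filename is None:
            return None
        return f"{self.base_url}/{filename}"
```

The toy only works because it guesses the URL from the filename. A real implementation would have to know where the archiver actually put the part, which is where the in-band communication problem comes in.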
For now, I'm going to try to implement sending an unscrubbed copy of the message to the archivers and just throwing up our hands for the copy of the message sent to the list members. The nice side-effect of this is that it makes the scrubber *way* simpler!
Any other suggestions?
Cheers, -Barry