MM3: Content filter rules
As you know, Mailman 2 can filter the content of a message before it's
forwarded on to the list membership. It can reorganize the MIME
structure of a message, based on settings for MIME type and file
extension.
The rules are fairly complex though:
- If the outer MIME type or file extension matches a filter pattern,
  the entire message is disposed of.
- If there are pass filters and the outer MIME type or file extension
  does not match a pass filter, the message is disposed of.
- If a subpart's MIME type or file extension matches a filter pattern,
  that part is disposed of.
- If there are pass filters and a subpart's MIME type or file
  extension does not match a pass filter, the subpart is disposed of.
- After all that, multipart/alternatives can be collapsed.
- After all that, text/html parts can be converted to text/plain.
- After all that, if the outer message's body is empty and it has
  exactly one subpart, the subpart "becomes" the outer part.
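As a rough illustration of the rule order above, here is a minimal Python sketch built on the standard `email` package. It is not Mailman's mime-delete code: the `cfg` keys and all function names are hypothetical, the pass-filter test follows the wording of the list above rather than Mailman's actual logic, and the collapse and HTML-conversion steps are left out.

```python
import os
from email import message_from_string


def matches(part, types, extensions):
    """True if the part's MIME type or filename extension is in the lists."""
    ctype = part.get_content_type()          # e.g. "text/html"
    maintype = part.get_content_maintype()   # e.g. "text"
    filename = part.get_filename() or ""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return ctype in types or maintype in types or (ext != "" and ext in extensions)


def is_blocked(part, cfg):
    """Filter patterns are checked first, then pass filters if any exist."""
    if matches(part, cfg["filter_types"], cfg["filter_exts"]):
        return True
    has_pass = bool(cfg["pass_types"] or cfg["pass_exts"])
    return has_pass and not matches(part, cfg["pass_types"], cfg["pass_exts"])


def prune(msg, cfg):
    """Recursively drop blocked subparts."""
    if not msg.is_multipart():
        return
    kept = [p for p in msg.get_payload() if not is_blocked(p, cfg)]
    for p in kept:
        prune(p, cfg)
    msg.set_payload(kept)


def filter_message(msg, cfg):
    """Return the filtered message, or None if the whole message is disposed of."""
    if is_blocked(msg, cfg):
        return None
    prune(msg, cfg)
    parts = msg.get_payload() if msg.is_multipart() else []
    # Last rule: a lone surviving subpart "becomes" the outer part.
    if msg.get_content_maintype() == "multipart" and len(parts) == 1:
        only = parts[0]
        for header in ("Content-Type", "Content-Transfer-Encoding"):
            del msg[header]
            if only[header] is not None:
                msg[header] = only[header]
        msg.set_payload(only.get_payload())
    return msg


# Illustrative message: a text part plus an attachment named x.exe.
raw = (
    "From: a@example.com\n"
    'Content-Type: multipart/mixed; boundary="B"\n'
    "\n"
    "--B\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--B\n"
    'Content-Type: application/octet-stream; name="x.exe"\n'
    "\n"
    "binary\n"
    "--B--\n"
)
cfg = {
    "filter_types": [],
    "filter_exts": ["exe"],
    "pass_types": ["multipart", "text/plain"],
    "pass_exts": [],
}
result = filter_message(message_from_string(raw), cfg)
print(result.get_content_type())
```

Even this reduced version needs four interacting settings, which is roughly the complexity being complained about here.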
Disposal of messages can be one of:
- Reject the message to the original author
- Forward the message to the list owner and discard
- Preserve the message in the 'bad' queue and discard
The Zen of Python says:
"Complex is better than complicated."
"If the implementation is hard to explain, it's a bad idea."
and I think the implementation is both complicated and hard to
explain, and the u/i is no model of comprehension. I'm not entirely
sure how people use this feature though.
Some of the python.org lists have pass-types allowing multipart/mixed,
multipart/alternative and text/plain, while filtering out various file
extensions.
So I'd like to solicit your input on how you use the feature, and if
you have any ideas for an approach that would be easier to understand,
more useful, or both.
Barry
On Mon, Mar 02, 2009 at 12:19:10PM -0500, Barry Warsaw wrote:
As you know, Mailman 2 can filter the content of a message before it's
forwarded on to the list membership. It can reorganize the MIME
structure of a message, based on settings for MIME type and file
extension.
The rules are fairly complex though:
[...]
and I think the implementation is both complicated and hard to
explain, and the u/i is no model of comprehension. I'm not entirely
sure how people use this feature though.
[...]
So I'd like to solicit your input on how you use the feature, and if
you have any ideas for an approach that would be easier to understand,
more useful, or both.
Ok, I'm lazy, and could never really be bothered using content-filters all that much in MM2.
I'm wondering if it would make sense to use mailcap(5) -- either the system's, or MM user's should one exist, to generate a list of MIME-Types for Mailman, and nice tickboxes alongside to select "Reject, Discard, Allow, Forward to Admins" or something similar (read-in mailcap on each load of the relevant part of the admin web-URI?/each execution of the command-line util?).
It would be grand (IMO) to have a Reject message using something like: (auto-generated) "$LISTNAME doesn't accept $MIMETYPE" plus, perhaps the opportunity to provide some more 'useful' information: "your mail client's broken", "HTML's for the web, not email", "we don't like MS Word", "stop using proprietary formats" or some other customizable message.
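A minimal sketch of such an auto-generated notice, using Python's `string.Template`, whose `$NAME` placeholders match the syntax above. The function name and the `extra_note` parameter are hypothetical and not part of Mailman.

```python
from string import Template

# Default text, taken from the wording suggested above.
DEFAULT_REJECTION = Template("$LISTNAME doesn't accept $MIMETYPE")


def rejection_notice(listname, mimetype, extra_note=""):
    """Build the notice and append an optional list-specific explanation."""
    text = DEFAULT_REJECTION.substitute(LISTNAME=listname, MIMETYPE=mimetype)
    if extra_note:
        text += "\n\n" + extra_note
    return text


print(rejection_notice("example-list", "application/msword",
                       "we don't like MS Word"))
```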
Maybe I should draw what I'd like...
Perhaps (shudder), allowing an over-ride to let *some* users use a specified content, whilst rejecting for others...
Have I over-complicated things? Or am I barking mad?
On Mar 2, 2009, at 12:39 PM, Adam McGreggor wrote:
I'm wondering if it would make sense to use mailcap(5) -- either the system's, or MM user's should one exist, to generate a list of MIME-Types for Mailman, and nice tickboxes alongside to select
"Reject, Discard, Allow, Forward to Admins" or something similar (read-in
mailcap on each load of the relevant part of the admin web-URI?/each execution of the command-line util?).
Python has a mailcap module, so that would make the most sense.
It would be grand (IMO) to have a Reject message using something like: (auto-generated) "$LISTNAME doesn't accept $MIMETYPE" plus, perhaps
the opportunity to provide some more 'useful' information: "your mail client's broken", "HTML's for the web, not email", "we don't like MS Word", "stop using proprietary formats" or some other customizable message
This isn't an area I've addressed yet, but customizable messages
need to be thought about, especially when multilingual rejection
messages are considered.
Maybe I should draw what I'd like...
Perhaps (shudder), allowing an over-ride to let *some* users use a specified content, whilst rejecting for others...
Have I over-complicated things? Or am I barking mad?
I'm not sure per-user filters are feasible. I'm also not sure you
want to see 35 or more checkboxes on the whitelist/blacklist page.
That's how many entries are in mailcap.getcaps() on OS X.
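To make the idea concrete, here is a small sketch that lists the MIME types named in mailcap-format text and pairs each with a default action. Python's standard `mailcap` module, referred to above, was deprecated in Python 3.11 and removed in 3.13, so the sketch parses the text directly. The function names, the sample entries and the default action are illustrative assumptions.

```python
ACTIONS = ("Reject", "Discard", "Allow", "Forward to Admins")


def mailcap_types(text):
    """Return the sorted, unique MIME types named in mailcap-format text."""
    types = set()
    # A trailing backslash continues an entry on the next line.
    joined = text.replace("\\\n", "")
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # The first ';'-separated field of an entry is the content type.
        ctype = line.split(";", 1)[0].strip().lower()
        if "/" not in ctype:
            ctype += "/*"  # a bare "text" is shorthand for "text/*"
        types.add(ctype)
    return sorted(types)


def default_settings(types, action="Allow"):
    """One setting per type, i.e. one row of tickboxes per mailcap entry."""
    return {t: action for t in types}


sample = (
    "# sample entries\n"
    "text/html; lynx -dump %s; copiousoutput\n"
    "image/*; display %s\n"
    "application/pdf; \\\n"
    "  xpdf %s\n"
)
settings = default_settings(mailcap_types(sample))
print(len(settings), "types")
```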
Barry
On Mon, Mar 02, 2009 at 03:51:51PM -0500, Barry Warsaw wrote:
On Mar 2, 2009, at 12:39 PM, Adam McGreggor wrote:
I'm wondering if it would make sense to use mailcap(5) -- either the system's, or MM user's should one exist, to generate a list of MIME-Types for Mailman, and nice tickboxes alongside to select
"Reject, Discard, Allow, Forward to Admins" or something similar (read-in
mailcap on each load of the relevant part of the admin web-URI?/each execution of the command-line util?).
Python has a mailcap module, so that would make the most sense.
Ah-ha ;)
It would be grand (IMO) to have a Reject message using something like: (auto-generated) "$LISTNAME doesn't accept $MIMETYPE" plus, perhaps
the opportunity to provide some more 'useful' information: "your mail client's broken", "HTML's for the web, not email", "we don't like MS Word", "stop using proprietary formats" or some other customizable message
This isn't an area I've addressed yet, but customizable messages
need to be thought about, especially when multilingual rejection
messages are considered.
I'd not thought about that, but presumably, that's something that can be addressed via the standard "stanza-in-a-po-file" method?
I've not got any multi-lingual lists running (yet), and am a bit in the dark here (although, I rejoined the users/devs lists over an i18n issue, oddly enough).
Maybe I should draw what I'd like...
Perhaps (shudder), allowing an over-ride to let *some* users use a specified content, whilst rejecting for others...
Have I over-complicated things? Or am I barking mad?
I'm not sure per-user filters are feasible. I'm also not sure you
want to see 35 or more checkboxes on the whitelist/blacklist page.
Fair point that, yes. Perhaps not in a listing per the current set-up, but perhaps some sort of 'advanced/more' settings giving the options? I dunno. I could see DB hackery becoming quite messy (particularly if allowing given email addresses different settings for different lists)
On Mar 3, 2009, at 1:46 PM, Adam McGreggor wrote:
This isn't an area I've addressed yet, but customizable messages need to be thought about, especially when multilingual rejection messages are considered.
I'd not thought about that, but presumably, that's something that
can be addressed via the standard "stanza-in-a-po-file" method?
I've not got any multi-lingual lists running (yet), and am a bit in
the dark here (although, I rejoined the users/devs lists over an i18n issue, oddly enough)
The thing is, I think Mailman has a pretty good story for multilingual
system messages, such as canned strings in the source. We're okay,
but will do better with the web templates. But it's the strings that
come from users that I think need to be handled. For example, say you
wanted a list description for your French list in both French and
English, right now you can't do that. I'd like for it to be possible
to set those kinds of messages in multiple languages.
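One way to picture this is a per-language mapping for each user-supplied string, with a fallback language. The sketch below is purely hypothetical: the function name, the fallback rule and the sample descriptions are illustrative, not an existing Mailman data model.

```python
def localized(strings, language, fallback="en"):
    """Pick the text for `language`, else the fallback language, else ''."""
    if language in strings:
        return strings[language]
    return strings.get(fallback, "")


# A list description supplied by the list owner in two languages.
description = {
    "fr": "Discussion sur l'organisation de la liste",
    "en": "Discussion about running the list",
}

print(localized(description, "fr"))
print(localized(description, "de"))  # no German text, so falls back to English
```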
Maybe I should draw what I'd like...
Perhaps (shudder), allowing an over-ride to let *some* users use a specified content, whilst rejecting for others...
Have I over-complicated things? Or am I barking mad?
I'm not sure per-user filters are feasible. I'm also not sure you want to see 35 or more checkboxes on the whitelist/blacklist page.
Fair point that, yes. Perhaps not in a listing per the current set-up, but perhaps some sort of 'advanced/more' settings giving the
options? I dunno. I could see DB hackery becoming quite messy (particularly if allowing given email addresses different settings for different lists)
I've thought more about this, and I've also gotten the current module
working again. Ultimately I think fleshing out the plugin
architecture will be the right thing here, and then I can migrate the
current mime-delete module into a plugin (well, a built-in plugin).
It should be easy for others to write handlers that can do MIME filtering
in different ways.
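What such a plugin arrangement might look like, in the simplest terms, is a registry of named handlers that each accept or refuse a message. This is only a sketch under assumed names (`HANDLERS`, `register_handler`, `run_handlers`); it does not describe the plugin architecture Mailman adopted.

```python
from email.message import EmailMessage

# Registry mapping a handler name to a callable(msg) -> bool.
HANDLERS = {}


def register_handler(name):
    """Decorator that registers a content-filter handler under `name`."""
    def decorator(func):
        HANDLERS[name] = func
        return func
    return decorator


@register_handler("plain-text-only")
def plain_text_only(msg):
    """Accept only messages whose outer type is text/plain."""
    return msg.get_content_type() == "text/plain"


def run_handlers(msg, names):
    """A message is accepted only if every configured handler accepts it."""
    return all(HANDLERS[name](msg) for name in names)


plain = EmailMessage()
plain.set_content("hi")
html = EmailMessage()
html.set_content("<p>hi</p>", subtype="html")

print(run_handlers(plain, ["plain-text-only"]))
print(run_handlers(html, ["plain-text-only"]))
```

A different filtering policy would then be one more registered function rather than more options on the existing module.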
Barry
On Tue, Mar 03, 2009 at 10:03:35PM -0500, Barry Warsaw wrote:
For example, say you wanted a list description for your French list in both French and English, right now you can't do that. I'd like for it to be possible to set those kinds of messages in multiple languages.
That would rock! Please let's have that, yes.
Cheers,
Cristóbal Palmer
ibiblio.org systems administrator
cdla.unc.edu research assistant
Barry Warsaw wrote:
So I'd like to solicit your input on how you use the feature, and if
you have any ideas for an approach that would be easier to understand,
more useful, or both.
My typical list is set up as follows to allow plain text only. List mail is not signed so I don't have that issue.
filter_content -> Yes
filter_mime_types -> empty
pass_mime_types
    multipart
    message/rfc822
    text/plain
    text/html
filter_filename_extensions -> default, but irrelevant
pass_filename_extensions -> empty
collapse_alternatives -> Yes
convert_html_to_plaintext -> Yes
filter_action -> Reject
I have one list which is used to discuss the planning for an annual event (century bicycle ride) which we run as a fundraiser. Since it is not possible to get all the people involved to understand that there are alternatives to attaching spreadsheets and word processing documents, this list is set up as above except that pass_mime_types is set to
    multipart
    message/rfc822
    text/plain
    text/html
    application/pdf
    application/vnd.oasis.opendocument.spreadsheet
    application/vnd.ms-excel
    application/vnd.oasis.opendocument.text
    application/vnd.openxmlformats-officedocument.wordprocessingml.document
    application/msword
This generally works except for one user's misconfigured Microsoft Outlook/Exchange that attaches a PDF as application/octet-stream. There's no good way around that (other than fixing the source). We could accept all MIME types and filter only on file name extension, but that would accept anything without a name or with a name without an extension.
In fact, given the nature of this list and its membership, I could probably accept everything and just collapse alternatives and convert HTML to plain text and it would be OK.
BTW, as an aside regarding collapse_alternatives, I have seen on a non-Mailman list, non-compliant posts from a Lotus Notes user that have the text/html alternative preceding the text/plain alternative in a multipart/alternative part. I don't know what you do about that...
As far as ideas for improvement go, I don't know if anyone actually uses filter_mime_types. It seems best to "whitelist" what you want rather than trying to "blacklist" what you don't want. I think we could probably do without filter_mime_types.
The other confusing point for some users is they have to allow various multipart/* types in order to allow the sub-parts they want. Possibly we could do something where you just specify the elemental content types you want to allow, and we examine all multipart parts implicitly and accept those elemental sub-parts that are allowed.
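A sketch of that last suggestion, using the standard `email` package: only leaf ("elemental") types are configured, and container parts (multipart/* and message/rfc822) are walked implicitly. The function name and the sample message are illustrative, not existing Mailman code.

```python
from email import message_from_string


def keep_allowed_leaves(msg, allowed):
    """Keep only leaf parts whose type is in `allowed`.

    Containers are never listed in `allowed`; one survives only if at
    least one of its descendants does. Returns None if nothing is left.
    """
    if msg.is_multipart():
        kept = []
        for part in msg.get_payload():
            result = keep_allowed_leaves(part, allowed)
            if result is not None:
                kept.append(result)
        if not kept:
            return None
        msg.set_payload(kept)
        return msg
    return msg if msg.get_content_type() in allowed else None


def leaf_types(msg):
    """List the content types of all non-container parts."""
    return [p.get_content_type() for p in msg.walk() if not p.is_multipart()]


RAW = (
    'Content-Type: multipart/mixed; boundary="OUT"\n'
    "\n"
    "--OUT\n"
    'Content-Type: multipart/alternative; boundary="IN"\n'
    "\n"
    "--IN\n"
    "Content-Type: text/plain\n"
    "\n"
    "hi\n"
    "--IN\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hi</p>\n"
    "--IN--\n"
    "--OUT\n"
    "Content-Type: application/pdf\n"
    "\n"
    "%PDF\n"
    "--OUT--\n"
)
filtered = keep_allowed_leaves(message_from_string(RAW), {"text/plain"})
print(leaf_types(filtered))
```

With this approach the list owner never has to mention multipart/mixed or multipart/alternative at all.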
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Mark Sapiro wrote:
Barry Warsaw wrote:
So I'd like to solicit your input on how you use the feature, and if
you have any ideas for an approach that would be easier to understand,
more useful, or both.
My typical list is set up as follows to allow plain text only. List mail is not signed so I don't have that issue.
filter_content -> Yes
filter_mime_types -> empty
pass_mime_types
    multipart
    message/rfc822
    text/plain
    text/html
filter_filename_extensions -> default, but irrelevant
pass_filename_extensions -> empty
collapse_alternatives -> Yes
convert_html_to_plaintext -> Yes
filter_action -> Reject
I have one list which is used to discuss the planning for an annual event (century bicycle ride) which we run as a fundraiser. Since it is not possible to get all the people involved to understand that there are alternatives to attaching spreadsheets and word processing documents, this list is set up as above except that pass_mime_types is set to
    multipart
    message/rfc822
    text/plain
    text/html
    application/pdf
    application/vnd.oasis.opendocument.spreadsheet
    application/vnd.ms-excel
    application/vnd.oasis.opendocument.text
    application/vnd.openxmlformats-officedocument.wordprocessingml.document
    application/msword
This generally works except for one user's misconfigured Microsoft Outlook/Exchange that attaches a PDF as application/octet-stream. There's no good way around that (other than fixing the source). We could accept all MIME types and filter only on file name extension, but that would accept anything without a name or with a name without an extension.
In fact, given the nature of this list and its membership, I could probably accept everything and just collapse alternatives and convert HTML to plain text and it would be OK.
BTW, as an aside regarding collapse_alternatives, I have seen on a non-Mailman list, non-compliant posts from a Lotus Notes user that have the text/html alternative preceding the text/plain alternative in a multipart/alternative part. I don't know what you do about that...
As far as ideas for improvement go, I don't know if anyone actually uses filter_mime_types. It seems best to "whitelist" what you want rather than trying to "blacklist" what you don't want. I think we could probably do without filter_mime_types.
The other confusing point for some users is they have to allow various multipart/* types in order to allow the sub-parts they want. Possibly we could do something where you just specify the elemental content types you want to allow, and we examine all multipart parts implicitly and accept those elemental sub-parts that are allowed.
If there was a way to have the filters also work based on file extensions, that would be a definite plus. Granted, MIME types are usually right, but as you pointed out, not always. Same thing goes for file extensions. It's a messy world!
Bob
On Mar 2, 2009, at 3:22 PM, Bob Puff@NLE wrote:
If there was a way to have the filters also work based on file
extensions, that would be a definite plus. Granted, MIME types are
usually right, but as you pointed out, not always. Same thing goes
for file extensions. It's a messy world!
The current mime-delete module does work on both MIME type and file
extensions.
Barry
On Mar 2, 2009, at 2:15 PM, Mark Sapiro wrote:
My typical list is set up as follows to allow plain text only. List mail is not signed so I don't have that issue.
filter_content -> Yes
filter_mime_types -> empty
pass_mime_types
    multipart
    message/rfc822
    text/plain
    text/html
filter_filename_extensions -> default, but irrelevant
pass_filename_extensions -> empty
collapse_alternatives -> Yes
convert_html_to_plaintext -> Yes
filter_action -> Reject
It's interesting how simple it is for you to explain what you want,
but how complicated it is for you to make Mailman do it!
I have one list which is used to discuss the planning for an annual event (century bicycle ride) which we run as a fundraiser. Since it is not possible to get all the people involved to understand that there are alternatives to attaching spreadsheets and word processing documents, this list is set up as above except that pass_mime_types is set to
    multipart
    message/rfc822
    text/plain
    text/html
    application/pdf
    application/vnd.oasis.opendocument.spreadsheet
    application/vnd.ms-excel
    application/vnd.oasis.opendocument.text
    application/vnd.openxmlformats-officedocument.wordprocessingml.document
    application/msword
This generally works except for one user's misconfigured Microsoft Outlook/Exchange that attaches a PDF as application/octet-stream. There's no good way around that (other than fixing the source). We could accept all MIME types and filter only on file name extension, but that would accept anything without a name or with a name without an extension.
Touching on the plugin idea, let's say you had a generic whitelist
filter. It would seem that some level of user configurability should
be exposed in order to allow you to configure this.
In fact, given the nature of this list and its membership, I could probably accept everything and just collapse alternatives and convert HTML to plain text and it would be OK.
BTW, as an aside regarding collapse_alternatives, I have seen on a non-Mailman list, non-compliant posts from a Lotus Notes user that have the text/html alternative preceding the text/plain alternative in a multipart/alternative part. I don't know what you do about that...
I'm not sure either, except perhaps it shouldn't be called
'collapse_alternatives' but instead something like
'select_text_plain_alternative'.
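A sketch of what a 'select_text_plain_alternative' step could do: choose the text/plain part of each multipart/alternative wherever it appears, so the out-of-order posts described above still end up as plain text. The fallback to the first part when no text/plain exists is an assumption, and none of this is existing Mailman code.

```python
from email import message_from_string


def select_text_plain_alternative(msg):
    """Reduce each multipart/alternative to its text/plain part, in place."""
    if not msg.is_multipart():
        return
    for part in msg.get_payload():
        select_text_plain_alternative(part)
    if msg.get_content_type() == "multipart/alternative":
        parts = msg.get_payload()
        if parts:
            plain = [p for p in parts if p.get_content_type() == "text/plain"]
            # Assumed fallback: keep the first alternative if none is plain.
            chosen = plain[0] if plain else parts[0]
            msg.set_payload([chosen])


# Non-compliant ordering: text/html comes before text/plain.
ALT_RAW = (
    'Content-Type: multipart/alternative; boundary="ALT"\n'
    "\n"
    "--ALT\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hello</p>\n"
    "--ALT\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--ALT--\n"
)
alt_msg = message_from_string(ALT_RAW)
select_text_plain_alternative(alt_msg)
print([p.get_content_type() for p in alt_msg.get_payload()])
```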
As far as ideas for improvement go, I don't know if anyone actually uses filter_mime_types. It seems best to "whitelist" what you want rather than trying to "blacklist" what you don't want. I think we could probably do without filter_mime_types.
And probably the same with extension types? I think I agree with you!
The other confusing point for some users is they have to allow various multipart/* types in order to allow the sub-parts they want. Possibly we could do something where you just specify the elemental content types you want to allow, and we examine all multipart parts implicitly and accept those elemental sub-parts that are allowed.
This is a good idea too.
Barry
participants (5)
- Adam McGreggor
- Barry Warsaw
- Bob Puff@NLE
- Cristóbal Palmer
- Mark Sapiro