Formatting incoming mail in Mailman
I just sent a message to my list from a webmail account, and got an empty text body in what was sent out from the list. Sent a second message to a user account which I read with elm to see what's in the message body.
This is all of what comes through sendmail into the mbox:
<DIV style="font-family:Arial, sans-serif; font-size:10pt;"><FONT size="2"><SPAN style="font-family: Arial,sans-serif;">send a line or two of text to see what this sob sends.<BR></SPAN></FONT></DIV>
Note that it's HTML, but with no mime type indicator of any sort.
I have Mailman content filtering for this list set to filter the content (yes), remove attachments that don't match standard mime types, collapse alternatives, convert html to text, and discard messages meeting the filtering rules.
Up to now, we've used demime in the sendmail "post" alias pipe to demunge non-plaintext posts, but I'd like to switch to Mailman internal demunging.
The above webmail formatting is the default that new subscribers to that ISP get. We need to be able to handle posts from total novices who send such stuff.
Current Mailman is 2.1.9.
What do we do?
Hank
Hank van Cleef wrote:
I just sent a message to my list from a webmail account, and got an empty text body in what was sent out from the list. Sent a second message to a user account which I read with elm to see what's in the message body.
This is all of what comes through sendmail into the mbox:
<DIV style="font-family:Arial, sans-serif; font-size:10pt;"><FONT size="2"><SPAN style="font-family: Arial,sans-serif;">send a line or two of text to see what this sob sends.<BR></SPAN></FONT></DIV>
Really? No headers at all? Or is this just some elm view of the message body?
Look at the mbox with vi and look at everything from the "From " separator through the end of the message.
Note that it's HTML, but with no mime type indicator of any sort.
I have Mailman content filtering for this list set to filter the content (yes), remove attachments that don't match standard mime types, collapse alternatives, convert html to text, and discard messages meeting the filtering rules.
Presumably you are using pass_mime_types to remove the 'non-matching' types. What's in pass_mime_types?
What is your HTML_TO_PLAIN_TEXT_COMMAND setting? Is this the problem described at <http://mail.python.org/pipermail/mailman-users/2003-January/025373.html>?
Up to now, we've used demime in the sendmail "post" alias pipe to demunge non-plaintext posts, but I'd like to switch to Mailman internal demunging.
The above webmail formatting is the default that new subscribers to that ISP get. We need to be able to handle posts from total novices who send such stuff.
Current Mailman is 2.1.9.
What do we do?
Show us the complete, raw message and we may be able to help.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
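The check suggested above, looking at everything from the "From " separator onward, can also be scripted instead of done by eye in an editor. The following is a minimal sketch, assuming Python 3 and its standard mailbox module; the function name summarize_mbox is illustrative and not part of Mailman, and the path passed to it would be whatever spool file is being inspected.

```python
import mailbox


def summarize_mbox(path):
    """Return (subject, content type, charset) for every message in an mbox file."""
    # create=False: raise an error instead of silently creating a missing file.
    box = mailbox.mbox(path, create=False)
    try:
        return [
            (msg.get("Subject"), msg.get_content_type(), msg.get_content_charset())
            for msg in box
        ]
    finally:
        box.close()
```

Running this against the spool file would show whether a Content-Type header is present even when the mail reader displays only the body.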
The esteemed Mark Sapiro has said:
Hank van Cleef wrote:
I just sent a message to my list from a webmail account, and got an empty text body in what was sent out from the list. Sent a second message to a user account which I read with elm to see what's in the message body.
This is all of what comes through sendmail into the mbox:
<DIV style="font-family:Arial, sans-serif; font-size:10pt;"><FONT size="2"><SPAN style="font-family: Arial,sans-serif;">send a line or two of text to see what this sob sends.<BR></SPAN></FONT></DIV>
Really? No headers at all? Or is this just some elm view of the message body?
Look at the mbox with vi and look at everything from the "From " separator through the end of the message.
Here's the whole message from the spool file:
From vancleef@wyoming.com Sat Feb 13 07:57:07 2010
Return-Path: <vancleef@wyoming.com>
Received: from omta0109.mta.everyone.net (imta-38.everyone.net [216.200.145.38])
    by julie.lostwells.net (8.13.8+Sun/8.13.8) with ESMTP id o1DEv6c8020571
    for <vancleef@lostwells.net>; Sat, 13 Feb 2010 07:57:06 -0700 (MST)
Received: from dm0104.mta.everyone.net (sj1-slb03-gw2 [172.16.1.96])
    by omta0109.mta.everyone.net (Postfix) with ESMTP id 33D71288402
    for <vancleef@lostwells.net>; Sat, 13 Feb 2010 06:57:06 -0800 (PST)
X-Eon-Dm: dm0104
Received: by resin18.mta.everyone.net (EON-PICKUP)
    id resin18.4b721e04.106b6; Sat, 13 Feb 2010 06:57:06 -0800
MIME-Version: 1.0
Content-Type: text/html; charset="UTF-8"
Message-Id: <20100213065706.B5C70750@resin18.mta.everyone.net>
Date: Sat, 13 Feb 2010 06:57:06 -0800
From: <vancleef@wyoming.com>
To: <vancleef@lostwells.net>
Subject: See what's in this mail
X-Eon-Sig: AQLk58tLdr3CQFXGBwEAAAAB,10e73ece43724ae87d84f3c7c2c3384c
X-Originating-Ip: 216.67.170.236
Content-Length: 199

<DIV style="font-family:Arial, sans-serif; font-size:10pt;"><FONT size="2"><SPAN style="font-family: Arial,sans-serif;">send a line or two of text to see what this sob sends.<BR></SPAN></FONT></DIV>
(EOM implicit here)
Note that it's HTML, but with no mime type indicator of any sort.
I have Mailman content filtering for this list set to filter the content (yes), remove attachments that don't match standard mime types, collapse alternatives, convert html to text, and discard messages meeting the filtering rules.
Presumably you are using pass_mime_types to remove the 'non-matching' types. What's in pass_mime_types?
(copied from the options page):
    mixed
    alternative
    text/plain
    text/html
What is your HTML_TO_PLAIN_TEXT_COMMAND setting? Is this the problem described at <http://mail.python.org/pipermail/mailman-users/2003-January/025373.html>?
This may be part of the problem, as the line in Defaults.py has not been overridden to point to where I have a lynx on the system.
I've added this line to mm_cfg.py:
    HTML_TO_PLAIN_TEXT_COMMAND = '/usr/local/bin/lynx -dump %(filename)s'
which is the Defaults.py line, changed to point to the correct directory.
Verification:
julie:vancleef:$ which lynx
/usr/local/bin/lynx
julie:vancleef:$ lynx -version
Lynx Version 2.8.5rel.1 (04 Feb 2004)
Built on solaris2.9 Dec 20 2006 19:54:22
julie:vancleef:$ ls -l /usr/local/bin/lynx
-rwxr-xr-x 1 root other 1892964 Dec 20 2006 /usr/local/bin/lynx
Checking logs, I don't get a grep hit on "lynx" in either syslog or the Mailman logs. Checking mailman/logs/error, I have this line:
Feb 13 07:52:07 2010 (296) HTML->text/plain error: 256
This matches the timestamp on the incoming webmail post.
That build of lynx isn't used anywhere else on my system. Are there any other install considerations needed? The e-mail you referenced flagged lynx as a potential problem, but seems to be primarily about a Linux distro and /tmp.
I've restarted Mailman with the new line in mm_cfg.py, and will retest.
Hank
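The value 256 in the logged error can be decoded. This is a sketch under the assumption that the number Mailman logs is the raw status returned when the pipe to the converter command is closed, which on Unix packs the exit code into the high byte. The helper names and the example file path below are hypothetical and are not Mailman functions or settings.

```python
def decode_wait_status(status):
    """Split a Unix wait status into (exit_code, terminating_signal)."""
    exit_code = (status >> 8) & 0xFF   # high byte: exit code of the command
    signal_number = status & 0x7F      # low 7 bits: signal that killed it, if any
    return exit_code, signal_number


def build_command(template, filename):
    """Fill the %(filename)s placeholder in the command template."""
    return template % {"filename": filename}


print(decode_wait_status(256))
# /tmp/example.html is a made-up file name used only for illustration.
print(build_command("/usr/local/bin/lynx -dump %(filename)s", "/tmp/example.html"))
```

Under that assumption, 256 corresponds to exit code 1 with no signal, so the command returned a failure code rather than being killed. That alone does not say why it failed; running the built command by hand as the Mailman user may show the actual error message.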
Hank van Cleef wrote:
The esteemed Mark Sapiro has said:
Presumably you are using pass_mime_types to remove the 'non-matching' types. What's in pass_mime_types?
(copied from the options page):
    mixed
    alternative
    text/plain
    text/html
The above is not correct. If you have a single word as a pass_mime_types (or filter_mime_types) entry, it is the main type, not the subtype. So, if you want to pass multipart/mixed and multipart/alternative, that's what you need to put.
I recommend however just putting
    multipart
    text/plain
    text/html
in pass_mime_types. These, together with collapse_alternatives = Yes and convert_html_to_plaintext = Yes will ensure only plain text reaches the list and will also accept the plain text from messages that have a structure like
    multipart/related
        multipart/alternative
            text/plain
            text/html
        image/xxx
which are produced by some Microsoft MUAs and possibly others. If you don't pass multipart/related, that entire message will be filtered. If you do pass all multipart and collapse alternatives, only the text/plain part goes to the list.
You might also consider adding message/rfc822 to pass_mime_types if you want to accept plain text or converted html from an attached message.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
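The matching rule described above (a single-word entry names a main type, while a type/subtype entry names one exact type) can be illustrated with a small model. This is a simplified sketch of that rule only, not Mailman's actual filtering code; the function name is_passed is made up for the example.

```python
def is_passed(content_type, pass_mime_types):
    """Return True if content_type matches an entry by full type or by main type."""
    content_type = content_type.lower()
    main_type = content_type.split("/", 1)[0]
    entries = {entry.strip().lower() for entry in pass_mime_types}
    return content_type in entries or main_type in entries


original = ["mixed", "alternative", "text/plain", "text/html"]
recommended = ["multipart", "text/plain", "text/html"]

# Compare how each list treats a few representative part types.
for part in ("multipart/mixed", "multipart/related", "text/html", "image/jpeg"):
    print(part, is_passed(part, original), is_passed(part, recommended))
```

In this model the entries "mixed" and "alternative" never match anything, because no message part has a main type of that name, whereas "multipart" matches every multipart container.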
Mark Sapiro writes:
    multipart/related
        multipart/alternative
            text/plain
            text/html
        image/xxx
which are produced by some Microsoft MUAs and possibly others. If you don't pass multipart/related, that entire message will be filtered. If you do pass all multipart and collapse alternatives, only the text/plain part goes to the list.
This is true; however, on my lists, multipart/related is a very strong indication of image spam (using CSS to position the image over the innocuous text).
participants (3)
- Hank van Cleef
- Mark Sapiro
- Stephen J. Turnbull