Subject Lines Wrapped After Commas, (Like This?)

I've noticed that on several Mailman lists I'm running, subject lines sometimes get chopped after commas or semicolons. This doesn't always happen, though. It's not just a matter of things being chopped after the 78-or-so chars, either. Sometimes, it's after a single word.
Below are examples from a list I run on ethanol issues. The first batch are examples with commas that came through just fine. Below it are examples where the subject lines got chopped/wrapped prematurely. This always happens at commas or semicolons.
I'm running Mailman version "2.1.9.cp2" (control panel 2?).
Any guidance on how to stop this inappropriate subject line wrapping would be much appreciated.
Mike
*** THESE CAME THROUGH FINE ***
Subject: [Ethanol] Ethanol plant question,Palestine Il??
Subject: [Ethanol] Barrie, Ontario vs. proposed ethanol plant
Subject: [Ethanol] Jim Dunn vs."dumbest" PA ethanol plans. U.S. Taxpayers pay, Russians profit!
Subject: [Ethanol] Tufts' Grant Reid: ETHANOL, SNAKE OIL FOR THE 21ST CENTURY
Subject: [Ethanol] ABC, MSNBC, etc.: Not "green, " ethanol = more smog, cancer, death
*** THIS MAY HAVE BEEN CHOPPED DUE TO LENGTH ***
Subject: [Ethanol] UN warns: Biofuels risk farmers, food, environment, climate
*** THESE WERE CHOPPED AFTER COMMAS OR SEMICOLONS PREMATURELY ***
Subject: [Ethanol] Unlike U.S., Canada issues fines for railway ethanol spills
Subject: [Ethanol] Not Iraq, is 2008 U.S. election' about who profits most from ethanol?
Subject: [Ethanol] As in NY, Vinod Khosla's CA ethanol plans need "proper environmental review"
Subject: [Ethanol] Genetically Engineered Biofuels: The Next Big, Dangerous Hoax
Subject: [Ethanol] Wisconsin residents replace local officials, stop ethanol plant
Subject: [Ethanol] Dover, Wisconsin ciizens VOTE on "monster" ethanol plant April 3
Subject: [Ethanol] Cambria, Wisconsin CITIZENS vote on a Didion ethanol plant April 3
Subject: [Ethanol] PA officials snub NC's ethanol-backing, job-hunting Bill Martin
Subject: [Ethanol] Leaving E85 ethanol "mess" in NC, Bill Martin job-hunts in PA
Subject: [Ethanol] PA newspaper wins top award; blew whistle on secret ethanol study
Subject: [Ethanol] NY drops ethanol from energy agenda; favors wind, solar power

Mike Ewall wrote:
> I've noticed that on several Mailman lists I'm running, subject lines sometimes get chopped after commas or semicolons.
"Chopped" seems to be the wrong term. Long subjects aren't "chopped"; they are "folded". See sec. 2.2.3 of RFC 2822 <http://www.faqs.org/rfcs/rfc2822.html> for an explanation of this process.
> This doesn't always happen, though. It's not just a matter of things being chopped after the 78-or-so chars, either. Sometimes, it's after a single word.
> Below are examples from a list I run on ethanol issues. The first batch are examples with commas that came through just fine. Below it are examples where the subject lines got chopped/wrapped prematurely. This always happens at commas or semicolons.
This is because the RFC recommends "folding SHOULD be limited to placing the CRLF at higher-level syntactic breaks" and the folding process is preferring to fold following a comma or semicolon if possible.
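The comma preference can be reproduced with the `email.header` module in current Python 3. This is only a sketch: it assumes a modern standard library, which is not necessarily what a cPanel Mailman 2.1.9 install runs on, and the helper names `fold_subject` and `unfold` are made up for illustration. `Header.encode()` takes a `splitchars` argument whose default lists `;` and `,` ahead of whitespace; restricting it to whitespace removes the preference.

```python
from email.header import Header

SUBJECT = ('[Ethanol] As in NY, Vinod Khosla\'s CA ethanol plans need '
           '"proper environmental review"')

def fold_subject(text, splitchars=";, \t"):
    """Fold an ASCII subject with email.header.Header (hypothetical helper).

    header_name is passed so that the length of "Subject: " counts
    against the first line; 78 is the line length RFC 2822 recommends.
    """
    header = Header(text, header_name="Subject", maxlinelen=78)
    return header.encode(splitchars=splitchars)

def unfold(folded):
    """Undo the fold by removing each line break that precedes whitespace."""
    return folded.replace("\n ", " ").replace("\n\t", "\t")

# Default split characters: ';' and ',' are tried before plain whitespace.
comma_first = fold_subject(SUBJECT)
# Whitespace only: commas get no special preference.
space_only = fold_subject(SUBJECT, splitchars=" \t")

for label, folded in (("default", comma_first), ("whitespace only", space_only)):
    print(label)
    for line in folded.split("\n"):
        print("   ", repr(line))
```

With the default split characters the first line should end at the comma after "NY", although much more would fit; with whitespace only, the first line runs closer to the 78-character limit. Both results unfold to the same subject.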
> I'm running Mailman version "2.1.9.cp2" (control panel 2?).
This is a cPanel Mailman. See <http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq06.011.htp>. I'm not sure if the cPanel behavior is any different from standard Mailman in this respect. Both will fold long subjects or subjects that become long as a result of adding the subject_prefix. See below for more.
> Any guidance on how to stop this inappropriate subject line wrapping would be much appreciated.
It's not inappropriate. It's a standard. The recipient's MUA should be properly unfolding the folded subject for display.
<snip>
> *** THESE WERE CHOPPED AFTER COMMAS OR SEMICOLONS PREMATURELY ***
> Subject: [Ethanol] Unlike U.S., Canada issues fines for railway ethanol spills
This could have been folded as
Subject: [Ethanol] Unlike U.S., Canada issues fines for railway ethanol spills
or as
Subject: [Ethanol] Unlike U.S., Canada issues fines for railway ethanol spills
I don't know if you'd like either of these better. I also don't know if folding at the comma in this case is unique to cPanel or not. In any case, it is done by the underlying Python email library, and all three of the above folded subjects should unfold to essentially the same thing (i.e., the MUA should remove the inserted <cr><lf><tab>)
-- 
Mark Sapiro <msapiro@value.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

Mark Sapiro writes:
> In any case, it is done by the underlying Python email library, and all three of the above folded subjects should unfold to essentially the same thing (i.e., the MUA should remove the inserted <cr><lf><tab>)
AFAIK the standard implies but does not say that a single CRLF is to be inserted before "folding white space", and that the proper way to unfold is to simply remove the CRLF. It's not obvious to me that using a TAB here is a good idea; I would prefer a single space.
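A minimal sketch of that strict reading of unfolding, in Python. The fold points below are hypothetical (they are not taken from real messages), and `unfold` is an illustrative helper, not a Mailman or email-package function. It removes only the line break and keeps the whitespace that follows it.

```python
import re

# A line break immediately followed by a space or tab marks a fold.
_FOLD = re.compile(r"\r?\n(?=[ \t])")

def unfold(raw_value):
    """Strict unfolding: delete the line break, keep the whitespace after it."""
    return _FOLD.sub("", raw_value)

ORIGINAL = "[Ethanol] Unlike U.S., Canada issues fines for railway ethanol spills"

# Three hypothetical fold points for the same subject (CRLF before a space).
folds = [
    "[Ethanol] Unlike U.S.,\r\n Canada issues fines for railway ethanol spills",
    "[Ethanol] Unlike U.S., Canada issues fines for\r\n railway ethanol spills",
    "[Ethanol]\r\n Unlike U.S., Canada issues fines for railway ethanol spills",
]
unfolded = [unfold(f) for f in folds]

# Same subject, assuming the space at the fold was replaced by a TAB.
tab_fold = "[Ethanol] Unlike U.S.,\r\n\tCanada issues fines for railway ethanol spills"
tab_unfolded = unfold(tab_fold)

print(unfolded[0])
print(repr(tab_unfolded))
```

All three space-folded variants come back as the original subject. The TAB variant comes back with a literal tab where the space used to be, which is the case where MUAs are left to decide for themselves what to display.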
Again AFAIK, only in the case of abutting MIME encoded words does the standard explicitly say what to do with whitespace other than the CRLF. In that case, all intervening whitespace is to be removed (i.e., any whitespace that should remain must be encoded in the encoded words). That's obviously not appropriate for receiving MUAs to do in non-RFC 2047 cases.
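The encoded-word case can be illustrated with `decode_header` and `make_header` from Python 3's `email.header`. This is a sketch that assumes a current standard library; the header value is invented for the example.

```python
from email.header import decode_header, make_header

# Two RFC 2047 encoded words separated only by a fold (CRLF + space).
# In "Q" encoding an underscore stands for a space, so the space between
# "Hello" and "World" travels inside the second encoded word.
raw = "=?utf-8?q?Hello?=\r\n =?utf-8?q?_World?="

decoded = str(make_header(decode_header(raw)))
print(repr(decoded))
```

The fold and the space after it sit between two encoded words, so they are dropped on decoding; the only space in the result is the one carried inside the second encoded word.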
Unfortunately, there is no agreement whatsoever among MUAs on how to deal with the ambiguity in "the standard as MUA-writers understand it".
The bottom line is that I have a lot of sympathy with the OP; Eudora is not at all out of line with the degree of variation in the sample of MUAs I'm familiar with (many do not remove the CRLF, I suppose on the grounds that the sending MUA or user often intends a "sequence of lines" semantics). However, I don't see what Mailman can do except to pick a particular interpretation of the treatment of folding, and stick to it. The one used by Python's email library is certainly reasonable.
> This is because the RFC recommends "folding SHOULD be limited to placing the CRLF at higher-level syntactic breaks" and the folding process is preferring to fold following a comma or semicolon if possible.
There's no support in the RFC for this. The "syntactic breaks" referred to are those of the RFC's own header syntax, and the Subject field has none, not even comments. See RFC 2822 sections 2.2.1 and 3.6.5.
I would argue that Mailman's algorithm is bogus, as it violates POLA. At the very least the subject header should be presumed to be prose to be broken into approximately equal lines.
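One way to read "approximately equal lines" is sketched below in Python. `fold_balanced` is a hypothetical helper written for illustration, not anything in Mailman or the email package. It assumes an ASCII subject with single spaces between words, and it picks the space nearest to each ideal break position.

```python
import math

SUBJECT = ('[Ethanol] As in NY, Vinod Khosla\'s CA ethanol plans need '
           '"proper environmental review"')

def fold_balanced(subject, header_name="Subject", maxlen=78):
    """Fold at spaces only, aiming for lines of roughly equal length.

    Hypothetical helper for illustration; not part of Mailman or of
    Python's email package. Assumes single spaces between words.
    """
    prefix_len = len(header_name) + 2            # room taken by "Subject: "
    total = prefix_len + len(subject)
    spaces = [i for i, ch in enumerate(subject) if ch == " " and i > 0]
    n_lines = max(1, math.ceil(total / maxlen))  # fewest lines that could fit

    while n_lines <= len(spaces) + 1:
        # For each ideal break position, pick the nearest space.
        cuts = sorted({
            min(spaces, key=lambda i: abs(prefix_len + i - k * total / n_lines))
            for k in range(1, n_lines)
        })
        bounds = [0] + cuts + [len(subject)]
        lines = [subject[a:b] for a, b in zip(bounds, bounds[1:])]
        widths = [len(line) for line in lines]
        widths[0] += prefix_len                  # first line carries the field name
        if max(widths) <= maxlen:
            return "\r\n".join(lines)            # each continuation starts with a space
        n_lines += 1                             # too wide somewhere: allow one more line
    return subject                               # no usable fold points; leave as is

folded = fold_balanced(SUBJECT)
for line in folded.split("\r\n"):
    print(repr(line))
```

Because every continuation line keeps its leading space, removing the CRLFs gives back the original subject, and commas get no special treatment.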

On May 21, 2007, at 11:04 PM, Stephen J. Turnbull wrote:
> I would argue that Mailman's algorithm is bogus, as it violates POLA. At the very least the subject header should be presumed to be prose to be broken into approximately equal lines.
I agree that the current state isn't correct, but the right place to
fix this is in the email package, so the discussion really should be
moved to Python's email-sig.
-Barry

Barry Warsaw writes:
> I agree that the current state isn't correct, but the right place to
> fix this is in the email package, so the discussion really should be
> moved to Python's email-sig.
I thought about that ... but I certainly hope that people who have opinions about this will join, because this is not a standards issue at root. It's about palliative care for those with sick MUAs.<wink> So what are the symptoms we need to palliate?

On 5/22/07, Stephen J. Turnbull wrote:
> I thought about that ... but I certainly hope that people who have opinions about this will join, because this is not a standards issue at root. It's about palliative care for those with sick MUAs.<wink> So what are the symptoms we need to palliate?
The problem is that the people who can fix this problem are over on the list that Barry identified. Any discussion anywhere else is not likely to go anywhere, at least not as far as Python & Mailman are concerned.
So, get your experts to go have the appropriate discussion over there, or at least talk about the summary of the discussion.
That's your only real hope of getting anything done.
-- 
Brad Knowles <brad@shub-internet.org>, Consultant & Author
LinkedIn Profile: <http://tinyurl.com/y8kpxu>
Slides from Invited Talks: <http://tinyurl.com/tj6q4>
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

Brad Knowles writes:
> The problem is that the people who can fix this problem are over on the list that Barry identified. Any discussion anywhere else is not likely to go anywhere, at least not as far as Python & Mailman are concerned.
> So, get your experts to go have the appropriate discussion over there, or at least talk about the summary of the discussion.
This is not a problem that can be *defined* by experts, who are generally going to be using conformant, well-behaved software. Sometimes there will be issues, but they're less likely and Mailman Users are probably among the most likely to encounter large collections of users who all use software with the same issue.
Solutions should be proposed and discussed on email-sig; that's the appropriate place for them.
To all: Please, if you have an issue with Mailman's folding of headers, bring it up on the email-sig and/or the Python tracker. I don't know that something will be done soon, but those are the right places to collect reports. For tracker submissions, I suggest the phrase "email header folding" should be in the summary to make search easy (you won't find any yet, though).
You can see what's been happening with the email module recently here:
To submit a bug report, I think you need to register with SourceForge. It's not hard, and AFAICT (I've been registered since the prerelease trial) it's never been a source of spam or anything like that.