Re: [Mailman-Developers] Use of tabs when folding header lines

At 3:20 PM -0800 2005-11-28, Nathan Herring wrote:
I am a member of a list which uses mailman 2.1.2, and I am experiencing strangeness in Outlook as a result of the modifications mailman performs on list posts.
Note that the latest released version of Mailman is 2.1.6, and there have been a number of improvements made since 2.1.2. It is possible that whatever problem you are experiencing has already been fixed, but you should check the latest code.
Specifically, when it (re)folds the Subject: and/or Thread-Topic: headers, it replaces a space character in the original subject with a tab character.
Also keep in mind that Mailman uses a lot of built-in Python routines for handling stuff, and some of those routines might include the handling and formatting of e-mail messages. You want to make sure that you're using the latest version of Python that is compatible with the version of Mailman you've got. Unfortunately, while the people working on Mailman tend to be pretty well aware of the various mail-related RFCs, the people writing the code and libraries within Python itself may not be.
You should at least try to check the code to make sure where the fault lies.
Other than that, I can't provide any specific assistance to you, but I would be very curious to know what the real problem is, and where the fault lies.
-- Brad Knowles, <brad@stop.mail-abuse.org>
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania
Assembly to the Governor, November 11, 1755
SAGE member since 1995. See <http://www.sage.org/> for more info.

On 12/2/05 1:42:35 AM, "Brad Knowles" <brad@stop.mail-abuse.org> wrote:
At 3:20 PM -0800 2005-11-28, Nathan Herring wrote:
I am a member of a list which uses mailman 2.1.2, and I am experiencing strangeness in Outlook as a result of the modifications mailman performs on list posts.
Note that the latest released version of Mailman is 2.1.6, and there have been a number of improvements made since 2.1.2. It is possible that whatever problem you are experiencing has already been fixed, but you should check the latest code.
I admit to not having installed it myself, nor looked at any code (a behavior which is frowned upon by my current employer). However, I did review the archive of this list back through to the 2.1.2 release mail, and then went through the bug reports without seeing any mention of this problem or a fix that would be relevant.
Specifically, when it (re)folds the Subject: and/or Thread-Topic: headers, it replaces a space character in the original subject with a tab character.
Also keep in mind that Mailman uses a lot of built-in Python routines for handling stuff, and some of those routines might include the handling and formatting of e-mail messages. You want to make sure that you're using the latest version of Python that is compatible with the version of Mailman you've got. Unfortunately, while the people working on Mailman tend to be pretty well aware of the various mail-related RFCs, the people writing the code and libraries within Python itself may not be.
I'll try to find out what versions of the OS and Python are running on my list administratrix's server, in case that makes a difference.
You should at least try to check the code to make sure where the fault lies.
Aye, there's the rub. If I were permitted, I'd have done it already. :/
Other than that, I can't provide any specific assistance to you, but I would be very curious to know what the real problem is, and where the fault lies.
I've increased the subject length of this mail, and removed the "[Mailman-developers]" attribution, so we can see firsthand whether or not this list, which purports to be running 2.1.6, and I presume is running a suitable version of Python, has the same issue.
-nh

At 2:22 AM -0800 2005-12-02, Nathan Herring wrote:
You should at least try to check the code to make sure where the fault lies.
Aye, there's the rub. If I were permitted, I'd have done it already. :/
The code for Mailman is publicly available for download (see <http://www.list.org/download.html>), and the code for Python is available (see <http://www.python.org/download/>). Most of the people on this list should be capable of checking the code for themselves.
I looked through the various routines in Mailman, and found a case in CookHeaders.py where the white space character is set to tab, but I don't know enough about the internal implementations of the other routines or the class, so I can't be sure how that would affect this particular situation.
I've increased the subject length of this mail, and removed the "[Mailman-developers]" attribution, so we can see firsthand whether or not this list, which purports to be running 2.1.6, and I presume is running a suitable version of Python, has the same issue.
We are definitely running Mailman 2.1.6, with Python version 2.3.4.
-- Brad Knowles, <brad@stop.mail-abuse.org>

Nathan Herring wrote:
I've increased the subject length of this mail, and removed the "[Mailman-developers]" attribution, so we can see firsthand whether or not this list, which purports to be running 2.1.6, and I presume is running a suitable version of Python, has the same issue.
and Mark Sapiro previously wrote:
Brad is correct here. Mailman represents messages as instances of the Python email.Message.Message class and is at the mercy of the methods in that class as far as header folding and unfolding are concerned. And, for the record, even the Python 2.4.2 email module folds with a <tab>.
Clearly what I said above is not the whole story as Nathan's message that I received from the list had the Subject: folded with a <space>, yet other Mailman related headers in the message, namely List-Unsubscribe: and List-Subscribe: are folded with <tab>.
Upon closer examination, I see that the email.Header.Header class supports a continuation_ws argument which as Brad notes is used in CookHeaders. The prefix_subject function in CookHeaders attempts to determine the continuation-ws character from the existing header by looking at the first character of the first continuation line of the original subject header. If the header isn't continued or if the first character of the first continuation is not a <space> or <tab>, it defaults to a <tab>.
Thus it will try to preserve the continuation-ws of an already folded incoming subject, but will default to <tab>. Thus if the incoming subject is not folded, but addition of the prefix lengthens it sufficiently so it folds, it will be continued with a <tab>.
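The detection rule described above can be sketched roughly as follows. This is an illustration only, written for current Python 3, and not the actual CookHeaders code; the function name guess_continuation_ws and its default argument are made up for the example.

```python
def guess_continuation_ws(raw_subject, default="\t"):
    """Guess the folding whitespace of a raw, possibly folded header value.

    Hypothetical helper: looks at the first character of the first
    continuation line and falls back to `default` otherwise.
    """
    lines = raw_subject.splitlines()
    # A continuation line exists only if the value spans more than one line.
    if len(lines) > 1 and lines[1][:1] in (" ", "\t"):
        return lines[1][0]
    # Not folded, or folded with something other than space/tab.
    return default


print(repr(guess_continuation_ws("a subject\n folded with a space")))
print(repr(guess_continuation_ws("a subject that was never folded")))
```

With default="\t" an unfolded subject that later grows past the line limit gets a tab; passing default=" " is the change suggested further down in this message.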
BTW, this code has been in CookHeaders since 2.1.1, so I don't think anything will change in this respect between 2.1.2 and 2.1.6.
It does seem however, that given RFC 2822, the default continuation-ws character in CookHeaders should be <space> and not <tab>.
-- Mark Sapiro <msapiro@value.net>, San Francisco Bay Area, California
"The highway is for gamblers, better use your sense" - B. Dylan

On 12/2/05 6:13:20 PM, "Mark Sapiro" <msapiro@value.net> wrote:
Clearly what I said above is not the whole story as Nathan's message that I received from the list had the Subject: folded with a <space>, yet other Mailman related headers in the message, namely List-Unsubscribe: and List-Subscribe: are folded with <tab>.
The other interesting one from an Outlook or Entourage perspective is the Thread-Topic, which ends up having a tab inserted into it.
Upon closer examination, I see that the email.Header.Header class supports a continuation_ws argument which as Brad notes is used in CookHeaders. The prefix_subject function in CookHeaders attempts to determine the continuation-ws character from the existing header by looking at the first character of the first continuation line of the original subject header. If the header isn't continued or if the first character of the first continuation is not a <space> or <tab>, it defaults to a <tab>.
Scripts are a different matter from source code, so I took a look.
Thus it will try to preserve the continuation-ws of an already folded incoming subject, but will default to <tab>. Thus if the incoming subject is not folded, but addition of the prefix lengthens it sufficiently so it folds, it will be continued with a <tab>.
It seems that for other headers, like Thread-Topic, even regularly folded items have their folding space turned into a folding tab.
BTW, this code has been in CookHeaders since 2.1.1, so I don't think anything will change in this respect between 2.1.2 and 2.1.6.
It does seem however, that given RFC 2822, the default continuation-ws character in CookHeaders should be <space> and not <tab>.
From my reading, it seems that email.Header doesn't preserve the FWS in the original header as it should. It would seem that the only time the continuation-ws parameter should be used is if there are sets of characters that need to be turned into multiple adjacent RFC 2047 encoded-words, as that FWS is not considered to be logically part of the header value, but merely an artifact of encoding. If email.Header did the correct preservation, then it would not matter whether you passed in <space> or <tab>.
However, it is significantly more likely that you'd find a space in a header (like the Subject) than a tab, so I'd concur with the suggestion about using space as a continuation-ws character. Or, get a fix from Python.
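To illustrate the kind of preservation meant here: a minimal sketch, assuming plain ASCII header text and Python 3, of folding that puts the line break in front of whitespace already present in the value instead of substituting a fixed character. The helper names unfold and fold_preserving_ws are hypothetical and are not part of the email package.

```python
import re


def unfold(folded):
    """RFC 2822 unfolding: drop each line break that precedes a space or tab."""
    return re.sub(r"\r?\n(?=[ \t])", "", folded)


def fold_preserving_ws(line, maxlen=78):
    """Fold `line` by breaking before existing whitespace, never replacing it."""
    out = []
    while len(line) > maxlen:
        # Last space or tab at index 1..maxlen; index 0 is skipped so that a
        # continuation line keeps its leading whitespace and is never empty.
        cut = max(line.rfind(" ", 1, maxlen + 1), line.rfind("\t", 1, maxlen + 1))
        if cut < 1:
            break  # nowhere to fold; leave the remainder as one long line
        out.append(line[:cut])
        line = line[cut:]  # the whitespace itself starts the next line
    out.append(line)
    return "\n".join(out)


subject = "Subject: " + " ".join(["word"] * 30)
folded = fold_preserving_ws(subject, maxlen=40)
print(folded)
print(unfold(folded) == subject)
```

Because the break is only ever inserted before whitespace that was already there, unfolding gives back the original value, and no continuation character has to be chosen at all.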
And now, off to the Python lists...
-nh

Nathan Herring wrote:
On 12/2/05 6:13:20 PM, "Mark Sapiro" <msapiro@value.net> wrote:
<snip>
The other interesting one from an Outlook or Entourage perspective is the Thread-Topic, which ends up having a tab inserted into it.
Yes, and this is significant in that Thread-Topic is an MUA (in your case Microsoft-Entourage) header that is completely untouched by Mailman.
Therefore, it would appear simply instantiating an email.Message.Message object and later writing it out is sufficient to cause the header continuation white space to change from <space> to <tab>.
Thus it will try to preserve the continuation-ws of an already folded incoming subject, but will default to <tab>. Thus if the incoming subject is not folded, but addition of the prefix lengthens it sufficiently so it folds, it will be continued with a <tab>.
It seems that for other headers, like Thread-Topic, even regularly folded items have their folding space turned into a folding tab.
Yes. The logic of trying to determine and preserve the continuation white space of an already continued, incoming header is specific to Mailman's manipulation of Subject: headers in CookHeaders. It is not done elsewhere in Mailman.
From my reading, it seems that email.Header doesn't preserve the FWS in the original header as it should. It would seem that the only time the continuation-ws parameter should be used is if there are sets of characters that need to be turned into multiple adjacent RFC 2047 encoded-words, as that FWS is not considered to be logically part of the header value, but merely an artifact of encoding. If email.Header did the correct preservation, then it would not matter whether you passed in <space> or <tab>.
Actually, the email.Header.Header class is a constructor for Header objects. It does not take a 'header' as an input argument. It makes a class instance from a non-continued string or a list of (data, charset) pairs to be RFC 2047 encoded. See the Python library reference, sec. 12.2.5. The resulting Header instance is continued as necessary using the continuation-ws argument as the leading white space on continued lines. Within the class, continuation-ws defaults to <space>, not <tab>.
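As a rough illustration of the continuation-ws argument, here is a sketch using email.header.Header from current Python 3. That is an assumption on my part: it is the descendant of the email.Header.Header class discussed in this thread, and its treatment of plain ASCII text may well differ from Python 2.3/2.4, so the example sticks to non-ASCII text that has to be split into several RFC 2047 encoded-words.

```python
from email.header import Header

# Non-ASCII filler, long enough that it cannot fit in one encoded-word.
text = "\u00e9" * 120

for ws in (" ", "\t"):
    h = Header(text, charset="utf-8", header_name="Subject", continuation_ws=ws)
    lines = h.encode().split("\n")
    # Show how many lines were produced and what each continuation starts with.
    print(len(lines), [line[:1] for line in lines[1:]])
```

The whitespace between adjacent encoded-words is not part of the header value, which is why this is the one case where the choice of continuation character is harmless to the decoded text.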
There are really two issues here. The first is that CookHeaders manipulates Subject: headers as Header class objects and unless the incoming Subject: header is already continued with a <space>, it specifies a <tab> for the continuation-ws character. This is strictly a Mailman issue.
The other issue is that the Python email.Parser class API parses message headers into strings, not Header class instances, and the methods for flattening messages continue long header strings with <tab>. This is a Python email library issue. To see it in action, just do the following in an interactive Python session.
>>> import email
>>> from cStringIO import StringIO
>>> from email.Generator import Generator
>>> x = email.message_from_string('Header: a long string of words that will be ultimately continued because it\n is too long for one line')
>>> x['Subject'] = 'Some other long line which we build here and stick into a header to see what happens'
>>> fp = StringIO()
>>> g = Generator(fp, maxheaderlen=60)
>>> g.flatten(x)
>>> text = fp.getvalue()
>>> text
'Header: a long string of words that will be ultimately\n\tcontinued because it is too long for one line\nSubject: Some other long line which we build here and stick\n\tinto a header to see what happens\n\n'
(Lines 4, 5 and 11 above are each one long line in the original, although they may get wrapped in emailing.)
Note that the original Header: was continued with a <space>. The flatten method properly unfolds it but then refolds it with a <tab>. Likewise it folds the Subject: with a <tab>.
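Why this matters to the receiving MUA can be shown without the email package at all. The following is a simplified sketch for Python 3, not the library's algorithm: refold_with and unfold are hypothetical helpers, and the folding is a naive greedy wrap at single spaces. It folds by replacing a space with a line break plus the continuation character, then unfolds the way RFC 2822 describes.

```python
import re


def unfold(folded):
    """RFC 2822 unfolding: drop each line break that precedes a space or tab."""
    return re.sub(r"\r?\n(?=[ \t])", "", folded)


def refold_with(value, continuation_ws, maxlen=60):
    """Greedy fold at spaces, REPLACING the chosen space with continuation_ws."""
    words = value.split(" ")
    lines, current = [], words[0]
    for word in words[1:]:
        if len(current) + 1 + len(word) > maxlen:
            lines.append(current)
            current = continuation_ws + word  # the original space is lost here
        else:
            current += " " + word
    lines.append(current)
    return "\n".join(lines)


original = ("Subject: Some other long line which we build here and stick "
            "into a header to see what happens")

print(repr(unfold(refold_with(original, " "))))
print(repr(unfold(refold_with(original, "\t"))))
```

Refolding with a space round-trips, while refolding with a tab leaves a tab embedded in the unfolded value, which is what a client that unfolds strictly will then display in the subject.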
-- Mark Sapiro <msapiro@value.net>

Brad Knowles wrote:
At 3:20 PM -0800 2005-11-28, Nathan Herring wrote:
Specifically, when it (re)folds the Subject: and/or Thread-Topic: headers, it replaces a space character in the original subject with a tab character.
Also keep in mind that Mailman uses a lot of built-in Python routines for handling stuff, and some of those routines might include the handling and formatting of e-mail messages. You want to make sure that you're using the latest version of Python that is compatible with the version of Mailman you've got. Unfortunately, while the people working on Mailman tend to be pretty well aware of the various mail-related RFCs, the people writing the code and libraries within Python itself may not be.
Brad is correct here. Mailman represents messages as instances of the Python email.Message.Message class and is at the mercy of the methods in that class as far as header folding and unfolding are concerned. And, for the record, even the Python 2.4.2 email module folds with a <tab>.
-- Mark Sapiro <msapiro@value.net>
participants (3): Brad Knowles, Mark Sapiro, Nathan Herring