[Mailman-Users] erroneous tabs in long digest subject lines

Mark Sapiro msapiro at value.net
Wed May 18 06:25:07 CEST 2005


>Greetings.  On digests automatically generated by the mailman system, I'm
>finding that long subject lines are having tabs inserted into them in
>unexpected ways in the topic summary, often related to commas.  For example,
>on a message that had the subject line:
>Subject: this is a test of a longer, subject line, longer yet
>and which was formatted properly in the non-digest distribution
>of the message, the digest version looks like this:
>Today's Topics:
>   1.  this is a test of a longer, subject line,        longer yet
>      (testtest at vortex.com)
>In this case, a tab has appeared between the comma and "longer".  Also,
>there are some inconsistencies in the way that Subjects folded onto a second
>line are indented in the topics summary (i.e., the second line of different
>subjects in the "Today's Topic" may not all be indented exactly the same
>amount), but the erroneous tab is a bigger problem and is the one I'd really
>like to nab, since it significantly breaks the format and is happening so
>frequently.  I've looked at the ToDigest and Utils.wrap routines and don't
>see an obvious cause.

That's because Mailman isn't doing it. It's the MUA that composed the
original message. The differences/discrepancies/whatever are the
result of folding the Subject: header across multiple lines and an
unclear standard covering how to do so.

The old standard, RFC 822, section 3.1.1, said "The general rule is
that wherever there may be linear-white-space (NOT simply
LWSP-chars), a CRLF immediately followed by AT LEAST one LWSP-char may
instead be inserted." and "The process of moving from this folded
multiple-line representation of a header field to its single line
representation is called "unfolding". Unfolding is accomplished by
regarding CRLF immediately followed by a LWSP-char as
equivalent to the LWSP-char."

This seems to say that extra indentation can be inserted when folding,
but not removed when unfolding.
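
To make the unfolding rule concrete, here is a minimal Python sketch of
it. This is not Mailman's code; the name unfold and the sample header
are only for illustration. It drops each line break that is followed
by a space or tab and keeps that whitespace character, so a tab the
MUA inserted when folding survives into the unfolded subject.

```python
import re

# A CRLF (or bare LF) that is immediately followed by a space or tab.
_FOLD = re.compile(r"\r?\n(?=[ \t])")

def unfold(raw_header: str) -> str:
    """Remove each line break that precedes whitespace; keep the whitespace."""
    return _FOLD.sub("", raw_header)

# A subject as an MUA might fold it, using a tab as the continuation indent.
folded = "Subject: this is a test of a longer, subject line,\r\n\tlonger yet"
print(repr(unfold(folded)))
```

The tab is still there after unfolding, which is what shows up in the
digest's topic summary.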

The current RFC 2822, section 2.2.3, is clearer. It says in part, "The
general rule is that wherever this standard allows for folding white
space (not simply WSP characters), a CRLF may be inserted before any
WSP." and "Unfolding is accomplished by simply removing any CRLF that
is immediately followed by WSP."

Thus, some MUAs based on RFC 822 insert multiple spaces or a tab when
creating a folded subject line, and software such as Mailman, following
either standard, doesn't remove them.
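
If the goal is only a tidy display, one possible workaround is to
collapse runs of spaces and tabs after unfolding. The sketch below
assumes that approach; normalize_subject is a hypothetical helper, not
something Mailman provides, and it goes beyond what either RFC asks
for, since it changes whitespace the sender actually put there.

```python
import re

def normalize_subject(raw_subject: str) -> str:
    """Unfold a subject, then collapse each run of spaces/tabs to one space."""
    # Step 1: unfold by dropping line breaks that precede whitespace.
    unfolded = re.sub(r"\r?\n(?=[ \t])", "", raw_subject)
    # Step 2: display cleanup only; this is an assumption, not RFC behavior.
    return re.sub(r"[ \t]+", " ", unfolded).strip()

subject = "this is a test of a longer, subject line,\r\n\tlonger yet"
print(normalize_subject(subject))
```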

Mark Sapiro <msapiro at value.net>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
