Hello All,
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;bug=299032 describes a bug I'm suffering from: MIME encoded headers are re-encoded by mailman and spaces are added around each encoded character.
The affected mailman installation is 2.1.5 and maintained by a hoster, not myself. Therefore I would like to know what to tell the hoster.
I couldn't find the bug in the mailman (Launchpad) bug tracker so I wonder whether this bug is known to the developers and whether it's fixed now.
TIA,
Oliver
Oliver Betz, Muenchen
Oliver Betz wrote:
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;bug=299032 describes a bug I'm suffering from: MIME encoded headers are re-encoded by mailman and spaces are added around each encoded character.
If clients encoded the entire word instead of just single characters, this wouldn't be an issue. I believe most clients do this, and I think RFC 2047 effectively requires it.
The example problem header from the bug report is
Subject: abcd=?iso-8859-1?q?=e4?=ttt
RFC 2047 section 2 contains the following statement
IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's by an RFC 822 parser.
In the above subject header, the encoded-word is =?iso-8859-1?q?=e4?= and it is not an atom, because atoms are delimited by white space and this one runs directly into 'abcd' and 'ttt'.
Thus, this encoding is simply wrong, and it is the generating MUA that is at fault.
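To illustrate the point about white space, here is a minimal sketch that flags encoded-words running directly into neighbouring text. The regular expression is a simplified approximation of the RFC 2047 encoded-word syntax, and the function name embedded_encoded_words is made up for this example; it only covers unstructured headers such as Subject: and is not how Mailman or the email package checks anything.

```python
import re

# Simplified approximation of an RFC 2047 encoded-word:
#   =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def embedded_encoded_words(value):
    """Return encoded-words that are not delimited by white space.

    Hypothetical helper for illustration only. The start and end of
    the header value count as valid delimiters.
    """
    found = []
    for m in ENCODED_WORD.finditer(value):
        before = value[m.start() - 1] if m.start() > 0 else " "
        after = value[m.end()] if m.end() < len(value) else " "
        if not before.isspace() or not after.isspace():
            found.append(m.group(0))
    return found


# The subject from the bug report: the encoded-word touches 'abcd' and 'ttt'.
print(embedded_encoded_words("abcd=?iso-8859-1?q?=e4?=ttt"))
# The same subject with the whole word encoded: nothing is flagged.
print(embedded_encoded_words("=?iso-8859-1?q?abcd=e4ttt?="))
```

Run against the subject from the bug report, the sketch reports the single encoded-word; with the whole word encoded, it reports nothing.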
The affected mailman installation is 2.1.5 and maintained by a hoster, not myself. Therefore I would like to know what to tell the hoster.
I couldn't find the bug in the mailman (Launchpad) bug tracker so I wonder whether this bug is known to the developers and whether it's fixed now.
The 'problem' still exists in some configurations. The underlying issue is in the Python email.Header module. The versions of the email package shipped with Python 2.5 and later will not parse the above subject as containing an encoded-word at all, and a Mailman using one of these email packages will probably just output that Subject: string unchanged.
Older versions of the email package parse that Subject: into three parts: 'abcd', the encoded word '=?iso-8859-1?q?=e4?=' and 'ttt'. Then, when these pieces are reassembled, the encoded word is properly delimited by spaces, thus introducing the extra spaces you see.
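The split-and-reassemble behaviour can be imitated with a short standalone sketch. This is not the code of the email package; the regular expression and the function names split_parts and rejoin_with_spaces are assumptions made up for illustration, and the only thing modelled is that each piece ends up separated by a space.

```python
import re

# Simplified approximation of an RFC 2047 encoded-word. The capturing
# group makes re.split() keep the encoded-words in its result.
ENCODED_WORD = re.compile(r"(=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=)")


def split_parts(value):
    """Split a header value into plain-text and encoded-word pieces."""
    return [part for part in ENCODED_WORD.split(value) if part]


def rejoin_with_spaces(parts):
    """Reassemble the pieces, delimiting each one with a single space."""
    return " ".join(parts)


subject = "abcd=?iso-8859-1?q?=e4?=ttt"
parts = split_parts(subject)
print(parts)                      # three pieces
print(rejoin_with_spaces(parts))  # spaces now surround the encoded-word
```

The rejoined value is 'abcd =?iso-8859-1?q?=e4?= ttt', so a mail client decoding it would show spaces on both sides of the encoded character.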
This is complicated by the fact that prior to 2.1.12, Mailman used its own version of the email package instead of the underlying Python version. Thus, you will see this problem behavior with Mailman 2.1.11 and before regardless of the Python version and also with Mailman 2.1.12 if the underlying Python is 2.4.x, but not with Mailman 2.1.12 if the underlying Python is 2.5 or newer.
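The version rule above can be written down as a small decision function, which may help when working out what to tell a hoster. The function name adds_extra_spaces is hypothetical, versions are passed as tuples, and treating Mailman releases after 2.1.12 the same as 2.1.12 is an assumption that the explanation above does not cover.

```python
def adds_extra_spaces(mailman_version, python_version):
    """Guess whether a setup shows the extra-space behaviour.

    Hypothetical helper encoding the rule described above. Versions
    are tuples such as (2, 1, 5) for Mailman and (2, 4, 6) for Python.
    Assumption: Mailman releases after 2.1.12 behave like 2.1.12.
    """
    # Before 2.1.12 Mailman used its own copy of the email package,
    # so the Python version makes no difference.
    if mailman_version < (2, 1, 12):
        return True
    # From 2.1.12 on, the email package of the underlying Python is used.
    return python_version < (2, 5)


print(adds_extra_spaces((2, 1, 5), (2, 5, 0)))   # own email package in use
print(adds_extra_spaces((2, 1, 12), (2, 4, 6)))  # Python 2.4.x email package
print(adds_extra_spaces((2, 1, 12), (2, 5, 0)))  # Python 2.5 email package
```

For the installation in question (Mailman 2.1.5), the function returns True whatever Python the hoster runs.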
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Mark Sapiro wrote:
("MIME encoded headers are re-encoded by mailman and spaces are added around each encoded character.")
Likely my description was based on incorrect assumptions.
If clients encoded the entire word instead of just single characters, this wouldn't be an issue. I believe most clients do this, and I think
I looked at other messages from this sender (received directly) and the subjects were encoded entirely so I thought this was also the case for the messages transported by mailman.
I have now asked him for a mail sent from the same environment as the list messages, and indeed the 8-bit characters were encoded separately (Microsoft Entourage) - stupid.
I told him that...
RFC 2047 effectively requires it.
...his mail client is likely broken and he will try Opera next time.
Thanks for the detailed explanation.
Oliver
Oliver Betz, Muenchen
participants (2): Mark Sapiro, Oliver Betz