Hi,
we are encountering the following "problem" with Mailman 2.1.3:
When a subscriber sends a message to one of our lists with a UTF-8 encoded body and CTE "8bit", Mailman apparently recodes the body from "8bit" to "base64" before distributing the mail to the subscribers of the respective list.
Although this is perfectly legal, it does not really make sense and sometimes creates hassles for users whose MUAs have difficulties with base64-encoded bodies. A qp-encoding would IMO be more appropriate for text/plain bodies, but I would prefer that Mailman not recode the body at all.
Three strange things in this context:
a) In the Pipermail archive, the message appears in its original "8bit" format.
b) We are using the mail<=>news facility of Mailman, and the newsgroup users also receive the message in the original "8bit" format.
c) We have one (non-public) admin mailing list where Mailman does *not* recode the body. This particular list is also *not* gated to a newsgroup (and that is the only difference from all of the other lists that we are currently aware of).
Any idea or suggestion what might be the reason for the recoding and how we could avoid it?
Michael
On 18 Jul 2004 15:55:00 +0200, Michael Heydekamp <my@freexp.de> wrote:
Hi,
we are encountering the following "problem" with Mailman 2.1.3:
When a subscriber sends a message to one of our lists with a UTF-8 encoded body and CTE "8bit", Mailman apparently recodes the body from "8bit" to "base64" before distributing the mail to the subscribers of the respective list.
Although this is perfectly legal, it does not really make sense and sometimes creates hassles for users whose MUAs have difficulties with base64-encoded bodies. A qp-encoding would IMO be more appropriate for text/plain bodies, but I would prefer that Mailman not recode the body at all.
...
Yes, I find it quite annoying when a listserver performs that type of
recoding. I find it even worse when it garbles the content because it does
not understand the encoding. This is mainly a problem with people who
reply to digests.
I would prefer to have a configuration option where I could set the
outgoing encoding to utf-8 with 8bit cte. Others may want to cater to
special local needs. In any case, the digester needs to understand more
about encoding. It cannot just pass everything through as now, since it is
converting headers to body text. It needs to understand enough about
MIME-headers to get subjects and names converted into utf-8, and it needs
to understand Punycode to convert international domain names.
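
To illustrate the two conversions a digester would need, here is a minimal Python sketch using only the standard library. It is not Mailman's digest code; the helper names `header_to_text` and `domain_to_text` and the sample values are made up for illustration.

```python
from email.header import decode_header, make_header


def header_to_text(raw_header: str) -> str:
    """Decode RFC 2047 encoded-words in a header value to a Unicode string."""
    return str(make_header(decode_header(raw_header)))


def domain_to_text(ascii_domain: str) -> str:
    """Convert a Punycode (IDNA) domain name to its Unicode form."""
    return ascii_domain.encode("ascii").decode("idna")


# Hypothetical sample values, not taken from any real message.
subject = header_to_text("=?iso-8859-1?q?Gr=FC=DFe?=")
domain = domain_to_text("xn--bcher-kva.example")

# Once both are Unicode strings, they can be written into a UTF-8 digest body.
digest_line = f"Subject: {subject} (sent via {domain})".encode("utf-8")
print(digest_line.decode("utf-8"))
```

Once headers and domain names are plain Unicode strings, writing them into a digest body in UTF-8 is a single `encode` call, so the hard part is the decoding shown here.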
Björn Vermo <bv@opera.no> wrote on 19.07.04:
On 18 Jul 2004 15:55:00 +0200, Michael Heydekamp <my@freexp.de> wrote:
we are encountering the following "problem" with Mailman 2.1.3:
When a subscriber sends a message to one of our lists with a UTF-8 encoded body and CTE "8bit", Mailman apparently recodes the body from "8bit" to "base64" before distributing the mail to the subscribers of the respective list.
Although this is perfectly legal, it does not really make sense and sometimes creates hassles for users whose MUAs have difficulties with base64-encoded bodies. A qp-encoding would IMO be more appropriate for text/plain bodies, but I would prefer that Mailman not recode the body at all.
Yes, I find it quite annoying when a listserver performs that type of recoding.
Thanks for the support. :)
I find it even worse when it garbles the content because it does not understand the encoding. This is mainly a problem with people who reply to digests.
In our case digesting is not involved at all and the recoded and distributed mail is technically correct. Nonetheless we don't want to distribute base64 encoded bodies if the original mail was correctly produced and sent as 8bit.
It's not clear to me whether this behaviour is by design (which I doubt, because for some reason we don't see it on our non-public admin list).
If somebody would just answer how to avoid the recoding...
I would prefer to have a configuration option where I could set the outgoing encoding to utf-8 with 8bit cte. Others may want to cater to special local needs. In any case, the digester needs to understand more about encoding. It cannot just pass everything through as now, since it is converting headers to body text.
I've never played with digests, but doesn't Mailman produce digests of type "multipart/digest" with subparts of type "message/rfc822" (see RFC2046)?
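
For reference, the RFC 2046 structure mentioned here can be sketched with Python's standard `email` package. This is only an illustration of the MIME layout, not Mailman's digester; `make_post` and `build_digest` are hypothetical helpers and the message contents are invented.

```python
from email.message import EmailMessage
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart


def make_post(subject: str, body: str) -> EmailMessage:
    """Create a small stand-in for a list posting."""
    post = EmailMessage()
    post["Subject"] = subject
    post.set_content(body)
    return post


def build_digest(posts) -> MIMEMultipart:
    """Wrap each posting as a message/rfc822 part of a multipart/digest."""
    digest = MIMEMultipart("digest")
    digest["Subject"] = "Example digest"
    for post in posts:
        # Each posting keeps its own headers and transfer encoding.
        digest.attach(MIMEMessage(post))
    return digest


digest = build_digest([
    make_post("First posting", "Grüße\n"),
    make_post("Second posting", "plain ASCII text\n"),
])
for part in digest.get_payload():
    print(part.get_content_type(), "->", part.get_payload(0)["Subject"])
```

In this layout every posting travels as a complete message with its own MIME headers, so nothing has to be converted into body text.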
Michael
There is one serious problem with sending out 8-bit mail. If the receiving MTA does not advertise 8BITMIME, you have to transcode in the MTA. A few MTAs have decided that buggering about with message bodies like this is not in their purview, and therefore do not advertise 8BITMIME by default, since any mail they relay might otherwise have to be transcoded to 7 bit. A case in point here is Exim, which does not do MIME transcoding (or body hacking in general).
If you do have an 8 bit message, and are trying to send to a non 8 bit recipient MTA, and will not transcode, you then have a problem. You can then either just send it (sod you, take this anyway) or bounce it.
For this reason a case could be made for all mail initiators, including MLMs, to encode for 7 bit transport.
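
The decision described above can be sketched in a few lines of Python. This is a simplified illustration, not code from any MTA or from Mailman: the function names and the sample EHLO replies are hypothetical, and a real sender would also have to honour line-length limits.

```python
def advertises_8bitmime(ehlo_response: str) -> bool:
    """Return True if a multi-line EHLO reply lists the 8BITMIME keyword."""
    lines = ehlo_response.splitlines()
    # The first line carries the server greeting; the rest list extensions.
    for line in lines[1:]:
        # Each line looks like "250-KEYWORD params" or "250 KEYWORD params".
        parts = line[4:].split()
        if parts and parts[0].upper() == "8BITMIME":
            return True
    return False


def delivery_action(body: bytes, ehlo_response: str) -> str:
    """Pick one of the options discussed above for a given body and peer."""
    if all(byte < 0x80 for byte in body):
        return "send as 7bit"
    if advertises_8bitmime(ehlo_response):
        return "send as 8bit"
    # The peer did not advertise 8BITMIME: the sender must transcode
    # (to quoted-printable or base64) or bounce the message.
    return "transcode or bounce"


# Hypothetical EHLO replies for illustration.
EIGHT_BIT_PEER = "250-mx.example.org Hello\n250-SIZE 52428800\n250-8BITMIME\n250 HELP"
SEVEN_BIT_PEER = "250-mx.example.net Hello\n250-SIZE 52428800\n250 HELP"

print(delivery_action("Grüße".encode("utf-8"), SEVEN_BIT_PEER))
```

On a live connection, `smtplib.SMTP.has_extn("8bitmime")` answers the same question after `ehlo()` has been called.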
It's a shame SMTP wasn't originally defined as an 8 bit channel.
Nigel.
-- [ Nigel Metheringham Nigel.Metheringham@InTechnology.co.uk ] [ - Comments in this message are my own and not ITO opinion/policy - ]
Nigel Metheringham <Nigel.Metheringham@dev.intechnology.co.uk> wrote on 19.07.04:
There is one serious problem with sending out 8 bit mail. [...]
I'm aware of all of these problems but that still doesn't explain ...
a) why Mailman recodes the UTF-8 body to base64 rather than qp (which would be the recommended and much better option for text/plain). I suspect that it must have something to do with UTF-8, but I still have to check what Mailman does with an 8bit mail in, say, ISO-8859-1.
b) why Mailman does not always do that (as I mentioned, we are not seeing this behaviour with our non-public admin-list).
So there must be a way (or a config setting) to prevent Mailman from recoding to base64 - if somebody could just tell me how, pleeez? :)
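
One place worth checking for point a) is the charset table in Python's standard `email` package, which Mailman builds on. The sketch below only reports what a current Python 3 `email.charset` registers as the default body encoding per charset. Whether this table is what drives the recoding in Mailman 2.1.3 is an assumption that nobody in this thread has confirmed, and the helper name `default_body_encoding` is hypothetical.

```python
from email.charset import BASE64, QP, Charset


def default_body_encoding(charset_name: str) -> str:
    """Report the body encoding the email package registers for a charset."""
    encoding = Charset(charset_name).body_encoding
    if encoding == BASE64:
        return "base64"
    if encoding == QP:
        return "quoted-printable"
    # No registered body encoding: the body is left as 7bit or 8bit.
    return "none"


for name in ("us-ascii", "iso-8859-1", "utf-8"):
    print(name, "->", default_body_encoding(name))
```

If this table is involved, comparing a UTF-8 posting with an ISO-8859-1 posting on the same list should show different transfer encodings, which is the experiment proposed in a).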
A case in point here is Exim, which does not do MIME transcoding (or body hacking in general).
We are indeed using Exim but that applies also to the admin-list. So why the heck are mails to this particular list distributed in 8bit?
For this reason a case could be made for all mail initiators, including MLMs, to encode for 7 bit transport.
Right. That's what I'm doing anyway, but this doesn't apply to all of our subscribers.
Michael
"Michael" == Michael Heydekamp <my@freexp.de> writes:
Michael> I'm aware of all of these problems but that still doesn't
Michael> explain ...
Michael> a) why Mailman recodes the UTF-8 body to base64 rather
Michael> than qp (which would be the recommended and much better
Michael> option for text/plain).
Probably because the people who wrote the code were Arabic or Korean speakers, where pretty much everything but citation prefixes and line endings is 8-bit. QP is no more recommended than BASE64. QP is preferred if-and-only-if the body would be basically readable in QP, which of course depends on the content.
So choosing intelligently would involve a count of the entire part to be encoded. Yuck.
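
The content dependence is easy to see by encoding two sample texts both ways. The following is a minimal sketch using Python's standard `quopri` and `base64` modules; the helper name `encoded_sizes` and the sample strings are chosen purely for illustration.

```python
import base64
import quopri


def encoded_sizes(text: str) -> dict:
    """Compare the size of a UTF-8 body under QP and under base64."""
    raw = text.encode("utf-8")
    return {
        "raw": len(raw),
        "qp": len(quopri.encodestring(raw)),
        "base64": len(base64.encodebytes(raw)),
    }


# Mostly ASCII with a few non-ASCII characters: stays readable in QP.
mostly_ascii = "Grüße aus Berlin, wie geht es euch?"
# Almost entirely non-ASCII: QP turns every byte into three characters.
mostly_8bit = "안녕하세요"

print(encoded_sizes(mostly_ascii))
print(encoded_sizes(mostly_8bit))
```

For the first text QP is both smaller and still legible; for the second it is larger than base64 and unreadable either way.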
-- Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software.
At 12:40 PM +0900 2004-07-22, Stephen J. Turnbull wrote:
So choosing intelligently would involve a count of the entire part to be encoded. Yuck.
Which is not far from what is done by sendmail when deciding to
convert 8bit to either QP or Base64. It doesn't look at the entire bodypart, but it does look at a good-size chunk of the first part, and compare how many characters have the high bit set versus those that don't.
If only a very few characters have the high bit set, then it
chooses QP. If many of the characters have the high bit set, then it chooses Base64.
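
A heuristic of this kind might look roughly like the following Python sketch. The sample size and the threshold are arbitrary assumptions made for illustration, not sendmail's actual values, and `choose_cte` is a hypothetical name.

```python
def choose_cte(body: bytes, sample_size: int = 4096, threshold: float = 0.25) -> str:
    """Pick a Content-Transfer-Encoding from the share of high-bit bytes.

    Only the first `sample_size` bytes are inspected, not the whole body.
    """
    sample = body[:sample_size]
    high_bit = sum(1 for byte in sample if byte >= 0x80)
    if high_bit == 0:
        # Nothing in the sample needs encoding.
        return "7bit"
    if high_bit / len(sample) < threshold:
        # Few high-bit bytes: QP keeps the text mostly readable.
        return "quoted-printable"
    # Many high-bit bytes: base64 is more compact.
    return "base64"


print(choose_cte("Grüße aus Berlin, wie geht es euch?".encode("utf-8")))
print(choose_cte("안녕하세요".encode("utf-8")))
```

Because only a prefix is sampled, a body whose non-ASCII text starts late can be misjudged; that is the trade-off against counting the entire part.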
I'm still not convinced that Mailman is actually doing the
conversion here, but there are ways to try to address this subject.
As is typical for most open-source projects, patches are welcomed.
-- Brad Knowles, <brad.knowles@skynet.be>
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania
Assembly to the Governor, November 11, 1755
SAGE member since 1995. See <http://www.sage.org/> for more info.
Brad Knowles <brad.knowles@skynet.be> wrote on 22.07.04:
At 12:40 PM +0900 2004-07-22, Stephen J. Turnbull wrote:
So choosing intelligently would involve a count of the entire part to be encoded. Yuck.
[...]
I'm still not convinced that Mailman is actually doing the
conversion here, [...]
I'm surprised that no developer on this developers' list confirms or denies that (or gives a hint where to look).
Michael
At 3:49 PM +0200 2004-07-19, Michael Heydekamp wrote:
It's not clear to me whether this behaviour is by design (which I doubt, because for some reason we don't see it on our non-public admin list).
I've been quiet so far on this issue, because I'm not sure that I
can help. However, I'm still not convinced that it's Mailman which is doing the translation, as opposed to your MTA.
-- Brad Knowles, <brad.knowles@skynet.be>
Brad Knowles <brad.knowles@skynet.be> wrote on 19.07.04:
At 3:49 PM +0200 2004-07-19, Michael Heydekamp wrote:
It's not clear to me whether this behaviour is by design (which I doubt, because for some reason we don't see it on our non-public admin list).
I've been quiet so far on this issue, because I'm not sure that I
can help.
This seems to be my destiny whenever I'm raising a question here. ;)
However, I'm still not convinced that it's Mailman which is doing the translation, as opposed to your MTA.
Hmm, to the best of my knowledge Exim is known not to mangle the body at all.
Michael
At 6:14 PM +0200 2004-07-19, Michael Heydekamp wrote:
However, I'm still not convinced that it's Mailman which is doing the translation, as opposed to your MTA.
Hmm, to the best of my knowledge Exim is known not to mangle the body at all.
In which case, Exim could be a potential reason that might cause
unknowing recipients to get unceremoniously tossed off the list.
I can't speak for other MTAs, but I know that sendmail can do
translation, and will do so by default under certain circumstances (e.g., it has 8-bit input and the output is not indicated as being 8-bit clean). I believe that the same can be said for postfix, and I would have said the same for Exim.
This seems to me to be a more MTA-like thing to do, whereas I
would expect Mailman to just take whatever it's given and not perform any translation. Have you found actual code within Mailman to perform this translation?
-- Brad Knowles, <brad.knowles@skynet.be>
Brad Knowles <brad.knowles@skynet.be> wrote on 19.07.04:
At 6:14 PM +0200 2004-07-19, Michael Heydekamp wrote:
However, I'm still not convinced that it's Mailman which is doing the translation, as opposed to your MTA.
Hmm, to the best of my knowledge Exim is known not to mangle the body at all.
In which case, Exim could be a potential reason that might cause unknowing recipients to get unceremoniously tossed off the list.
It is not really clear to me how that could happen (and it has not happened in real life here), but that's another issue.
I can't speak for other MTAs, but I know that sendmail can do translation, and will do so by default under certain circumstances (e.g., it has 8-bit input and the output is not indicated as being 8-bit clean). I believe that the same can be said for postfix, and I would have said the same for Exim.
Exim is 8bit clean, but AFAIK, as a receiving MTA, Exim tells the sending MTA that it is not, which causes the sending MTA to say "well, then I'll have to recode all 8bit mails". Exim does that because it does not do any recoding on its own. This is what I've been told by folks who know these things better than I do.
And this is probably the reason why Mailman also recodes 8bit mails when handing them to Exim (though, again, not in the case of our non-public admin list, which is still strange).
And I wouldn't even complain if Mailman would use qp encoding rather than base64 encoding.
This seems to me to be a more MTA-like thing to do, whereas I
would expect Mailman to just take whatever it's given and not perform any translation. Have you found actual code within Mailman to perform this translation?
As this is the Mailman developers' list, I was hoping that the developers would confirm or deny that. ;) And perhaps point me to the right place in the code.
Where should I look?
Michael
Participants (5): Bjørn Vermo, Brad Knowles, Michael Heydekamp, Nigel Metheringham, Stephen J. Turnbull