Mailman brokenly base64-encoding all messages
I'm using Mailman 2.1.22 packaged by Ubuntu 16.10.
It appears that on one (but I think not all) of my mailing lists, Mailman is base64-encoding every single message. Yes, including ones that 100% definitely contain only ASCII characters. Does anyone know why Mailman would be doing this?
You might say that base64-encoding all messages shouldn't be a problem, and you'd be partly right. However there is also another problem: Mailman is getting the line break encoding wrong in its base64-encoded messages - by which I mean the encoded representation of line breaks inside the base64 data. As per RFC 2045 s6.8: "line breaks must be converted into CRLF sequences prior to base64 encoding" but Mailman is outputting just LF.
The latter problem definitely appears to be a bug in Mailman, or perhaps the Python 'email' package. The former seems likely to be a configuration issue, but it's not obvious to me where.
On 03/30/2017 02:02 AM, Jon Ribbens wrote:
> I'm using Mailman 2.1.22 packaged by Ubuntu 16.10.
> It appears that on one (but I think not all) of my mailing lists, Mailman is base64-encoding every single message. Yes, including ones that 100% definitely contain only ASCII characters. Does anyone know why Mailman would be doing this?
Yes. First, Debian has changed their Mailman package on which Ubuntu is based to make UTF-8 Mailman's character set for all preferred languages.
This combines with the fact that the Python email library base64 encodes utf-8 message bodies.
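The charset-dependent choice of transfer encoding can be checked directly. This is a minimal sketch, assuming the Python 3 standard-library email package (Mailman 2.1 itself runs on Python 2, where the same charset registry exists but details may differ). The helper name transfer_encoding_for is made up for illustration.

```python
from email.charset import BASE64, Charset
from email.mime.text import MIMEText


def transfer_encoding_for(charset_name, body="Hello, list.\n"):
    """Return the Content-Transfer-Encoding chosen for a text/plain part."""
    msg = MIMEText(body, "plain", charset_name)
    return msg["Content-Transfer-Encoding"]


# The charset registry marks utf-8 bodies for base64 encoding,
# while us-ascii has no body encoding at all.
print(Charset("utf-8").body_encoding == BASE64)
print(Charset("us-ascii").body_encoding)

# The same pure-ASCII body gets a different transfer encoding
# depending only on the declared charset.
print(transfer_encoding_for("utf-8"))
print(transfer_encoding_for("us-ascii"))
```

With a pure-ASCII body, only the declared charset differs between the two calls, which matches the symptom described in the original report.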
> You might say that base64-encoding all messages shouldn't be a problem, and you'd be partly right. However there is also another problem: Mailman is getting the line break encoding wrong in its base64-encoded messages - by which I mean the encoded representation of line breaks inside the base64 data. As per RFC 2045 s6.8: "line breaks must be converted into CRLF sequences prior to base64 encoding" but Mailman is outputting just LF.
I believe that RFC 2045 s6.8 refers back to canonical form as discussed in sections 6.5 and 6.6 and RFC 2049 sec 4. While it is arguable that this requires all plain text to use CRLF line delimiters regardless of encoding, I think common practice is to use CRLF only "on the wire" and not in base64 or quoted-printable encodings.
> The latter problem definitely appears to be a bug in Mailman, or perhaps the Python 'email' package.
If it is a bug, it is in the Python email library, not Mailman.
> The former seems likely to be a configuration issue, but it's not obvious to me where.
To change the former, you can put
add_language('en', 'English (USA)', 'us-ascii')
in mm_cfg.py.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
On Thu, Mar 30, 2017 at 08:35:54AM -0700, Mark Sapiro wrote:
> I believe that RFC 2045 s6.8 refers back to canonical form as discussed in sections 6.5 and 6.6 and RFC 2049 sec 4. While it is arguable that this requires all plain text to use CRLF line delimiters regardless of encoding, I think common practice is to use CRLF only "on the wire" and not in base64 or quoted-printable encodings.
RFC 2045 s6.8 explicitly says that text line breaks "must" be converted to CRLF before base64 encoding, and that this is regardless of whether it's canonical form or not.
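To make the disagreement concrete, here is a small sketch of what converting line breaks to CRLF before base64 encoding looks like, and how to detect a body that was encoded with bare LF. It uses only the standard base64 module, not Mailman or the email package, and the function names (to_crlf, encode_body, has_bare_lf) are hypothetical helpers written for this example.

```python
import base64


def to_crlf(text):
    """Normalise LF, CR and CRLF line breaks to CRLF."""
    return text.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "\r\n")


def encode_body(text, canonical=True):
    """Base64-encode an ASCII body, optionally converting to CRLF first."""
    data = to_crlf(text) if canonical else text
    return base64.b64encode(data.encode("ascii")).decode("ascii")


def has_bare_lf(encoded):
    """True if the decoded body contains an LF that is not part of a CRLF."""
    decoded = base64.b64decode(encoded)
    return b"\n" in decoded.replace(b"\r\n", b"")


body = "line one\nline two\n"
print(has_bare_lf(encode_body(body, canonical=False)))  # LF-only encoding
print(has_bare_lf(encode_body(body)))                   # CRLF before encoding
```

Running has_bare_lf on the base64 payload of a delivered message is one way to confirm which form a given installation emits.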
> If it is a bug, it is in the Python email library, not Mailman.
OK I'll look into reporting it on the Python bug tracker.
> To change the former, you can put
> add_language('en', 'English (USA)', 'us-ascii')
> in mm_cfg.py.
I've done this (in /etc/mailman/mm_cfg.py) and then done systemctl restart mailman and it's made no difference. Is there anything else I need to do also?
(I'm testing it by getting it to reply to a 'help' request.)
On 03/30/2017 08:58 AM, Jon Ribbens wrote:
> On Thu, Mar 30, 2017 at 08:35:54AM -0700, Mark Sapiro wrote:
>> To change the former, you can put
>> add_language('en', 'English (USA)', 'us-ascii')
>> in mm_cfg.py.
> I've done this (in /etc/mailman/mm_cfg.py) and then done systemctl restart mailman and it's made no difference. Is there anything else I need to do also?
Sorry, I forgot. Debian's package ignores any charset argument you put in add_language.
You need to put
LC_DESCRIPTIONS['en'] = ('English (USA)', 'us-ascii', 'ltr')
in mm_cfg.py. I think that will work.
On Thu, Mar 30, 2017 at 10:22:08AM -0700, Mark Sapiro wrote:
> Sorry, I forgot. Debian's package ignores any charset argument you put in add_language.
> You need to put
> LC_DESCRIPTIONS['en'] = ('English (USA)', 'us-ascii', 'ltr')
> in mm_cfg.py. I think that will work.
Awesome, that does indeed appear to work perfectly. Thank you. Shouldn't this be a FAQ? I looked there first but couldn't find anything relevant.
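For anyone wondering why the direct assignment helps where add_language did not, the following toy model may be useful. It is only a sketch: LC_DESCRIPTIONS here is a plain dictionary standing in for the one Mailman's configuration provides, add_language_stock is an assumption about how an unpatched add_language stores its arguments, and add_language_patched is a hypothetical stand-in for the packaged behaviour described above, not the actual Debian patch.

```python
# Stand-in for the language table normally provided by Mailman's config.
LC_DESCRIPTIONS = {}


def add_language_stock(code, description, charset, direction="ltr"):
    """Assumed unpatched behaviour: store every field as given."""
    LC_DESCRIPTIONS[code] = (description, charset, direction)


def add_language_patched(code, description, charset, direction="ltr"):
    """Hypothetical packaged behaviour: the charset argument is ignored."""
    LC_DESCRIPTIONS[code] = (description, "utf-8", direction)


# Calling the patched function has no effect on the charset ...
add_language_patched("en", "English (USA)", "us-ascii")
print(LC_DESCRIPTIONS["en"][1])

# ... whereas assigning the tuple directly bypasses the function entirely.
LC_DESCRIPTIONS["en"] = ("English (USA)", "us-ascii", "ltr")
print(LC_DESCRIPTIONS["en"][1])
```

The point is only that writing the table entry directly does not go through whatever add_language does with its arguments.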
On 03/30/2017 04:32 PM, Jon Ribbens wrote:
> On Thu, Mar 30, 2017 at 10:22:08AM -0700, Mark Sapiro wrote:
>> Sorry, I forgot. Debian's package ignores any charset argument you put in add_language.
>> You need to put
>> LC_DESCRIPTIONS['en'] = ('English (USA)', 'us-ascii', 'ltr')
>> in mm_cfg.py. I think that will work.
> Awesome, that does indeed appear to work perfectly. Thank you. Shouldn't this be a FAQ? I looked there first but couldn't find anything relevant.
I'll think about a FAQ. This whole Debian/Ubuntu utf-8 encoding issue has been a pain from the beginning for those of us (me) who've had to pick up the pieces after things broke.
See for example the comment thread at https://bugs.launchpad.net/mailman/+bug/1462755 and the archived thread linked from comment #6.
Also, if you're interested, see the first paragraph at https://wiki.list.org/ about obtaining write access to the wiki if you'd like to write something.
participants (2)
- Jon Ribbens
- Mark Sapiro