Public bug reported:
Hi,

When a mailing list is configured with:

preferred_language = 'ja'
available_languages = ['en', 'ja']
send_goodbye_msg = 0
and Mailman receives a confirmation email for an unsubscription as a multipart message encoded in a charset other than euc-jp, then Mailman's command runner crashes with this traceback:
Feb 10 18:08:00 2021 (1270) Uncaught runner exception: 'euc_jp' codec can't decode bytes in position 280-281: illegal multibyte sequence
Feb 10 18:08:00 2021 (1270) Traceback (most recent call last):
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 119, in _oneloop
    self._onefile(msg, msgdata)
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 190, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/usr/lib/mailman/Mailman/Queue/CommandRunner.py", line 291, in _dispose
    res.send_response()
  File "/usr/lib/mailman/Mailman/Queue/CommandRunner.py", line 208, in send_response
    results = MIMEText(NL.join(encoded_resp), _charset=charset)
  File "/usr/lib64/python2.7/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/usr/lib64/python2.7/email/message.py", line 226, in set_payload
    self.set_charset(charset)
  File "/usr/lib64/python2.7/email/message.py", line 264, in set_charset
    self._payload = charset.body_encode(self._payload)
  File "/usr/lib64/python2.7/email/charset.py", line 390, in body_encode
    s = self.convert(s)
  File "/usr/lib64/python2.7/email/charset.py", line 273, in convert
    return unicode(s, self.input_codec).encode(self.output_codec)
UnicodeDecodeError: 'euc_jp' codec can't decode bytes in position 280-281: illegal multibyte sequence
Here is an example of a mail which triggers this:
Date: Wed, 11 Nov 2020 14:13:16 +0900
Subject: Re: ###CONFIRM###
From: <testuser2@###SERVER###>
To: userlist-request@###SERVER###.com
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000663b1c05b3cdda78"

--000000000000663b1c05b3cdda78
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: base64
It may also be worth pointing out that setting RESPONSE_INCLUDE_LEVEL to 1 or lower mitigates this.
I've managed to fix this issue by decoding the message part with its original charset and then encoding it to the preferred encoding of the mailing list. I'm attaching my patch. Could you please consider merging it?
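The approach described above could look roughly like the following. This is only an illustrative sketch, not the attached patch: it is written for Python 3 (Mailman 2.1 itself runs on Python 2), and the helper name part_bytes_in_list_charset and the sample message are hypothetical.

```python
import base64
import email


def part_bytes_in_list_charset(part, list_charset):
    """Return the part's body re-encoded in the list's charset.

    Hypothetical helper: decode the part with the charset it declares,
    then encode the resulting text with the list's preferred charset.
    """
    raw = part.get_payload(decode=True)  # undo base64 / quoted-printable
    source_charset = part.get_content_charset() or "us-ascii"
    text = raw.decode(source_charset)    # bytes -> text, using the part's own charset
    return text.encode(list_charset)     # text -> bytes, in the list's charset


# A made-up single part resembling the one in the report: UTF-8 text, base64 encoded.
sample_text = "こんにちは"
raw_part = (
    b'Content-Type: text/plain; charset="UTF-8"\n'
    b"Content-Transfer-Encoding: base64\n"
    b"\n" + base64.b64encode(sample_text.encode("utf-8")) + b"\n"
)
part = email.message_from_bytes(raw_part)

converted = part_bytes_in_list_charset(part, "euc_jp")


def decodes_as(raw_bytes, charset):
    """Report whether raw_bytes can be decoded with the given charset."""
    try:
        raw_bytes.decode(charset)
        return True
    except UnicodeDecodeError:
        return False
```

Treating the original UTF-8 bytes as euc-jp, which is in effect what the traceback shows, fails in decodes_as, while the converted bytes decode cleanly with the list's charset.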
Thanks for any help you can provide.
** Affects: mailman
     Importance: Undecided
         Status: New
** Attachment added: "mailman-cmd-replies.patch" https://bugs.launchpad.net/bugs/1921682/+attachment/5481811/+files/mailman-c...
** Branch linked: lp:mailman/2.1
Thank you for the patch.
** Changed in: mailman
   Importance: Undecided => Medium

** Changed in: mailman
       Status: New => Fix Committed

** Changed in: mailman
    Milestone: None => 2.1.35

** Changed in: mailman
     Assignee: (unassigned) => Mark Sapiro (msapiro)
The current fix is attached. I initially committed the change without the 'errors=' arguments in the unicode() and encode() calls even though I knew better, and it took only 42 minutes after I installed it on mail.python.org for it to throw a UnicodeEncodeError on a confirmation on an English language list with non-ASCII in the body.
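To illustrate why the 'errors=' arguments matter, here is a small Python 3 sketch. It is not the attached patch: the function names are hypothetical, and 'replace' is just one possible error handler chosen for the example.

```python
def transcode_strict(raw_bytes, source_charset, target_charset):
    """Decode and re-encode with the default 'strict' error handling."""
    return raw_bytes.decode(source_charset).encode(target_charset)


def transcode_lenient(raw_bytes, source_charset, target_charset):
    """Same conversion, but substitute characters that cannot be converted."""
    text = raw_bytes.decode(source_charset, errors="replace")
    return text.encode(target_charset, errors="replace")


# Non-ASCII text arriving at a list whose charset is us-ascii.
body = "café".encode("utf-8")

try:
    transcode_strict(body, "utf-8", "us-ascii")
    strict_failed = False
except UnicodeEncodeError:
    # 'é' has no us-ascii representation, so strict encoding raises.
    strict_failed = True

# The lenient version substitutes '?' for the unencodable character.
lenient_result = transcode_lenient(body, "utf-8", "us-ascii")
```

With strict handling a single unconvertible character aborts the whole response, whereas a lenient handler loses that character but still lets the reply go out.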
** Patch added: "1921682.patch" https://bugs.launchpad.net/mailman/+bug/1921682/+attachment/5482961/+files/1...
Ah, sorry about that. Thanks for the new patch.
** Changed in: mailman Status: Fix Committed => Fix Released