[Bug 1060951] [NEW] Bug getting i18n'ed attachment filenames (RFC2231)
Public bug reported:

RFC 2231 allows filenames to contain non-ASCII characters. The get_filename() method in Python's Message class handles this by calling email.utils.collapse_rfc2231_value() as its last step, which returns the filename as a unicode string.

This fails in Mailman because the mailman.email.message.Message class wraps get() and __getitem__() so that they return unicode headers. As a result, collapse_rfc2231_value() tries to decode a string that is already unicode, and I get the following exception:

  File "/usr/lib/python2.7/email/utils.py", line 319, in collapse_rfc2231_value
    return unicode(rawval, charset, errors)
TypeError: decoding Unicode is not supported

A possible solution would be to make the get_filename() method of Mailman's Message class more than just an exception-catching wrapper: re-implement the original get_filename() method, inserting a conversion to str before the call to collapse_rfc2231_value().

Does this make sense? Any other ideas for a possible solution?

** Affects: mailman
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman.
https://bugs.launchpad.net/bugs/1060951

Title:
  Bug getting i18n'ed attachment filenames (RFC2231)

To manage notifications about this bug go to:
https://bugs.launchpad.net/mailman/+bug/1060951/+subscriptions
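For readers unfamiliar with the standard-library path involved, here is a minimal sketch of how an RFC 2231 filename is decoded. It is written for Python 3 so that it runs on a current interpreter (the report itself concerns Python 2.7), it uses only the stock email package rather than Mailman's Message subclass, and the RAW_PART constant is a reduced copy of the attachment headers quoted later in this thread.

```python
import email
from email import utils

# Attachment headers reduced to the fields relevant to the filename.
RAW_PART = (
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: base64\n"
    "Content-Disposition: attachment;\n"
    " filename*=UTF-8''%74%6F%64%6F%2D%64%C3%A9%6A%65%75%6E%65%72%2E%74%78%74\n"
    "\n"
    "aWVzCg==\n"
)

part = email.message_from_string(RAW_PART)

# For an RFC 2231 encoded parameter, get_param() returns a 3-tuple of
# (charset, language, value) instead of a plain string.
param = part.get_param("filename", header="content-disposition")

# collapse_rfc2231_value() turns that tuple into a single text string.
collapsed = utils.collapse_rfc2231_value(param)

# get_filename() performs the same lookup and collapse internally.
filename = part.get_filename()

print(param[0], repr(collapsed), repr(filename))
```

The point of interest is the intermediate tuple: the collapse step expects its third element to be undecoded, which is the assumption that a header wrapper returning already-decoded text can violate.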
** Tags added: mailman3

** Also affects: mailman/2.1
   Importance: Undecided
       Status: New

** Also affects: mailman/3.0
   Importance: Undecided
       Status: New
See the TestMessageSubclass test case I've added to the attached test suite for a way to reproduce it. It's actually a little harder than I first thought: encoding the filename in the middle of the method is not enough.

** Attachment added: "extended testsuite"
   https://bugs.launchpad.net/mailman/+bug/1060951/+attachment/3368374/+files/t...
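The failure mode in this report, decoding a value that is already text, can be reproduced outside Mailman. The following is a minimal sketch under stated assumptions: it targets Python 3, where the analogous error comes from str() rather than unicode(), and to_text is a hypothetical helper used only for illustration. It is not the fix that went into Mailman, and as the comment above notes, a single conversion inside get_filename() was not sufficient there.

```python
def to_text(value, charset="utf-8", errors="replace"):
    """Return value as text, decoding only when it is still bytes."""
    if isinstance(value, str):
        # Already decoded: decoding again is what raises the TypeError.
        return value
    return str(value, charset, errors)


raw = b"todo-d\xc3\xa9jeuner.txt"  # UTF-8 bytes, not yet decoded
once = to_text(raw)                # decoded exactly once
twice = to_text(once)              # second call returns the text unchanged

# Decoding text directly fails, mirroring the traceback in the report.
error = None
try:
    str(once, "utf-8", "replace")
except TypeError as exc:
    error = str(exc)

print(repr(once), once == twice, error)
```

Making the conversion idempotent keeps callers from needing to know whether a wrapper earlier in the chain has already decoded the value.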
This works for me with Mailman 2.1.15 and email 4.0.1. Does it fail for you with Mailman 2.1.x? If so, with which Mailman and email versions?

[msapiro@MSAPIRO ~]$ python
Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import email
>>> email.__version__
'4.0.1'
>>> import sys
>>> sys.path.insert(0, '/cygdrive/f/test-mailman/')
>>> from Mailman import Message
>>> msg = email.message_from_string("""Message-ID: <blah@example.com>
... Content-Type: multipart/mixed; boundary="------------050607040206050605060208"
...
... This is a multi-part message in MIME format.
... --------------050607040206050605060208
... Content-Type: text/plain; charset=UTF-8
... Content-Transfer-Encoding: quoted-printable
...
... Test message containing an attachment with an accented filename
...
... --------------050607040206050605060208
... Content-Type: text/plain; charset=UTF-8;
...  name="=?UTF-8?B?dG9kby1kw6lqZXVuZXIudHh0?="
... Content-Transfer-Encoding: base64
... Content-Disposition: attachment;
...  filename*=UTF-8''%74%6F%64%6F%2D%64%C3%A9%6A%65%75%6E%65%72%2E%74%78%74
...
... VmlhbmRlCk1lbnRoZQpQYWluClZpbgoKQ3Vpc2luZTogcHLDqXBhcmVyIGwnYXDDqXJvLCBj
... b3VwZXIgZXQgZmFpcmUgcmlzc29sZXIgbGVzIHBhdGF0ZXMsIGV0IGZhaXJlIGxlcyBjb29r
... aWVzCg==
... --------------050607040206050605060208--
... """, Message.Message)
>>> msg
From nobody Wed Oct 3 08:43:13 2012
Message-ID: <blah@example.com>
Content-Type: multipart/mixed; boundary="------------050607040206050605060208"

This is a multi-part message in MIME format.
--------------050607040206050605060208
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Test message containing an attachment with an accented filename

--------------050607040206050605060208
Content-Type: text/plain; charset=UTF-8;
 name="=?UTF-8?B?dG9kby1kw6lqZXVuZXIudHh0?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*=UTF-8''%74%6F%64%6F%2D%64%C3%A9%6A%65%75%6E%65%72%2E%74%78%74

VmlhbmRlCk1lbnRoZQpQYWluClZpbgoKQ3Vpc2luZTogcHLDqXBhcmVyIGwnYXDDqXJvLCBj
b3VwZXIgZXQgZmFpcmUgcmlzc29sZXIgbGVzIHBhdGF0ZXMsIGV0IGZhaXJlIGxlcyBjb29r
aWVzCg==
--------------050607040206050605060208--

>>> att = msg.get_payload()[1]
>>> att
From nobody Wed Oct 3 08:43:44 2012
Content-Type: text/plain; charset=UTF-8;
 name="=?UTF-8?B?dG9kby1kw6lqZXVuZXIudHh0?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*=UTF-8''%74%6F%64%6F%2D%64%C3%A9%6A%65%75%6E%65%72%2E%74%78%74

VmlhbmRlCk1lbnRoZQpQYWluClZpbgoKQ3Vpc2luZTogcHLDqXBhcmVyIGwnYXDDqXJvLCBj
b3VwZXIgZXQgZmFpcmUgcmlzc29sZXIgbGVzIHBhdGF0ZXMsIGV0IGZhaXJlIGxlcyBjb29r
aWVzCg==

>>> att.get_filename()
u'todo-d\xe9jeuner.txt'
** Changed in: mailman/2.1
   Importance: Undecided => Medium

** Changed in: mailman/2.1
       Status: New => Incomplete

** Changed in: mailman/2.1
     Assignee: (unassigned) => Mark Sapiro (msapiro)
Sorry, I should have mentioned it: this is with Mailman 3 HEAD.
** No longer affects: mailman/2.1

** No longer affects: mailman/3.0
** Branch linked: lp:~abompard/postorius/bug-1060951

** Branch linked: lp:~abompard/mailman/bug-1060951
** Changed in: mailman
    Milestone: None => 3.0.0b5

** Changed in: mailman
     Assignee: (unassigned) => Barry Warsaw (barry)

** Changed in: mailman
   Importance: Undecided => High

** Changed in: mailman
       Status: New => Fix Committed
** Changed in: mailman
       Status: Fix Committed => Fix Released
participants (3)
- Aurélien Bompard
- Barry Warsaw
- Mark Sapiro