Re: [Mailman-i18n] Re: [Mailman-Developers] [ mailman-Patches-646884 ] HyperArch.py multibyte charset
Here is an example... Before: http://mm.tkikuchi.net/pipermail/mm-test/2002-December/000385.html and after the patch was applied: http://mm.tkikuchi.net/pipermail/mm-test/2002-December/000386.html
You need a Japanese font installed to see the difference.
Utils.uquote() turns a multibyte character into two or more fragments of Latin-1 characters.
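A minimal Python 3 sketch of that fragmenting effect. It assumes uquote replaces each non-ASCII character with an HTML numeric character reference; `uquote_sketch` is a hypothetical stand-in rather than Mailman's actual code, and the EUC-JP sample text is likewise only an illustration.

```python
def uquote_sketch(text):
    """Hypothetical stand-in for Utils.uquote (an assumption, not Mailman code):
    replace every non-ASCII character with a numeric character reference."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


# Two Japanese characters as they might arrive in an EUC-JP message body.
raw = "日本".encode("euc-jp")

# Passing the byte string through is equivalent to treating each byte as its
# own Latin-1 character, so every multibyte character splits into fragments.
fragmented = uquote_sketch(raw.decode("latin-1"))

# Decoding with the real charset first yields one reference per character.
intact = uquote_sketch(raw.decode("euc-jp"))

print(fragmented)  # one reference per byte
print(intact)      # one reference per character
```

The fragmented form contains one reference for each byte of the message, which a browser renders as unrelated Latin-1 glyphs instead of the original text.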
I see. That shows that the bug is actually elsewhere: Utils.uquote is being passed a byte string. This is not supposed to happen, as Utils.uquote only works correctly on Unicode strings.

While the patch is still correct, it only papers over the problem: the output will now be correct only if the message encoding is equal to the list's preferred encoding, since Util.unquote will still receive a byte string. In turn, it will check whether the byte string happens to decode correctly in the list's preferred encoding (which may or may not succeed by coincidence). If decoding succeeds, it will insert the byte string unmodified into the page; if it fails, it will fall back to uquote.

I think the problem really comes from some encodings ignoring the Unicode facilities in Mailman, and being carried through the processing chain. This should be done either correctly (by always accompanying the byte string with its encoding), or not at all (by converting everything to Unicode). This is perhaps a little too much to ask for Mailman 2.1, though.

Regards, Martin
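The fallback behaviour described in this message can be sketched as follows in Python 3. This is an illustration under assumptions, not the patched Mailman code: `render_for_archive` and `uquote_sketch` are hypothetical names, and the three list charsets are chosen only to show the three outcomes.

```python
def uquote_sketch(text):
    """Hypothetical stand-in for Utils.uquote: numeric references for non-ASCII."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


def render_for_archive(data, list_charset):
    """Hypothetical model of the described logic: trial-decode the byte string
    in the list's preferred charset, pass it through unmodified on success,
    and fall back to per-byte quoting on failure."""
    try:
        data.decode(list_charset)
    except UnicodeDecodeError:
        return uquote_sketch(data.decode("latin-1")).encode("ascii")
    return data


raw = "日本".encode("euc-jp")  # message body in EUC-JP

matching = render_for_archive(raw, "euc-jp")      # charsets agree: correct output
failed = render_for_archive(raw, "utf-8")         # decode fails: fragmented references
coincidence = render_for_archive(raw, "latin-1")  # decode "succeeds" but text is wrong
```

The `latin-1` case is the coincidental success: every byte string decodes as Latin-1, so the bytes are inserted unmodified into a page that declares the wrong charset.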
"MvL" == Martin v Löwis <loewis@informatik.hu-berlin.de> writes:
MvL> I think the problem really comes from some encodings ignoring
MvL> the Unicode facilities in Mailman, and being carried through
MvL> the processing chain. This should be done either correctly
MvL> (by always accompanying the byte string with its encoding),
MvL> or not at all (by converting everything to Unicode). This is
MvL> perhaps a little too much to ask for Mailman 2.1, though.

At this late date, yes, probably so. I suspect that for the next version we're going to do something like what Zope 3 is doing, and mandate that all user-visible strings be Unicode throughout.

-Barry
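One way to read the "Unicode throughout" approach is to decode once at the input boundary, using the charset the message declares, so that later stages only ever see Unicode strings. The Python 3 sketch below illustrates that idea only; `to_unicode`, `uquote_sketch`, and the replacement-character fallback policy are all assumptions, not anything Mailman defines.

```python
def to_unicode(data, declared_charset):
    """Hypothetical boundary helper: decode a byte string with the charset
    that accompanies it. The fallback policy (ASCII with replacement
    characters) is an assumption made for this sketch."""
    try:
        return data.decode(declared_charset)
    except (UnicodeDecodeError, LookupError):
        return data.decode("ascii", "replace")


def uquote_sketch(text):
    """Hypothetical stand-in for Utils.uquote: numeric references for non-ASCII."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


# The message charset travels with the bytes and is used exactly once.
body = to_unicode("日本".encode("iso-2022-jp"), "iso-2022-jp")

# Downstream code receives a Unicode string, so quoting works per character
# regardless of the list's preferred encoding.
html = uquote_sketch(body)
```

With this arrangement the archive output no longer depends on whether the message encoding happens to match the list's preferred encoding.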
participants (2)
-
barry@python.org
-
Martin v. Löwis