Re: [Mailman-Developers] [ mailman-Patches-646884 ] HyperArch.py multibyte charset
Martin, since you are on this list, I am replying to the i18n list.
Comment By: Martin v. Löwis (loewis) Date: 2002-12-07 17:10
Message: Logged In: YES user_id=21627
Tokio, can you please explain the problem in more detail? Utils.uquote will always return 7-bit strings only, potentially with nested HTML character references. Why could this cause a problem in a multi-byte encoding? Can you refer to a page that looks incorrect because of that?
Here is an example.

Before: http://mm.tkikuchi.net/pipermail/mm-test/2002-December/000385.html
After the patch was applied: http://mm.tkikuchi.net/pipermail/mm-test/2002-December/000386.html

You need a Japanese font installed to see the difference.

Utils.uquote() turns a multibyte character into two or more fragments of Latin-1 characters. Utils.uquote() escapes every character with the 8th bit set, while Utils.uncanonstr() checks whether the character is within the charset and escapes it only if the character is not legal.

--
Tokio Kikuchi, tkikuchi@is.kochi-u.ac.jp
http://weather.is.kochi-u.ac.jp/
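The difference between the two escaping strategies can be sketched roughly as follows. This is an illustration in modern Python, not the Mailman source: the function names escape_every_high_byte and escape_only_unencodable are hypothetical stand-ins for the behaviour described for Utils.uquote() and Utils.uncanonstr(), and EUC-JP is assumed as the list charset.

```python
def escape_every_high_byte(data: bytes) -> str:
    """Hypothetical stand-in for the uquote-style behaviour:
    every byte above 127 becomes its own numeric character reference."""
    return "".join(chr(b) if b < 128 else "&#%d;" % b for b in data)


def escape_only_unencodable(text: str, charset: str) -> bytes:
    """Hypothetical stand-in for the uncanonstr-style behaviour:
    characters the charset can represent pass through unchanged,
    only the others become numeric character references."""
    return text.encode(charset, errors="xmlcharrefreplace")


text = "日本"                       # two Japanese characters
raw = text.encode("euc_jp")        # two bytes per character in EUC-JP

# Per-byte escaping: each character is split into two separate
# references, which a browser renders as Latin-1 fragments.
per_byte = escape_every_high_byte(raw)

# Charset-aware escaping: the EUC-JP bytes survive intact.
charset_aware = escape_only_unencodable(text, "euc_jp")

# A character outside the charset (here a Hangul syllable) is still escaped.
mixed = escape_only_unencodable("日本한", "euc_jp")
```

With the per-byte approach, the page carries one reference per byte rather than per character, so the original multibyte characters cannot be recovered by the browser. The charset-aware approach keeps legal characters as they are and falls back to references only when needed.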
"TK" == Tokio Kikuchi <tkikuchi@is.kochi-u.ac.jp> writes:
TK> Utils.uquote() makes a multibyte character into two or more
TK> fragments of Latin-1 characters.
TK> Utils.uquote() makes all the 8bit-set character escaped while
TK> Utils.uncanonstr() checks if the character is within the
TK> charset and escapes only if the character is not legal.

Thanks for the explanation. That seems good enough for me, so I've applied the patch.

-Barry
participants (2): barry@python.org, Tokio Kikuchi