"Martin" == Martin v Löwis writes:
    Ben> This is again the fact that many charsets have different
    Ben> names as a Python Unicode codec. It looks like all
    Ben> "windows-foo" charsets need to be mapped to "cpfoo" for the
    Ben> Python Unicode codec.

    Martin> In Python 2.3, this has happened (at least for those known
    Martin> to IANA). For mailman, it may be desirable to provide some
    Martin> of those mappings even in earlier Python versions; see
    Martin> http://sourceforge.net/tracker/?func=detail&aid=538185&group_id=103&atid=300103

Thanks for the patch, Martin. I think we will need something similar
for the Korean Windows charsets, such as the one in all the Korean spam
I get:

    Content-Type: text/html; charset="ks_c_5601-1987"

We will probably also need a general fallback that replaces text in a
completely unknown charset with some safe US-ASCII text. Do you think
you could add this? Say, something like "(text with unknown encoding)".

Ben
-- 
Brought to you by the letters N and E and the number 16.
"Bill Gates is a talented evil man."
Debian GNU/Linux maintainer of Gimp and Nethack -- http://www.debian.org/
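
A minimal sketch of the alias mapping and fallback discussed above. It is
written for modern Python 3 rather than the Python 2.x of this thread, and it
is neither Martin's patch nor Mailman's code. The names EXTRA_ALIASES,
resolve_codec and decode_or_placeholder are hypothetical, and mapping
"ks_c_5601-1987" to cp949 is an assumption about what Windows mailers mean by
that label.

```python
import codecs
import re

# Hypothetical extra aliases for charset labels seen in mail headers.
# Assumption: Windows mailers use "ks_c_5601-1987" to mean cp949.
EXTRA_ALIASES = {
    "ks_c_5601-1987": "cp949",
}

# Safe US-ASCII placeholder, wording taken from the message above.
UNKNOWN_TEXT = "(text with unknown encoding)"


def resolve_codec(charset):
    """Return a Python codec name for a MIME charset label, or None."""
    name = charset.strip().strip('"').lower()
    name = EXTRA_ALIASES.get(name, name)

    # Try the label itself first, then the "windows-foo" -> "cpfoo" form.
    candidates = [name]
    match = re.fullmatch(r"windows-(\d+)", name)
    if match:
        candidates.append("cp" + match.group(1))

    for candidate in candidates:
        try:
            return codecs.lookup(candidate).name
        except LookupError:
            continue
    return None


def decode_or_placeholder(data, charset):
    """Decode bytes, or return the placeholder if the charset is unknown."""
    codec = resolve_codec(charset)
    if codec is None:
        return UNKNOWN_TEXT
    # "replace" keeps going on undecodable bytes instead of raising.
    return data.decode(codec, "replace")
```

For example, decode_or_placeholder(payload, "no-such-charset") returns the
placeholder string instead of raising LookupError, while a "windows-1252"
label resolves to the cp1252 codec.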