[Mailman-i18n] Subject lines in Archives
Martin v. Löwis
loewis@informatik.hu-berlin.de
31 Mar 2002 17:47:35 +0200
Ben Gertzfield <che@debian.org> writes:
> Martin> If
> Martin> conversion fails, HTML character references are
> Martin> emitted.
> I'm a bit confused. How exactly do you propose converting non-Latin
> encoded text to Latin? Since it cannot ever be converted, are you
> going to emit Unicode HTML character references?
Yes, that's what I said, and that's what it does.
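For illustration, here is a minimal sketch of that fallback in present-day Python 3. It is not the Mailman code itself; the function name to_latin1_html is hypothetical, and it assumes the standard "xmlcharrefreplace" error handler is available.

```python
def to_latin1_html(text):
    """Encode text as Latin-1 for an HTML page.

    Characters that Latin-1 cannot represent are replaced by
    numeric HTML character references (&#NNNN;).
    """
    # Encode with the fallback handler, then decode again so the
    # result is text that can be written into the page.
    return text.encode("latin-1", "xmlcharrefreplace").decode("latin-1")


print(to_latin1_html("Löwis"))    # Löwis
print(to_latin1_html("日本語"))    # &#26085;&#26412;&#35486;
```

Text that fits into Latin-1 passes through unchanged; everything else still displays correctly in a browser, because the references name Unicode code points.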
> Also, what do you do to map charsets to Python Unicode codecs?
Through codecs.lookup (actually, just unicode(str, encoding), which
performs the same lookup internally).
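As a rough sketch of the two routes, written for Python 3, where bytes.decode plays the role of unicode(str, encoding): the helper name decode_with_charset is hypothetical and only serves the illustration.

```python
import codecs


def decode_with_charset(raw, charset):
    """Decode raw bytes using whatever codec is registered for charset."""
    # codecs.lookup raises LookupError if no codec is known by that name.
    info = codecs.lookup(charset)
    text, _consumed = info.decode(raw)
    return text


raw = b"L\xf6wis"
# The explicit lookup and the shortcut give the same result.
print(decode_with_charset(raw, "ISO-8859-1"))
print(raw.decode("ISO-8859-1"))
```

The shortcut is normally enough; the explicit lookup is mainly useful when you want to check up front whether a charset name is known at all.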
> They're not one-to-one; for example, ISO-2022-JP goes to
> japanese.iso-2022-jp.
That is actually a bug in the Japanese codecs package; it ought to
register its own lookup function instead of relying on the default
lookup function. If that bug is not fixed, modifying
encodings.aliases.aliases might be appropriate.
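A minimal sketch of what registering a lookup function looks like, using codecs.register from the standard library. The alias "x-demo-latin" and the table _EXTRA_ALIASES are made up for the example; a real package would map its charset names to its own codecs.

```python
import codecs

# Hypothetical alias table, for illustration only.
_EXTRA_ALIASES = {"x_demo_latin": "latin-1"}


def search(name):
    """Search function: return a CodecInfo for known aliases, else None."""
    # Normalize ourselves, since the exact form of the name passed in
    # (hyphens vs. underscores) differs between Python versions.
    target = _EXTRA_ALIASES.get(name.lower().replace("-", "_"))
    if target is None:
        return None  # let other search functions try
    return codecs.lookup(target)


codecs.register(search)

print(b"L\xf6wis".decode("x-demo-latin"))
```

Once the search function is registered, the charset name from the message header can be handed to the decoder as-is, with no translation table in the calling code.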
Regards,
Martin