[Mailman-Developers] Re: handling multi-byte characters in templates

Jason R. Mastaler
Fri, 20 Sep 2002 14:29:30 -0600


Tokio Kikuchi <tkikuchi@is.kochi-u.ac.jp> writes:

> Therefore, Japanese messages are best handled as follows:
> 1. Use euc-jp for internal processing of messages and patterns.
> 2. Convert the message charset from iso-2022-jp to euc-jp when it
>     first enters the processing pipeline.
> 3. Convert back to iso-2022-jp when the message goes out.

Thank you for the thorough explanation.  I have a few more questions
about this.

Do you know what people use for Japanese character code conversion
these days in Python?  I see that Mailman seems to be converting the
templates from euc-jp to iso-2022-jp when sending out mail messages,
but I can't figure out where this is being done in the code.
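
To be concrete, the kind of conversion I mean is roughly the
following.  This is only a sketch, assuming a Python in which the
euc-jp and iso-2022-jp codecs are available (for example via a codec
package); to_internal and to_wire are hypothetical names of my own,
not anything taken from Mailman.

```python
# Sketch only: assumes the euc-jp and iso-2022-jp codecs are installed.
# to_internal/to_wire are hypothetical helpers, not Mailman functions.

def to_internal(raw: bytes) -> bytes:
    """Convert an incoming iso-2022-jp body to euc-jp for processing."""
    return raw.decode("iso-2022-jp").encode("euc-jp")


def to_wire(internal: bytes) -> bytes:
    """Convert euc-jp text back to iso-2022-jp before sending."""
    return internal.decode("euc-jp").encode("iso-2022-jp")


# Three katakana characters, written as escapes to keep this file ASCII.
incoming = "\u30c6\u30b9\u30c8".encode("iso-2022-jp")
internal = to_internal(incoming)   # euc-jp bytes, all with the high bit set
outgoing = to_wire(internal)       # iso-2022-jp bytes again
```

Both directions go through a character string in the middle, so the
same two calls would cover the inbound and outbound steps.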

Also, I found the JapaneseCodecs package for Python.  The README says:

  "By using this package, Japanese characters can be treated as a
  character string instead of a byte sequence."

This suggests that if I used JapaneseCodecs, no conversion would be
necessary -- I could just store templates in iso-2022-jp and special
characters like `%' wouldn't interfere.  Does this sound right?
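
Here is a small sketch of the `%' problem as I understand it,
assuming a Python with an iso-2022-jp codec available.  The template
text and the "name" key are made up for illustration.  In the raw
iso-2022-jp bytes every katakana character begins with the byte 0x25,
which is `%', while the decoded character string contains only the
one real placeholder.

```python
# Sketch only: assumes an iso-2022-jp codec is installed.
# The template text and the "name" key are invented for illustration.

# Three katakana characters followed by an ordinary %-style placeholder.
template = "\u30c6\u30b9\u30c8 %(name)s"

# As iso-2022-jp bytes, each katakana character starts with byte 0x25 ('%'),
# so byte-level interpolation would see spurious format markers.
raw = template.encode("iso-2022-jp")
percent_in_bytes = raw.count(b"%")

# As a character string, the only '%' is the real placeholder.
percent_in_text = template.count("%")

# Interpolate on the character string, then encode for the wire.
filled = template % {"name": "mailman"}
wire = filled.encode("iso-2022-jp")
```

If that is right, the templates would still have to be decoded into
character strings when read and encoded again when sent, even if they
were stored on disk as iso-2022-jp.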

Thanks.

-- 
(http://tmda.net/)