i18n pot file
anabell
anabell at sh163a.sta.net.cn
Sun Nov 2 01:37:03 EST 2003
I was unable to use the utf-8 charset: when I run my localized application,
I get this error message:
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1: invalid data
Looking for an alternative, I tried writing a gb2312 codec, which is not
included in the standard Python distribution. I downloaded a gb2312 character
map and put it in a gb2312.py codec file. It worked well (Simplified
Chinese characters are displayed).
I still wonder how I can make utf-8 work, since it is supposed to support any
language. I looked in my python23/lib/encodings/ directory and found that a
utf-8 codec exists there. I edited my .po file to use charset utf_8 and
regenerated its .mo file. When I ran my localized application, Python's
gettext module recognized the charset 'utf8', but the error occurred when it
started decoding the .mo file.
I opened the utf_8.py codec file and found no character map. I wonder if
it's using the wrong map?
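
For what it's worth, UTF-8 is an algorithmic transformation of Unicode code
points, so its codec needs no character map. One plausible cause of the error
above (an assumption, since the .po file itself is not shown) is that the
translated strings were saved as GB2312 bytes while the header declared utf-8.
A minimal sketch of that mismatch and of re-encoding the data follows. It
assumes a current Python 3, where a gb2312 codec ships with the standard
library; recode_to_utf8 is a hypothetical helper name, not part of gettext.

```python
def recode_to_utf8(data: bytes, source_charset: str = "gb2312") -> bytes:
    """Decode bytes from source_charset and re-encode them as UTF-8."""
    return data.decode(source_charset).encode("utf-8")


sample = "中文"
gb_bytes = sample.encode("gb2312")  # what a GB2312 editor would write to disk

# Treating GB2312 bytes as UTF-8 fails, as in the error message above.
try:
    gb_bytes.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

# After re-encoding, the same text decodes cleanly as UTF-8.
utf8_bytes = recode_to_utf8(gb_bytes)
```

The same conversion could be applied to the whole .po file before running
msgfmt, provided the file really is GB2312 throughout.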
> It depends. The best CHARSET to use is UTF-8
> but of course you have to enter UTF-8 data into
> the po file. You can write all languages in UTF-8.
> There are charsets specific to language (groups)
> which you may prefer. For example:
> Big5 for Traditional Chinese
> gb2312 for Simplified Chinese
> ISO-8859-15 for Western European languages
>
> The ENCODING should be 8bit in all cases
>
> The handiest thing to do is to look at examples:
> http://www2.iro.umontreal.ca/~gnutra/registry.cgi?team=zh_CN
>
> Maybe you could even practice on my application :-)
> http://www2.iro.umontreal.ca/~gnutra/registry.cgi?domain=fslint
>
> Pádraig.
> > anabell wrote:
> > Hi, I'm trying to localize to the Chinese language. In the pot file
> > header, there appears:
> >
> > "POT-Creation-Date: Thu Oct 16 17:07:14 2003\n"
> > "PO-Revision-Date: 2003-10-16 HO:MI+ZONE\n"
> > "Last-Translator: Anabell chan <achan at mail.design.com>\n"
> > "Language-Team: LANGUAGE <LL at li.org>\n"
> > "MIME-Version: 1.0\n"
> > "Content-Type: text/plain; charset=CHARSET\n"
> > "Content-Transfer-Encoding: ENCODING\n"
> > "Generated-By: pygettext.py 1.5\n"
> > What should I fill in for 'CHARSET' and 'ENCODING'?
>
>
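
Following the advice quoted above (charset UTF-8, encoding 8bit), a quick way
to confirm that a .po file's bytes really match the charset its header
declares might look like the sketch below. The helper names and the sample
catalog are made up for illustration; the check only reads the charset= field
and tries to decode the whole file with it.

```python
import re


def declared_charset(po_bytes: bytes) -> str:
    """Return the charset named in the Content-Type header of a .po file."""
    match = re.search(rb"charset=([A-Za-z0-9_.:-]+)", po_bytes)
    if match is None:
        raise ValueError("no charset declaration found")
    return match.group(1).decode("ascii")


def matches_declared_charset(po_bytes: bytes) -> bool:
    """True if the file decodes cleanly with the charset it declares."""
    try:
        po_bytes.decode(declared_charset(po_bytes))
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an unfilled placeholder such as CHARSET.
        return False
    return True


# Hypothetical catalog with the header filled in as suggested.
po_text = (
    'msgid ""\n'
    'msgstr ""\n'
    '"Content-Type: text/plain; charset=utf-8\\n"\n'
    '"Content-Transfer-Encoding: 8bit\\n"\n'
    "\n"
    'msgid "Hello"\n'
    'msgstr "你好"\n'
)

saved_as_utf8 = po_text.encode("utf-8")     # header and data agree
saved_as_gb2312 = po_text.encode("gb2312")  # header says utf-8, data is not
```

Here matches_declared_charset(saved_as_utf8) holds, while the GB2312 copy
fails the check even though its header text is identical.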