Python 3.3, gettext and Unicode problems

Marcel Rodrigues marcelgmr at
Mon Dec 31 01:39:21 CET 2012

I'm using Python 3.3 (CPython) and am having trouble getting the standard
gettext module to handle Unicode messages.
My problem can be isolated as follows:

I have 3 files in a folder:, greeting.po and

-- --
import gettext

t = gettext.translation("greeting", "locale", ["pt"])
_ = t.lgettext

print("_charset = {0}\n".format(t._charset))
print(_("hello"))
-- EOF --

-- greeting.po --
msgid ""
msgstr ""
"Project-Id-Version: 1.0\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "hello"
msgstr "olá"
-- EOF --

was downloaded from, since
this tool apparently isn't included in the python3 package available on
Arch Linux official repositories.

It's probably also worth noting that the file greeting.po is itself encoded
as UTF-8.
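
As a quick sanity check of that claim, here is a minimal sketch of how the
encoding could be verified. The helper name is_valid_utf8 and the in-memory
sample are only illustrative (they are not part of my files); in practice the
bytes would come from reading greeting.po in binary mode.

```python
# Illustrative sample: the translated entry from greeting.po, as text.
po_entry = 'msgid "hello"\nmsgstr "ol\u00e1"\n'

# The same entry as it would be stored on disk in a UTF-8 encoded file.
raw_utf8 = po_entry.encode("utf-8")

# For comparison: the same entry stored as Latin-1 instead.
raw_latin1 = po_entry.encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(raw_utf8))
print(is_valid_utf8(raw_latin1))
```

In UTF-8 the character 'á' is stored as the two bytes C3 A1, while in Latin-1
it is the single byte E1, which is not a valid UTF-8 sequence on its own.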

From that folder, I run the following commands:

$ mkdir -p locale/pt/LC_MESSAGES
$ python -o !$/ greeting.po
$ python

The output is:
_charset = UTF-8

Traceback (most recent call last):
  File "", line 7, in <module>
  File "/usr/lib/python3.3/", line 314, in lgettext
    return tmsg.encode(locale.getpreferredencoding())
UnicodeEncodeError: 'ascii' codec can't encode character '\xe1' in position 2: ordinal not in range(128)

My interpretation of this output is that even though gettext correctly
detects the MO file charset as UTF-8, it tries to encode the translated
message with the system's "preferred encoding", which happens to be ASCII.
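
To illustrate that interpretation, here is a minimal sketch that mimics the
line shown in the traceback, tmsg.encode(locale.getpreferredencoding()),
without going through gettext at all. The helper encode_like_lgettext is
hypothetical, and the encodings are passed explicitly so that the result does
not depend on the local system.

```python
import locale

# The translated message as gettext holds it internally: a str.
tmsg = "ol\u00e1"


def encode_like_lgettext(message: str, encoding: str) -> bytes:
    """Hypothetical stand-in for the encode step seen in the traceback."""
    return message.encode(encoding)


# With an ASCII preferred encoding, the non-ASCII character cannot be encoded.
failure = None
try:
    encode_like_lgettext(tmsg, "ascii")
except UnicodeEncodeError as exc:
    failure = exc

# With a UTF-8 preferred encoding, the same call succeeds and returns bytes.
utf8_bytes = encode_like_lgettext(tmsg, "utf-8")

print("failure:", failure)
print("utf-8 bytes:", utf8_bytes)
# What this system reports; this value is what the traceback line passes in.
print("preferred encoding here:", locale.getpreferredencoding())
```

If this reading is right, the charset declared in the MO file only matters for
decoding the catalog, and the error comes purely from the final encode step
using the locale's preferred encoding.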

Does anyone know why this happens? Is this a bug in my code? Maybe I have
misunderstood gettext...



More information about the Python-list mailing list