[I18n-sig] ugettext charset

"Martin v. Löwis" martin at v.loewis.de
Sat Jun 26 20:37:36 CEST 2010


> I'm using Python 2.6.5 and gettext. Currently ugettext() and ungettext()
> don't respect the 'codeset' setting

Of course not. It returns Unicode strings instead.

> and return only ASCII encoded strings.

I can't reproduce that. It certainly returns non-ASCII strings.
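
As a rough sketch of how one might check this: the snippet below builds a tiny
UTF-8 catalog in memory and looks up one message. The build_mo() helper and the
"Hello" entry are made up for the illustration. The getattr() fallback is only
there so the same code also runs on Python 3, where ugettext() no longer exists
and gettext() already returns Unicode.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Hypothetical helper: pack {msgid_bytes: msgstr_bytes} as GNU .mo data."""
    items = sorted(messages.items())
    count = len(items)
    ids = b"".join(key + b"\0" for key, _ in items)
    strs = b"".join(value + b"\0" for _, value in items)
    id_table = 7 * 4                    # tables start right after the header
    str_table = id_table + 8 * count
    ids_start = str_table + 8 * count
    strs_start = ids_start + len(ids)
    # magic, version, string count, two table offsets, empty hash table
    out = struct.pack("<7I", 0x950412DE, 0, count, id_table, str_table, 0, 0)
    offset = ids_start
    for key, _ in items:                # (length, offset) for each msgid
        out += struct.pack("<II", len(key), offset)
        offset += len(key) + 1
    offset = strs_start
    for _, value in items:              # (length, offset) for each msgstr
        out += struct.pack("<II", len(value), offset)
        offset += len(value) + 1
    return out + ids + strs


catalog = {
    # The empty msgid carries the catalog header, including the charset.
    b"": b"Content-Type: text/plain; charset=UTF-8\n",
    b"Hello": u"Gr\u00fc\u00dfe".encode("utf-8"),
}
translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))

# Python 2 has ugettext(); fall back to gettext() on Python 3 (assumption:
# only needed to keep this sketch runnable on both).
ugettext = getattr(translations, "ugettext", translations.gettext)
result = ugettext("Hello")
print(repr(result))
```

The result is a Unicode string containing non-ASCII characters. It was decoded
using the charset named in the catalog header, independently of any output
codeset.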

> Is it by design or is it a bug?

I think you misinterpret what you are seeing (although it's not really
clear what it is that you are seeing). AFAICT, the current behavior is
by design.

> This breaks some things, because, ASCII encoded unicode strings

This doesn't make sense. Unicode strings *cannot* be ASCII-encoded.
They are always Unicode-encoded - that's why they are called unicode
strings.

> are not
> considered equivalent to unicode strings in different encodings even if
> they contain exactly the same characters.

Unicode strings don't have different encodings. They are encoded in
Unicode.
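
A small sketch of the distinction, using an arbitrary sample string: encodings
only come into play when a Unicode string is converted to bytes. The byte
strings differ per encoding, but they decode back to the same Unicode string.

```python
text = u"L\u00f6wis"  # a Unicode string; it carries no encoding of its own

as_utf8 = text.encode("utf-8")      # two bytes for the o-umlaut
as_latin1 = text.encode("latin-1")  # one byte for the o-umlaut

# The encoded byte strings are different...
bytes_differ = as_utf8 != as_latin1

# ...but decoding each with its own codec gives equal Unicode strings.
from_utf8 = as_utf8.decode("utf-8")
from_latin1 = as_latin1.decode("latin-1")
same_text = from_utf8 == from_latin1 == text

print(bytes_differ)
print(same_text)
```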

> And unicode() function by
> default returns ASCII encoded strings. In this case it should get an
> argument for encoding.

The call to unicode only applies to the msgid, not the translation.
This should be safe, since the msgid will only contain ASCII characters.
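
To illustrate why ASCII-only msgids are safe here: in Python 2, unicode(s)
without an encoding argument decodes s with the default codec, which is ASCII
unless the default has been changed. The sketch below spells that decode out
explicitly so it also runs on Python 3. The sample strings are invented for
the example.

```python
msgid = b"File not found"                          # ASCII-only msgid bytes
translated = u"Gr\u00fc\u00dfe".encode("utf-8")    # non-ASCII translation bytes

# Decoding an ASCII-only msgid with the ASCII codec always succeeds.
msgid_text = msgid.decode("ascii")

# The same decode would fail for non-ASCII bytes, which is why it matters
# that only the msgid goes through this path.
try:
    translated.decode("ascii")
    ascii_decode_failed = False
except UnicodeDecodeError:
    ascii_decode_failed = True

print(msgid_text)
print(ascii_decode_failed)
```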

Regards,
Martin

