[I18n-sig] ugettext charset

Sat Jun 26 20:37:36 CEST 2010

> I'm using Python 2.6.5 and gettext. Currently ugettext() and ungettext()
> don't respect the 'codeset' setting

Of course not. It returns Unicode strings instead.

> and return only ASCII encoded strings.

I can't reproduce that. It certainly returns non-ASCII strings.
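
As a minimal sketch of how this can be checked without any files on disk:
the snippet below packs a tiny catalog into the GNU .mo layout in memory and
looks up one message. The build_mo() helper and the "Hello" / "Grüß Gott"
entry are made up for illustration; they are not from the original report.
The getattr() fallback is only there so the same code also runs on Python 3,
where ugettext() no longer exists and gettext() returns text directly.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Hypothetical helper: pack a {msgid: msgstr} dict of byte strings
    into the binary GNU .mo layout that GNUTranslations reads."""
    keys = sorted(messages)          # the format expects sorted msgids
    ids = b""
    strs = b""
    spans = []
    for key in keys:
        value = messages[key]
        spans.append((len(key), len(ids), len(value), len(strs)))
        ids += key + b"\0"
        strs += value + b"\0"
    count = len(keys)
    key_table = 7 * 4                # index tables start right after the header
    value_table = key_table + 8 * count
    ids_start = value_table + 8 * count
    strs_start = ids_start + len(ids)
    # magic, version, count, msgid table, msgstr table, hash size, hash offset
    out = struct.pack("<7I", 0x950412DE, 0, count, key_table, value_table, 0, 0)
    for klen, koff, vlen, voff in spans:
        out += struct.pack("<2I", klen, ids_start + koff)
    for klen, koff, vlen, voff in spans:
        out += struct.pack("<2I", vlen, strs_start + voff)
    return out + ids + strs


expected = u"Gr\xfc\xdf Gott"        # contains two non-ASCII characters
catalog = build_mo({
    # The empty msgid carries the metadata, including the catalog charset.
    b"": b"Content-Type: text/plain; charset=UTF-8\n",
    b"Hello": expected.encode("utf-8"),
})

t = gettext.GNUTranslations(io.BytesIO(catalog))
ugettext = getattr(t, "ugettext", t.gettext)   # Python 3 has only gettext()
translated = ugettext("Hello")
print(repr(translated))
```

The catalog stores the translation as UTF-8 bytes, but the lookup hands back
a decoded unicode object that still contains the non-ASCII characters.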

> Is it by design or is it a bug?

I think you misinterpret what you are seeing (although it's not really
clear what it is that you are seeing). AFAICT, the current behavior is
by design.

> This breaks some things, because, ASCII encoded unicode strings

This doesn't make sense. Unicode strings *cannot* be ASCII-encoded.
They are always Unicode-encoded - that's why they are called unicode
strings.

> are not
> considered equivalent to unicode strings in different encodings even if
> they contain exactly the same characters.

Unicode strings don't have different encodings. They are encoded in
Unicode.
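
A small sketch of that point, with example byte strings chosen purely for
illustration: the same text decoded from two different byte encodings
yields unicode objects that compare equal, because the source encoding is
not kept once the bytes have been decoded.

```python
# The same two characters (u-umlaut, sharp s) stored in two byte encodings.
latin1_bytes = b"Gr\xfc\xdf"
utf8_bytes = b"Gr\xc3\xbc\xc3\x9f"

from_latin1 = latin1_bytes.decode("latin-1")
from_utf8 = utf8_bytes.decode("utf-8")

# The byte strings differ, but the decoded unicode strings are identical.
bytes_differ = latin1_bytes != utf8_bytes
text_equal = from_latin1 == from_utf8
print(bytes_differ, text_equal)
```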

> And unicode() function by
> default returns ASCII encoded strings. In this case it should get an
> argument for encoding.

The call to unicode() only applies to the msgid, which ugettext() returns
when it finds no translation; it is never applied to the translation.
This should be safe, since the msgid will only contain ASCII characters.
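
To illustrate why that is safe, here is a rough sketch. It assumes the
interpreter's default encoding is ASCII (the Python 2 default), in which
case unicode(msgid) behaves like msgid.decode("ascii"). The explicit
decode() calls below stand in for unicode() so the snippet also runs on
Python 3; the sample strings are invented for the example.

```python
# An ASCII-only msgid decodes cleanly with the ASCII codec.
msgid = b"Hello"
as_unicode = msgid.decode("ascii")

# A byte string with non-ASCII bytes would fail the same conversion,
# which is why msgids are expected to stay within ASCII.
non_ascii = u"Gr\xfc\xdf".encode("utf-8")
try:
    non_ascii.decode("ascii")
    failed = False
except UnicodeDecodeError:
    failed = True

print(repr(as_unicode), failed)
```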

Regards,
Martin