Unicode characters in byte-strings

Martin v. Loewis martin at v.loewis.de
Sat Mar 13 00:51:42 CET 2010


Michael Rudolf wrote:
> On 12.03.2010 21:56, Martin v. Loewis wrote:
>> (*) If a source encoding was given, the source is actually recoded to
>> UTF-8, parsed, and then re-encoded back into the original encoding.
> 
> Why is that?

Why is what? That string literals get re-encoded into the source encoding?

> So "unicode"-strings (as in u"string") are not really 
> unicode-, but utf8-strings?

No. Regular string literals in 2.x (those written without the u""
prefix) are byte strings and are stored in the source encoding. The
above procedure applies to these regular strings (see where the "(*)"
goes in my original article).
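
To illustrate the round trip for a regular string literal, here is a
minimal sketch. It only simulates the three steps with Python 3 bytes
objects; it is not the interpreter's actual code. The function name
roundtrip_literal and the choice of latin-1 as the source encoding are
assumptions made for the example.

```python
# Simulates the recode / parse / re-encode round trip described above for a
# Python 2.x regular (non-u"") string literal. Hypothetical helper, not
# CPython's implementation.

def roundtrip_literal(literal_bytes, source_encoding):
    """Return the bytes that end up stored for a regular string literal."""
    # Step 1: the source bytes are recoded to UTF-8 before parsing.
    utf8_bytes = literal_bytes.decode(source_encoding).encode("utf-8")
    # Step 2: parsing would happen here, on the UTF-8 data.
    # Step 3: the literal's contents are re-encoded into the source encoding.
    return utf8_bytes.decode("utf-8").encode(source_encoding)


source_encoding = "latin-1"  # assumed coding declaration of the source file

# The bytes of a-umlaut as they appear in the latin-1 source file.
original = "\u00e4".encode(source_encoding)

# What the parser sees in between: the UTF-8 form, two bytes long.
utf8_form = original.decode(source_encoding).encode("utf-8")

# What is finally stored in the string object: the original single byte.
stored = roundtrip_literal(original, source_encoding)
```

In this sketch the stored bytes equal the bytes in the file, so the
UTF-8 step is not visible in the resulting string object.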

> Need citation plz.

You really want a link to the source code implementing that?

Regards,
Martin


