Unicode characters in byte-strings
Martin v. Loewis
martin at v.loewis.de
Fri Mar 12 18:51:42 EST 2010
Michael Rudolf wrote:
> On 12.03.2010 21:56, Martin v. Loewis wrote:
>> (*) If a source encoding was given, the source is actually recoded to
>> UTF-8, parsed, and then re-encoded back into the original encoding.
>
> Why is that?
Why is what? That string literals get re-encoded into the source encoding?
> So "unicode"-strings (as in u"string") are not really
> unicode-, but utf8-strings?
No. Regular string literals, in 2.x, are written without the u"" prefix
and are stored in the source encoding. The above procedure applies to
these regular strings (see where the "*" goes in my original article).
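
To illustrate the round trip, here is a rough simulation in Python. It is
only a sketch of the three steps, not the interpreter's actual code, and
recode_roundtrip is a made-up helper name. It assumes a Latin-1 source file
containing a single a-umlaut in a regular string literal.

```python
# Simulation of the round trip described above; not CPython's actual code.
# recode_roundtrip is a hypothetical helper name used only for this sketch.

def recode_roundtrip(literal_bytes, source_encoding):
    # 1. The source bytes are recoded to UTF-8 before parsing.
    as_utf8 = literal_bytes.decode(source_encoding).encode("utf-8")
    # 2. (The parser would work on the UTF-8 form at this point.)
    # 3. A regular string literal is re-encoded into the source encoding.
    stored = as_utf8.decode("utf-8").encode(source_encoding)
    return as_utf8, stored

# Assumed input: a-umlaut as it appears in a Latin-1 source file (one byte).
original = u"\xe4".encode("latin-1")
as_utf8, stored = recode_roundtrip(original, "latin-1")
```

The intermediate UTF-8 form takes two bytes, but the string that ends up
stored is the same single byte that was in the source file.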
> Need citation plz.
You really want a link to the source code implementing that?
Regards,
Martin