Mailman 3 Re: [Python-Dev] Allowing u.encode() to return non-strings - Python-Dev

29 Jun 2004

[Bill Janssen]
> ...
> Tim, do I understand then that Unicode strings have an implicit
> character encoding, but non-Unicode strings do not?

An 8-bit string is a sequence of 8-bit bytes.  If those bytes are to
"mean something", you have to supply the meaning, or use them in a
context that supplies a specific meaning for you.  This seems nearly
impossible for an American to understand, but non-Americans appear to
know it at birth (if not earlier).
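
A minimal sketch of that point, written against Python 3 rather than the
Python of this thread (an assumption: there, `bytes` plays the role of the
8-bit string). The byte value and the three codecs are picked only for
illustration; they are not from the original message.

```python
# Python 3 sketch: `bytes` stands in for the 8-bit string discussed above.
raw = bytes([0xE9])  # a single 8-bit byte; by itself it is just the number 233

# The same byte "means" a different character under each encoding we supply.
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")

# Under UTF-8, this lone byte is not a valid sequence at all.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(repr(as_latin1), repr(as_cp437), valid_utf8)
```

Nothing in `raw` says which of these readings is intended; the caller
has to supply it.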

A Unicode string is, at least in theory, a sequence of Unicode
characters, the latter defined in excruciating detail by the Unicode
Consortium.  There's no conventional sense in which a Unicode string
is an encoding of something other than exactly itself, but you could
certainly make one up.
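
Again as a rough Python 3 sketch (an assumption: `str` there corresponds
to the Unicode string described here), the string is just its characters,
and an encoding only enters when you ask for bytes. The sample character
and codecs are illustrative choices.

```python
# Python 3 sketch: `str` stands in for the Unicode string discussed above.
text = "\u00e9"  # one Unicode character, LATIN SMALL LETTER E WITH ACUTE

length = len(text)       # counted in characters, not bytes
code_point = ord(text)   # the character's Unicode code point

# Encodings appear only when converting to 8-bit data, and they differ.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# Decoding each with the matching codec gives back the same string.
round_trip_ok = (
    utf8_bytes.decode("utf-8") == text
    and latin1_bytes.decode("latin-1") == text
)

print(length, hex(code_point), utf8_bytes, latin1_bytes, round_trip_ok)
```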

Tim Peters
