
June 29, 2004
12:53 a.m.
[Bill Janssen]
Tim, do I understand then that Unicode strings have an implicit character encoding, but non-Unicode strings do not?
An 8-bit string is a sequence of 8-bit bytes. If those bytes are to "mean something", you have to supply the meaning, or use them in a context that supplies a specific meaning for you. This seems nearly impossible for an American to understand, but non-Americans appear to know it at birth (if not earlier). A Unicode string is, at least in theory, a sequence of Unicode characters, the latter defined in excruciating detail by the Unicode Consortium. There's no conventional sense in which a Unicode string is an encoding of something other than exactly itself, but you could certainly make one up.
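
The distinction can be sketched in a few lines. This is only an illustration, and it assumes Python 3, where `bytes` plays the role of the 8-bit string and `str` is the Unicode string; the variable names are made up for the example.

```python
# Two bytes with no meaning attached yet.
raw = b"\xc3\xa9"

# Supplying an encoding is what gives the bytes a meaning,
# and different encodings give different meanings.
as_utf8 = raw.decode("utf-8")      # one character: U+00E9
as_latin1 = raw.decode("latin-1")  # two characters: U+00C3, U+00A9

# A Unicode string is a sequence of characters; it only becomes
# bytes once you pick an encoding, and the bytes depend on the choice.
text = "\u00e9"
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

print(len(as_utf8), len(as_latin1))
print(utf8_bytes, utf16_bytes)
```

The same byte sequence reads as one character under UTF-8 and two under Latin-1, while the single Unicode character has a different byte form under each encoding.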