
Sept. 11, 2002
1:23 p.m.
Guido van Rossum <guido@python.org> writes:
One thing to watch out for: I believe that the bit pattern that's encoded is not the bit pattern of the full Unicode character, but 2**16 less. This allows one to encode 2**16 more characters, at the cost of some extra complexity.
Correct. That makes it possible to encode a total of 17 planes in Unicode, a plane being 2**16 characters (2**16 + 2**20 = 17 * 2**16). Therefore, saying that Unicode is 20 bits is somewhat imprecise; it's better to say that it is 21 bits, since the largest code point, 0x10FFFF, needs 21 bits.

Regards,
Martin
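
To make the arithmetic above concrete, here is a minimal Python sketch of the "2**16 less" encoding as UTF-16 surrogate pairs do it. The function names to_surrogates and from_surrogates are made up for this illustration and are not part of any library; the constants 0xD800 and 0xDC00 are the standard surrogate bases.

```python
def to_surrogates(cp):
    """Split a code point above the base plane into a (high, low) pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point outside U+10000..U+10FFFF")
    offset = cp - 0x10000              # the "2**16 less" step; fits in 20 bits
    high = 0xD800 | (offset >> 10)     # top 10 bits of the offset
    low = 0xDC00 | (offset & 0x3FF)    # bottom 10 bits of the offset
    return high, low


def from_surrogates(high, low):
    """Inverse of to_surrogates: rebuild the code point from a pair."""
    offset = ((high - 0xD800) << 10) | (low - 0xDC00)
    return offset + 0x10000            # add the 2**16 back


# 2**20 code points reachable through pairs, plus the 2**16 of the base plane.
TOTAL_CODE_POINTS = 2**20 + 2**16
PLANES = TOTAL_CODE_POINTS // 2**16    # 17
MAX_BITS = (0x10FFFF).bit_length()     # 21
```

For example, to_surrogates(0x1F600) gives (0xD83D, 0xDE00), which matches the bytes produced by chr(0x1F600).encode("utf-16-be").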