
M.-A. Lemburg writes:
Unicode has many encodings: Shift-JIS, Big-5, EBCDIC ... You can use 8-bit encodings of Unicode if you want.
This is meaningless: legacy encodings of national character sets such as Shift-JIS, Big Five, GB2312, or TIS620 are not "encodings" of Unicode. TIS620 is a single-byte, 8-bit encoding: each character is represented by a single byte. The Japanese and Chinese encodings are multibyte, 8-bit encodings. ISO-2022 is a multibyte, 7-bit encoding for multiple character sets. Unicode has several possible encodings: UTF-8, UCS-2, UCS-4, UTF-16... You can view all of these as 8-bit encodings, if you like. Some are variable length (such as UTF-8, where each character in the BMP is represented in 1 to 3 bytes and characters beyond it in 4, or UTF-16, which uses one or two 16-bit units per character) while others are fixed length: two bytes per character for UCS-2, four for UCS-4.
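To make the fixed-length versus variable-length distinction concrete, here is a minimal Python sketch that counts the bytes each encoding needs for a single character. The helper name `encoded_length` and the sample constants are hypothetical, chosen only for illustration; the codec names are the ones Python's standard library registers.

```python
# Illustrative sketch: count the bytes an encoding uses for one character.
# encoded_length and the constants below are hypothetical names.

def encoded_length(ch: str, codec: str) -> int:
    """Return how many bytes `codec` needs to represent the string `ch`."""
    return len(ch.encode(codec))


# Sample characters, written as escapes so the source stays pure ASCII.
LATIN_A = "\u0041"      # LATIN CAPITAL LETTER A
THAI_KO_KAI = "\u0e01"  # THAI CHARACTER KO KAI
KANJI_NICHI = "\u65e5"  # CJK ideograph for "sun" / "day"
EMOJI = "\U0001f600"    # a character outside the BMP

if __name__ == "__main__":
    samples = [
        ("A", LATIN_A),
        ("Thai", THAI_KO_KAI),
        ("Kanji", KANJI_NICHI),
        ("Emoji", EMOJI),
    ]
    # Encodings of Unicode: every sample can be represented in each of them.
    for label, ch in samples:
        for codec in ("utf-8", "utf-16-be", "utf-32-be"):
            print(label, codec, encoded_length(ch, codec))
    # Legacy national encodings: each covers only its own character set.
    print("Thai", "tis_620", encoded_length(THAI_KO_KAI, "tis_620"))
    print("Kanji", "shift_jis", encoded_length(KANJI_NICHI, "shift_jis"))
```

The explicit big-endian codec names are used so that no byte order mark is added to the count.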
Um, if you go:
JIS -> Unicode -> JIS
you don't get the same thing out that you put in (at least this is what I've been told by a lot of Japanese developers), and therefore it's not terribly popular because of the nature of the Japanese (and Chinese) language.
This is simply not true any more. Whether you can round trip between Unicode and a legacy encoding depends on the software: mapping characters that have no Unicode equivalent to code points in the Private Use Area (PUA) is acceptable and commonly done. The big advantage is in using Unicode as a pivot when transcoding between different CJK encodings. It is very difficult to map between, say, Shift-JIS and GB2312 directly. However, Unicode provides a good go-between. It isn't a panacea: transcoding between legacy encodings like GB2312 and Big Five is still difficult, Unicode or not.
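As a rough illustration of the pivot idea, the following Python sketch decodes legacy bytes to Unicode and re-encodes them in another legacy encoding. The function names `transcode` and `can_transcode` and the constant `NIHON` are hypothetical; the sketch relies on the mapping tables that ship with Python's standard codecs, which may differ from the tables other software uses.

```python
# Illustrative sketch: Unicode as a pivot between two legacy encodings.
# transcode, can_transcode and NIHON are hypothetical names.

def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Decode `data` from `src` into Unicode, then encode it as `dst`."""
    text = data.decode(src)   # legacy bytes -> Unicode string
    return text.encode(dst)   # Unicode string -> other legacy bytes


def can_transcode(data: bytes, src: str, dst: str) -> bool:
    """Report whether every character survives the trip through Unicode."""
    try:
        transcode(data, src, dst)
    except UnicodeError:
        # Raised when the bytes are invalid in `src` or a character
        # has no representation in `dst`.
        return False
    return True


# Two ideographs (U+65E5 U+672C, "Japan") present in both character sets.
NIHON = "\u65e5\u672c"

if __name__ == "__main__":
    sjis = NIHON.encode("shift_jis")
    # Round trip: Shift-JIS -> Unicode -> Shift-JIS.
    print(sjis.decode("shift_jis").encode("shift_jis") == sjis)
    # Pivot: Shift-JIS -> Unicode -> GB2312.
    gb = transcode(sjis, "shift_jis", "gb2312")
    print(gb.decode("gb2312") == NIHON)
```

The pivot only works for characters that exist in both repertoires. When the target set lacks a character, `can_transcode` returns False, which is the residual difficulty with pairs like GB2312 and Big Five.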
My experience with Unicode is that a lot of Western people think it's the answer to every problem asked, while most Asian language people disagree vehemently. This says the problem isn't solved yet, even if people wish to deny it.
This is a shame: it is an indication that they don't understand the technology. Unicode is a tool: nothing more.

    -tree

--
Tom Emerson
Basis Technology Corp.
Language Hacker
http://www.basistech.com
"Beware the lollipop of mediocrity: lick it once and you suck forever"