> There's probably no good, complete answer that can be given in a short
> email post. Basically, there's supposed to be a standard encoding for
> unicode: UTF-8. However, go to google.cn for instance and you'll see that

If this isn't outright wrong, it's at least confusing.  AFAIK, there is no 
single official standard encoding, though I'd be happy to be corrected.  UTF-8 
has become the de facto standard, because it can encode every Unicode code 
point and stays ASCII-compatible without using an absurd number of bytes per 
character (one to four).  There are a number of 
other functionally similar encodings that aren't used all that much: UTF-7, 

> So we have to encode/decode because there is no standard encoding yet.
> That's why GB2312 and all those other bizarro encodings are packed into the
> Python standard library.

As for the need for other encodings, we've got 50 years of legacy documents 
that aren't going to magically transform themselves to UTF-8.
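
To make that concrete, here's a minimal sketch (Python 3) of what dealing 
with one of those legacy documents looks like.  The sample string and the 
variable names are made up for illustration; the only assumption is that you 
already know the bytes were written as GB2312.

```python
# Hypothetical legacy data: the two characters U+4E2D U+6587 ("Chinese
# language") as a GB2312-era system would have stored them on disk.
legacy_bytes = "\u4e2d\u6587".encode("gb2312")

# Bytes carry no label saying which encoding produced them, so guessing
# UTF-8 for GB2312 data fails outright here.
try:
    legacy_bytes.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

# Decode with the codec the document was actually written in ...
text = legacy_bytes.decode("gb2312")

# ... and only then re-encode as UTF-8 for storage or transmission.
utf8_bytes = text.encode("utf-8")

print(decoded_as_utf8)                    # False
print(len(legacy_bytes), len(utf8_bytes)) # 4 6
```

The conversion itself is trivial once you know the source encoding; the hard 
part with old documents is that nothing in the bytes tells you which one it is.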

