> Dieter Maurer <dieter at handshake.de> writes:
> > XML processing systems (such as the parser "Expat") are only required
> > to support UTF-8 (and maybe UTF-16) encodings. All other
> > encodings are optional.
> > 
> > Apparently, "Expat" does not support "gb2312".
> Correct. pyexpat only supports single-byte encodings, and UTF-8. If
> you want to use other encodings with PyXML, you will have to use
> xmlproc.

FWIW, 4Suite's Domlettes extend Expat's built-in decoders by falling back to 
the Python codec for encodings Expat does not know.  So as long as you have a 
Python codec named "gb2312" installed, your document should parse OK.
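
To check the codec side of this yourself, here is a rough sketch. It is not
4Suite's code, only the same idea done by hand with the standard library, and
it assumes Python 3. The helper names has_codec and parse_with_python_codec
are made up for illustration.

```python
import codecs
from xml.parsers import expat


def has_codec(name):
    """Return True if Python has a codec registered under this name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def parse_with_python_codec(data, encoding):
    """Decode bytes with a Python codec, then let Expat parse the text.

    Hypothetical helper: returns the concatenated character data only.
    """
    text = data.decode(encoding)
    chunks = []
    parser = expat.ParserCreate()
    # Character data may arrive in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    # Given a str, pyexpat hands Expat UTF-8 and overrides the encoding
    # named in the XML declaration.
    parser.Parse(text, True)
    return "".join(chunks)


# A small gb2312-encoded document, built here only for the demonstration.
doc = '<?xml version="1.0" encoding="gb2312"?><doc>\u4e2d\u6587</doc>'.encode("gb2312")
```

If has_codec("gb2312") comes back true, parse_with_python_codec(doc, "gb2312")
should hand you the text content as a Unicode string.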

If you want to give it a try, see


for intros to Domlette.

