escaping illegal characters in XML

Mike Brown mike at
Mon Jan 13 04:39:26 CET 2003

"Martin v. Löwis" <martin at> wrote:
> > <?xml version="1.0" ?>
> > <section>
> > <item harvested="blah blah"/>
> > </section>
> Strictly speaking, this document *is* encoded in UTF-8: UTF-8 is the
> default if no encoding= attribute is given.

Strictly speaking, UTF-8 is assumed to be the encoding, regardless of what
the actual encoding is, when:
  - there's no externally-supplied encoding information, and
  - there's no encoding declaration in the prolog, and
  - there's no UTF-16 byte order mark at the beginning of the document.
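
As a rough illustration of that rule, here is a minimal Python sketch of the
decision a parser makes before it has decoded anything. The function name
assumed_encoding and its external_encoding parameter are made up for this
example and are not part of any library. It is also deliberately simplified:
it only looks for a declaration written in an ASCII-compatible encoding, and
it does not attempt the fuller byte-pattern autodetection described in the
XML spec.

```python
import re

# Hypothetical helper for illustration only; not a real library API.
# Matches an encoding declaration in an XML declaration at the very start
# of the document, e.g. <?xml version="1.0" encoding="iso-8859-1"?>
_ENCODING_DECL = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def assumed_encoding(data, external_encoding=None):
    """Return the encoding that would be assumed for the raw bytes `data`."""
    # 1. Externally supplied information (say, a charset from a transport
    #    header) is consulted first.
    if external_encoding:
        return external_encoding
    # 2. A UTF-16 byte order mark at the beginning.
    if data.startswith((b'\xff\xfe', b'\xfe\xff')):
        return 'utf-16'
    # A UTF-8 signature does not change the answer; skip it if present.
    if data.startswith(b'\xef\xbb\xbf'):
        data = data[3:]
    # 3. An encoding declaration in the prolog.
    match = _ENCODING_DECL.match(data)
    if match:
        return match.group(1).decode('ascii')
    # Otherwise UTF-8 is assumed, whatever the bytes actually are.
    return 'utf-8'

doc = b'<?xml version="1.0" ?>\n<section>\n<item harvested="blah blah"/>\n</section>\n'
print(assumed_encoding(doc))  # utf-8
```

Note that the result is an assumption, not a detection. If the same document
had been saved as ISO-8859-1 with a non-ASCII character in the attribute
value, the function would still return 'utf-8', and decoding those bytes as
UTF-8 would fail.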
