elementtree and gbk encoding
Fredrik Lundh
fredrik at pythonware.com
Wed Mar 15 07:46:15 EST 2006
Diez B. Roggisch wrote:
> Interestingly enough, that does not have to be the case. A document can very well
> be well-formed without a header. The constraints for well-formedness are
> scattered throughout the spec, so I'm not sure what they say about the used
> encoding in absence of a header.
if there's no header, and no external override, the document must use either
UTF-8 or UTF-16, and for UTF-16, a leading byte order mark must be present
(ASCII is of course a subset of UTF-8, but e.g. ISO-8859-1 isn't).
reading
http://www.w3.org/TR/2004/REC-xml-20040204/#sec-guessing
may also help (at least if you read between the lines).
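
a minimal sketch of that default rule, assuming the document really has no
encoding declaration and no external override (such as a transport-level
charset). the helper name default_xml_encoding is made up for illustration
and is not part of elementtree; it also ignores the UTF-32 and EBCDIC cases
that the spec's guessing appendix discusses.

```python
import codecs


def default_xml_encoding(data):
    """Return the encoding the XML rules imply for a byte string that
    carries no encoding declaration and has no external override.

    Hypothetical helper for illustration only. Simplification: a UTF-32
    little-endian BOM also begins with FF FE and would be misreported here.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # UTF-16 is only allowed here when a byte order mark leads the data.
        return "utf-16"
    # Everything else has to be UTF-8 (which covers plain ASCII).
    return "utf-8"


def decodes_as_default(data):
    """True if the bytes can be decoded using the implied default encoding."""
    try:
        data.decode(default_xml_encoding(data))
    except UnicodeDecodeError:
        return False
    return True


ascii_doc = b"<doc>abc</doc>"        # ASCII is a subset of UTF-8
latin1_doc = b"<doc>caf\xe9</doc>"   # ISO-8859-1 bytes, no declaration
utf16_doc = "<doc>abc</doc>".encode("utf-16")  # Python prepends a BOM

print(decodes_as_default(ascii_doc))
print(decodes_as_default(latin1_doc))
print(default_xml_encoding(utf16_doc))
```

the ISO-8859-1 sample fails because a lone 0xE9 byte followed by "<" is not
a valid UTF-8 sequence, so such a document needs an explicit encoding
declaration (or an external override) to be read correctly.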
> Boy, that XML-stuff is always full of surprises - even after so many years
> dealing with it..
a specification written for humans would have saved the world a lot of
confusion...
</F>