writing Unicode objects to XML
Steven Taschuk
staschuk at telusplanet.net
Mon May 5 19:52:46 EDT 2003
Quoth Martin v. Löwis:
> Steven Taschuk <staschuk at telusplanet.net> writes:
[whether it is possible "in XML" to specify '&#97;' instead of 'a',
and to distinguish the two]
> > A nit: whether this is true is a property of one's XML tools, not
> > a property of XML itself. It is easy to imagine XML writers with
> > all sorts of policies about character encoding. (See below.)
>
> Well, no. There is a notion of the "XML Information Set", see
[...]
> The information of a character information item does *not* indicate
> whether the character was encoded in its source encoding, or using
> a character reference. "Not being part of the XML infoset" is really
> the same thing as "no way in XML".
Our disagreement is in this last sentence. XML is not just the
infoset; it is also a syntax by which the information in the
infoset is (de)serialized. And at that level, there is indeed a
way to specify and distinguish numeric character references and
literal characters. For example, I can and sometimes do write XML
in vi, and specify what I want directly.
The infoset is an abstraction layer; but XML is octets too.
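
To illustrate both sides of the point, here is a minimal sketch using the standard library's xml.etree.ElementTree (my choice for the example, not a tool named in this thread). The two sample documents and the variable names are made up for illustration. The serialized forms differ as octets, yet the parser reports the same character for both, so the distinction exists in the syntax but not in the parsed information.

```python
import xml.etree.ElementTree as ET

# Two hypothetical serializations of the same one-character document.
literal = b"<doc>a</doc>"          # character written directly
reference = b"<doc>&#97;</doc>"    # same character as a numeric character reference

# At the syntax level these are different octet sequences.
octets_differ = literal != reference

# After parsing, the character data is identical; the parser does not
# report which form was used in the source.
text_literal = ET.fromstring(literal).text
text_reference = ET.fromstring(reference).text
texts_match = text_literal == text_reference

print("octets differ:", octets_differ)
print("parsed text matches:", texts_match)
```

Anyone who needs to control which form appears in the output has to do so in the writer (or by hand, in an editor), since the parsed tree carries no record of it.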
--
Steven Taschuk staschuk at telusplanet.net
"Telekinesis would be worth patenting." -- James Gleick