[XML-SIG] Unwanted behavior in PrettyPrint: > doesn't round-trip

Mike Brown mike at skew.org
Tue Jul 6 21:16:47 CEST 2004


rmunn at pobox.com wrote:
> <?xml version='1.0' encoding='UTF-8'?>
> <description>This contains a nested &lt;b> tag</description>
> >>> 
> 
> I'd prefer the output to be:
> """<?xml version='1.0' encoding='UTF-8'?>
> <description>This contains a nested &lt;b&gt; tag</description>
> """
> 
> This XML data will eventually be placed in an HTML page and sent
> to the user's browser. Since the > character doesn't close any tags,
> most browsers will probably display it. But with the vast number of
> different browsers out there, with slightly different behavior, I'd
> rather not rely on "probably". :-( I'd prefer for the &gt; entity to
> make it through a round trip (parse to print) untouched.

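No browser will have a problem with an unescaped ">" in text content. XML
itself only requires ">" to be escaped when it appears in the sequence "]]>";
anywhere else in character data it is legal as-is.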

This is one of those situations where paranoia about web browser behavior is 
not supported by reality, much like when people freak out about putting 
"&amp;" in an href.

> Is there any way for me to tell PrettyPrint not to dereference character
> entities?

Dereferencing occurs during parsing, so the DOM never sees "&gt;", only the
character ">". What you want is to be able to customize the serialization
behavior.
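
To make that concrete, here is a minimal sketch. It uses the standard
library's xml.dom.minidom instead of PyXML's xml.dom.ext, purely so the example
is self-contained (that substitution is my assumption, and text_of is a
made-up helper name). Both spellings of the input end up as the same text in
the tree, which is why a printer cannot "preserve" the original one.

```python
from xml.dom.minidom import parseString

# Same document twice: once with "&gt;", once with a bare ">".
escaped = "<description>This contains a nested &lt;b&gt; tag</description>"
unescaped = "<description>This contains a nested &lt;b> tag</description>"


def text_of(source):
    """Hypothetical helper: return the character data under the root element."""
    root = parseString(source).documentElement
    # Join the child nodes in case the parser delivered the text in pieces.
    return "".join(child.data for child in root.childNodes)


text_escaped = text_of(escaped)
text_unescaped = text_of(unescaped)

# Entity references were resolved by the parser in both cases.
print(text_escaped == text_unescaped)
```

Whether ">" comes back out as "&gt;" is therefore entirely up to whatever
serializer you use.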

Runtime modifications to xml.dom.ext.Printer.g_charToEntity don't seem to have 
any effect, so I'd say no, it's not possible. Don't worry about it, IMHO.
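
If the escaped form really matters to you, one way around PrettyPrint is to
write the text out yourself. The following is only a sketch under some
assumptions: it uses the standard library's minidom and
xml.sax.saxutils.escape (which escapes "&", "<" and ">") rather than anything
in xml.dom.ext, serialize_element is a hypothetical helper name, and it handles
only elements, attributes and text nodes.

```python
from xml.dom import Node
from xml.dom.minidom import parseString
from xml.sax.saxutils import escape, quoteattr


def serialize_element(node):
    """Hypothetical helper: serialize elements, attributes and text only."""
    if node.nodeType == Node.TEXT_NODE:
        # escape() turns "&", "<" and ">" into entity references.
        return escape(node.data)
    if node.nodeType != Node.ELEMENT_NODE:
        # Comments, processing instructions, etc. are skipped in this sketch.
        return ""
    attrs = "".join(
        " %s=%s" % (name, quoteattr(value))
        for name, value in node.attributes.items()
    )
    children = "".join(serialize_element(child) for child in node.childNodes)
    return "<%s%s>%s</%s>" % (node.tagName, attrs, children, node.tagName)


doc = parseString("<description>This contains a nested &lt;b> tag</description>")
output = serialize_element(doc.documentElement)
print(output)
```

This does no indentation and emits no XML declaration, so it is not a
replacement for PrettyPrint, just a demonstration that the escaping choice
lives in the serializer.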

