[XML-SIG] XML 0.5.1 bug: 'amp' character reference not handled correctly by "HtmlBuilder/HtmlWriter"
Fred L. Drake, Jr.
"Fred L. Drake, Jr." <fdrake@acm.org>
Fri, 13 Aug 1999 09:59:28 -0400 (EDT)
--Apu33M+PUU
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit
Dieter Maurer writes:
> "HtmlBuilder" translates '&' into an entity reference.
> This does not follow the DOM spec. It specifies that
> character references are expected to be expanded by the
> HTML/XML processor.
>
> "XmlWriter/HtmlWriter" does not output the 'amp' entity reference.
> This, obviously, is a bug in "XmlWriter/HtmlWriter".
No, but if & is present as data, it writes out &amp;, so I think
that's OK.
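To illustrate the behaviour described here, a minimal sketch: the text
held in the tree contains a literal '&', and escaping happens only when
the data is written out. The helper name write_text is hypothetical and
is not part of xml/dom/writer.py; the sketch uses the standard library's
xml.sax.saxutils.escape rather than the writer's own code.

```python
import io
from xml.sax.saxutils import escape


def write_text(stream, data):
    # Hypothetical helper (not the writer.py API): escape the
    # markup-significant characters '&', '<' and '>' on output only.
    stream.write(escape(data))


buf = io.StringIO()
write_text(buf, "AT&T <rocks>")   # the data itself holds a bare '&'
result = buf.getvalue()
```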
> By the way, processing instructions are not output, too.
Are you sure they're in your tree?  What I see is that they are
output, but using the XML-style syntax: <?foo bar?> instead of
<?foo bar>.
I've checked in a fix that allows HtmlWriter to produce SGML-style
PIs. This *doesn't* do anything to change the handling of PIs as
(target, value) tuples; this was a concept introduced in some of the
XML APIs (not even XML itself as I understand it).
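As a rough sketch of the difference between the two PI syntaxes, the
function below formats a (target, value) pair either way. The name
format_pi and its sgml_style flag are assumptions made for illustration
only; they do not appear in writer.py.

```python
def format_pi(target, value, sgml_style=False):
    # Hypothetical helper: XML closes a PI with '?>', while the
    # SGML-style form wanted for HTML output closes with a bare '>'.
    closer = ">" if sgml_style else "?>"
    return "<?%s %s%s" % (target, value, closer)


xml_pi = format_pi("foo", "bar")
sgml_pi = format_pi("foo", "bar", sgml_style=True)
```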
The patch to xml/dom/writer.py is attached; it also teaches the
*Lineariser classes to use cStringIO when available.
-Fred
--
Fred L. Drake, Jr. <fdrake@acm.org>
Corporation for National Research Initiatives
--Apu33M+PUU
Content-Type: text/plain
Content-Description: xml/dom/writer.py patch
Content-Disposition: inline;
filename="PATCH"
Content-Transfer-Encoding: 7bit
Index: writer.py
===================================================================
RCS file: /home/cvsroot/xml/dom/writer.py,v
retrieving revision 1.9
retrieving revision 1.10
diff -c -r1.9 -r1.10
*** writer.py 1999/04/28 02:42:19 1.9
--- writer.py 1999/08/13 13:50:18 1.10
***************
*** 124,131 ****
  class XmlLineariser(XmlWriter):
      def __init__(self):
!         import StringIO
!         self.buffer = StringIO.StringIO()
          XmlWriter.__init__(self, self.buffer)
      def linearise(self, node):
--- 124,134 ----
  class XmlLineariser(XmlWriter):
      def __init__(self):
!         try:
!             from cStringIO import StringIO
!         except ImportError:
!             from StringIO import StringIO
!         self.buffer = StringIO()
          XmlWriter.__init__(self, self.buffer)
      def linearise(self, node):
***************
*** 169,180 ****
          self._setNewLines(nl_dict)
  class HtmlLineariser(HtmlWriter):
      def __init__(self):
!         import StringIO
!         self.buffer = StringIO.StringIO()
          HtmlWriter.__init__(self, self.buffer)
      def linearise(self, node):
--- 172,192 ----
          self._setNewLines(nl_dict)
+     def doOtherNode(self, node):
+         if node.get_nodeType() == PROCESSING_INSTRUCTION_NODE:
+             self.stream.write("<?%s %s>" % (node.target, node.value))
+         else:
+             XmlWriter.doOtherNode(self, node)
+
  class HtmlLineariser(HtmlWriter):
      def __init__(self):
!         try:
!             from cStringIO import StringIO
!         except ImportError:
!             from StringIO import StringIO
!         self.buffer = StringIO()
          HtmlWriter.__init__(self, self.buffer)
      def linearise(self, node):
--Apu33M+PUU--