[XML-SIG] XML 0.5.1 bug: 'amp' character reference not handled correctly by "HtmlBuilder/HtmlWriter"

Jeff.Johnson@icn.siemens.com
Fri, 13 Aug 1999 12:46:02 -0400


I had similar problems a while back and came up with the following hack (nobody
seemed to think it was a problem, so I had to fix it myself). I have no idea
whether this is a good fix, but it solved most of my problems:

import string

# HtmlBuilder, Builder and gLog are defined elsewhere and not shown here.
class MyHtmlBuilder(HtmlBuilder):
    def handle_charref(self, name):
        #print name
        try:
            n = string.atoi(name)
        except string.atoi_error:
            self.unknown_charref(name)
            return
        # JCJ 1999-06-11: This turns &#181; (µ) into chr(181) which, when
        # saved back as HTML, is no good.
        #if not 0 <= n <= 255:
        if not 0 <= n <= 127:
            # Outside ASCII: keep it as a character reference instead.
            self.unknown_charref(name)
            return
        self.handle_data(chr(n))

    def unknown_charref(self, ref):
        #gLog.Warning('unknown_charref %s' % ref)
        # Pass the reference through unexpanded, as an entity reference
        # named '#<number>'.
        Builder.entityref(self, '#' + ref)

    def unknown_entityref(self, ref):
        gLog.Error('unknown_entityref %s' % ref)
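
For anyone who wants to try the decision logic without the builder classes,
here is a minimal standalone sketch in current Python. It is not the XML
package's API: the function name resolve_charref, the max_code parameter and
the ("data"/"ref", value) return convention are made up for illustration. It
only mirrors the rule above: expand a numeric reference when it falls in the
ASCII range, otherwise keep it as a reference.

```python
def resolve_charref(name, max_code=127):
    """Decide what to do with the body of a numeric character reference.

    `name` is the text between '&#' and ';', e.g. '65' or '181'.
    Returns ("data", char) if the reference should be expanded, or
    ("ref", "#" + name) if it should be kept as a reference.
    (Hypothetical helper, not part of any library.)
    """
    try:
        n = int(name)
    except ValueError:
        # Not a decimal number: leave the reference untouched.
        return ("ref", "#" + name)
    if not 0 <= n <= max_code:
        # Outside the safe range: keep it as a reference so that it
        # survives being written back out as HTML.
        return ("ref", "#" + name)
    return ("data", chr(n))


if __name__ == "__main__":
    for body in ("65", "181", "x41"):
        print(body, resolve_charref(body))
```

With max_code=255 you get the original behaviour that caused the trouble,
where &#181; is expanded to a raw byte.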





Dieter Maurer <dieter@handshake.de> on 08/12/99 01:06:22 PM

To:   xml-sig@python.org
cc:    (bcc: Jeff Johnson/Service/ICN)
Subject:  [XML-SIG] XML 0.5.1 bug: 'amp' character reference not handled
      correctly by "HtmlBuilder/HtmlWriter"




"HtmlBuilder" translates '&amp;' into an entity reference.
This does not follow the DOM spec, which says that
character references are expected to be expanded by the
HTML/XML processor.

"XmlWriter/HtmlWriter" does not output the 'amp' entity reference.
This, obviously, is a bug in "XmlWriter/HtmlWriter".
By the way, processing instructions are not output either.

I have fixed my "&amp;" problem by adding "amp" to the
"expand_entities" tuple in "HtmlBuilder". This, however,
is not a general solution.
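
To illustrate the idea of expanding on input and escaping on output, here is
a rough standalone sketch in current Python. It does not use the real
"HtmlBuilder" or "HtmlWriter" code: EXPAND_ENTITIES is only a stand-in for the
"expand_entities" tuple (its contents here are assumed), and expand() and
escape() are hypothetical helpers.

```python
import re

# Stand-in for the builder's "expand_entities" tuple (contents assumed).
EXPAND_ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "quot": '"'}

_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand(text, table=EXPAND_ENTITIES):
    """Replace known entity references by their characters.

    Unknown references (e.g. '&nbsp;') are left exactly as they are.
    """
    def repl(match):
        return table.get(match.group(1), match.group(0))
    return _ENTITY_RE.sub(repl, text)


def escape(text):
    """Escape markup characters when writing text back out.

    '&' is handled first so that the '&' introduced by the other
    replacements is not escaped a second time.
    """
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


if __name__ == "__main__":
    source = "a &amp; b &lt; c"
    parsed = expand(source)
    print(parsed)
    print(escape(parsed))
```

The point of the sketch is that the text only round-trips if the writer
escapes what the builder expanded; expanding "amp" on input without escaping
it on output would just move the problem.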

- Dieter


_______________________________________________
XML-SIG maillist  -  XML-SIG@python.org
http://www.python.org/mailman/listinfo/xml-sig