[XML-SIG] dom.utils.FileReader & HtmlBuilder

Mon, 5 Apr 1999 13:18:17 -0400

Could we have FileReader.readHtml() ignore mismatched end tags by default?  At
the moment there is no way to ignore them at all through FileReader.  One of
the problems with FileReader is that it offers few ways to customize its
behavior short of subclassing it.  Since it is meant to be extremely simple to
use, I think it should fix up mismatched end tags by default.
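
To make the requested behavior concrete, here is a minimal sketch of what "ignoring mismatched end tags" could mean. It is not how html_builder works internally; it uses the standard-library html.parser module, and LenientTreeBuilder and close_order are hypothetical names made up for the illustration. It also assumes every start tag has an end tag, so void elements such as <br> are not handled.

```python
from html.parser import HTMLParser


class LenientTreeBuilder(HTMLParser):
    """Hypothetical stand-in for a builder; tracks open elements on a stack."""

    def __init__(self, ignore_mismatched_end_tags=True):
        super().__init__()
        self.ignore_mismatched_end_tags = ignore_mismatched_end_tags
        self.stack = []    # names of the currently open elements
        self.closed = []   # element names in the order they were closed

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            # The end tag matches the innermost open element: close it.
            self.closed.append(self.stack.pop())
        elif self.ignore_mismatched_end_tags:
            # Stray end tag: drop it and leave the stack untouched.
            return
        else:
            raise ValueError("mismatched end tag: </%s>" % tag)


def close_order(html, ignore_mismatched_end_tags=True):
    """Parse html and return the element names in closing order."""
    builder = LenientTreeBuilder(ignore_mismatched_end_tags)
    builder.feed(html)
    builder.close()
    return builder.closed
```

With the default, close_order("<p><b>text</i></b></p>") skips the stray </i> and still closes b and then p; with ignore_mismatched_end_tags=False the same input raises ValueError.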

Is the workaround for the parser not being freed still required, or has that
bug been fixed?

    def readHtml(self, stream, ignore_mismatched_end_tags=1):
        from xml.dom import html_builder
        b = html_builder.HtmlBuilder(ignore_mismatched_end_tags)
        b.feed(stream.read())
        b.close()
        doc = b.document
        # There was a bug that prevented the builder from
        # freeing itself (maybe it has already been fixed?).
        # The next two lines break its references to the DOM
        # tree so that it can be freed.
        b.document = None
        b.current_element = None
        return doc
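
The effect of the two reference-breaking lines can be checked with a small sketch. Document and Builder below are hypothetical stand-ins, not the real xml.dom classes, and the check assumes CPython's reference counting and a builder object that lingers after the call. Under those assumptions, the tree is released as soon as the caller drops it only if the builder's references were cleared first.

```python
import weakref


class Document:
    """Stand-in for a DOM document (hypothetical)."""


class Builder:
    """Hypothetical stand-in for a builder that keeps references to its tree."""

    def __init__(self):
        self.document = Document()
        self.current_element = self.document


def build(break_references):
    """Return (builder, doc), optionally clearing the builder's references."""
    b = Builder()
    doc = b.document
    if break_references:
        b.document = None
        b.current_element = None
    return b, doc


def tree_freed_when_caller_drops_it(break_references):
    """Report whether the tree goes away once the caller deletes its reference."""
    builder, doc = build(break_references)  # builder stays alive, as in the bug
    probe = weakref.ref(doc)                # observe the tree without owning it
    del doc
    return probe() is None
```

Calling tree_freed_when_caller_drops_it(True) should report that the tree was released, while tree_freed_when_caller_drops_it(False) should report that the lingering builder still keeps it alive.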

Thanks,
Jeff