nadia.helen.johnson at gmail.com
Tue Aug 25 09:12:57 CEST 2009
On Aug 24, 7:29 pm, Dave Angel <da... at ieee.org> wrote:
> Stefan Behnel wrote:
> > Hi,
> > elsa wrote:
> >> I know how to turn HTML into an ElementTree object
> > I don't. ;)
> > ElementTree doesn't have an HTML parser, so what do you use for parsing?
> >> but I don't know
> >> how to then view the structure of this object. Is there a method or
> >> module that you can give an ElementTree object to, and it returns some
> >> kind of graphical or printed representation of the tree? Otherwise, if
> >> you can't see your tree's structure, how do you know what is a
> >> sensible way of iterating over the tree to access the info you need?
> > ElementTree has a tostring() method that returns a string. To get a pretty
> > printed representation, you can use the indent() function from this recipe:
> > Stefan
> Perhaps the OP was referring to XHTML, which should be eligible for
> ElementTree. But could you tell me whether ElementTree is at all
> tolerant of malformed XML? Most HTML and XHTML I encounter in the wild
> is so buggy it's amazing it all works at all.
I used elementtidy, also available from effbot.
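
On the original question of seeing the tree's structure, here is a rough sketch using only the standard library. It assumes the markup is already well-formed (for example after tidying). The SAMPLE document and the outline() helper are made up for illustration and are not part of ElementTree:

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed XHTML sample. Real-world HTML usually needs
# tidying before ElementTree's XML parser will accept it.
SAMPLE = (
    "<html><head><title>Demo</title></head>"
    "<body><p>Hello <b>world</b></p></body></html>"
)


def outline(elem, depth=0):
    """Return one line per element, indented two spaces per nesting level.

    Illustrative helper only; it is not an ElementTree function.
    """
    lines = ["  " * depth + elem.tag]
    for child in elem:  # iterating an Element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)
structure = outline(root)

# Print the nesting of tags, then the serialized markup itself.
print("\n".join(structure))
print(ET.tostring(root, encoding="unicode"))
```

Once the nesting is visible, it is easier to decide whether to loop over direct children or to search for a particular tag with root.iter("p") or root.find("body/p").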