HTML / DOM
Martin v. Löwis
martin at v.loewis.de
Sat Mar 29 10:38:37 CET 2003
"Bo M. Maryniuck" <b.maryniuk at forbis.lt> writes:
> Hello, all.
> Can anybody drop me a real code how to work with DOM in _HTML_ which is even
> not XHTML? I took a look over 4DOM but unfortunately documentation there is
> too silly. :( Well, for example, I have a HTML string:
> <p>Text here <a name="foo">bar</a></p>
It appears that 4DOM has problems with omitted opening tags. Try
<html><body><p>Text here <a name="foo">bar</a></p>
You might want to report this as a bug. With that change, you can write
d=HtmlLib.FromHtml('<html><body><p>Text here <a name="foo">bar</a></p>')
> Now, how to build a DOM from this chunk to do the following:
> 1. Fetch somehow a "name" attribute from the "<A/>" tag
To find the A tag, do
elem = d.getElementsByTagName("a")[0] #arbitrarily take the first A
To read the attribute, do
You might want to report it as a bug that you have to use
> 2. Change it (not a "bar", but a "foo" value!)
To modify the attribute, use setAttributeNS.
> 3. Push it back to the same place
That is not necessary if you use setAttributeNS.
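As a rough illustration of why nothing has to be pushed back, here is a minimal sketch using xml.dom.minidom from the standard library rather than 4DOM. It assumes the markup is well-formed (all tags closed), because minidom is an XML parser and will reject real-world HTML; the new value "baz" is made up for the example.

```python
from xml.dom import minidom

# Assumption: well-formed input. minidom cannot parse HTML with omitted tags.
SOURCE = '<html><body><p>Text here <a name="foo">bar</a></p></body></html>'

doc = minidom.parseString(SOURCE)

# Arbitrarily take the first A element.
elem = doc.getElementsByTagName("a")[0]

# Read the attribute; None means "no namespace".
old_value = elem.getAttributeNS(None, "name")

# Change the attribute in place ("baz" is an arbitrary illustrative value).
elem.setAttributeNS(None, "name", "baz")

# Looking the element up again through the document returns the same node,
# so the change is already part of the tree.
same_elem = doc.getElementsByTagName("a")[0]
new_value = same_elem.getAttributeNS(None, "name")
```

The element returned by getElementsByTagName is the node inside the tree, not a copy, so changing it changes the document.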
> 4. Return modified HTML back as string without doctype and so on.
Use PrettyPrint for that.
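If 4DOM is not available, the same idea can be sketched with xml.dom.minidom from the standard library. This is not PrettyPrint, only a stand-in under the assumption that the input is well-formed: serializing an element instead of the whole document leaves out the XML declaration. The value "baz" is again arbitrary.

```python
from xml.dom import minidom

# Assumption: well-formed input, since minidom is an XML parser.
doc = minidom.parseString(
    '<html><body><p>Text here <a name="foo">bar</a></p></body></html>'
)
doc.getElementsByTagName("a")[0].setAttributeNS(None, "name", "baz")

# toxml() on the document would start with an XML declaration;
# calling it on the root element gives just the markup.
markup = doc.documentElement.toxml()

# Any element can be serialized on its own, e.g. only the paragraph.
fragment = doc.getElementsByTagName("p")[0].toxml()

print(markup)
print(fragment)
```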