HTML / DOM

Martin v. Löwis martin at v.loewis.de
Sat Mar 29 04:38:37 EST 2003


"Bo M. Maryniuck" <b.maryniuk at forbis.lt> writes:

> Hello, all.
> 
> Can anybody drop me a real code how to work with DOM in _HTML_ which is even 
> not XHTML? I took a look over 4DOM but unfortunately documentation there is 
> too silly. :( Well, for example, I have a HTML string:
> 
> 	<p>Text here <a name="foo">bar</a></p>

It appears that 4DOM has problems with omitted opening tags. Try

<html><body><p>Text here <a name="foo">bar</a></p>

You might want to report this as a bug. With that change, you can write

d=HtmlLib.FromHtml('<html><body><p>Text here <a name="foo">bar</a></p>')

> Now, how to build a DOM from this chunk to do the following:
> 	1. Fetch somehow a "name" attribute from the "<A/>" tag

To find the A tag, do

elem = d.getElementsByTagName("a")[0] #arbitrarily take the first A

To read the attribute, do

elem.getAttributeNS(None,'name')

You might want to report it as a bug that you have to use
getAttributeNS. 

> 	2. Change it (not a "bar", but a "foo" value!)

To modify the attribute, use setAttributeNS.

> 	3. Push it back to the same place

That is not necessary if you use setAttributeNS: it modifies the
element in place, so the tree already contains the new value.
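
As a rough, runnable illustration of steps 1 to 3, here is a minimal
sketch that uses the standard library's xml.dom.minidom as a stand-in
for 4DOM. That substitution is an assumption made only so the example
runs without PyXML; minidom accepts this snippet only because it
happens to be well-formed XML, and it will reject real-world HTML.
The value "baz" is an arbitrary placeholder.

```python
from xml.dom import minidom

# Stand-in for HtmlLib.FromHtml: minidom parses XML only, which is
# enough for this well-formed snippet.
doc = minidom.parseString('<p>Text here <a name="foo">bar</a></p>')

# Step 1: find the first A element and read its "name" attribute.
elem = doc.getElementsByTagName("a")[0]
old_value = elem.getAttributeNS(None, "name")

# Steps 2 and 3: change the attribute. The element is modified in
# place, so nothing has to be pushed back into the tree.
elem.setAttributeNS(None, "name", "baz")
new_value = elem.getAttributeNS(None, "name")

print(old_value, new_value)
```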

> 	4. Return modified HTML back as string without doctype and so on.

Use PrettyPrint for that.
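
For comparison, a minimal sketch of the same serialization step with
the standard library's xml.dom.minidom instead of 4DOM's PrettyPrint
(an assumption made so the example runs without PyXML). Serializing
the root element rather than the document leaves out the XML
declaration and any doctype. The value "baz" is an arbitrary
placeholder.

```python
from xml.dom import minidom

# minidom only accepts well-formed XML; this snippet qualifies.
doc = minidom.parseString('<p>Text here <a name="foo">bar</a></p>')
doc.getElementsByTagName("a")[0].setAttribute("name", "baz")

# Serialize the root element only, so that no XML declaration or
# doctype ends up in the resulting string.
html = doc.documentElement.toxml()
print(html)
```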

Regards,
Martin



