HTML -> XML: Where's HtmlBuilder()?

F. GEIGER fgeiger at datec.at
Mon Mar 11 03:22:41 EST 2002


> If you have control over the source HTML files, why don't you just
> require that they are XHTML? That would simplify processing
> significantly.

I am working on that, as there are many HTML files already. But since it's
not unreasonable to mistrust HTML editors, I'd like to have a kind of
built-in bastion which *always* makes sure that the result is XHTML.
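
A minimal sketch of such a check, assuming a current Python 3 and only the
standard library. The function name is_well_formed_xml is hypothetical and
is not part of PyXML or 4DOM. It only tests XML well-formedness, which is
necessary for XHTML but not sufficient: it does not validate against the
XHTML DTD, and named HTML entities such as &nbsp; would likely be rejected
unless they are declared.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if text parses as a well-formed XML document.

    Hypothetical helper: this is a well-formedness gate only,
    not an XHTML validator.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Unclosed tags, mismatched tags, undefined entities, etc.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed_xml("<html><body><p>ok</p></body></html>"))
    print(is_well_formed_xml("<html><body><p>broken<br></body></html>"))
```

A file that fails this test could then be rejected or passed on to a
clean-up step before the stylesheet is applied.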

Thanks for your answer
Franz


"Martin v. Loewis" <martin at v.loewis.de> wrote in message
news:m3d6ycrzk6.fsf at mira.informatik.hu-berlin.de...
> "F. GEIGER" <fgeiger at datec.at> writes:
>
> > Is HtmlBuilder() deprecated?
>
> It is not supported in PyXML 0.6 and later anymore, since the entire
> DOM implementation has been replaced with 4DOM.
>
> > Which module was it replaced with?
>
> Try xml.dom.ext.reader.HtmlLib.
>
> > What else could I use to convert HTML to XML?
>
> Depends on what you want the conversion to do. If you want to convert
> HTML to XHTML, I think HTMLTidy can do that for you.
>
> > Do I need additional modules for this?
> > Or am I already prepared w/o knowing it?
>
> If you want to use 4DOM, yes, you should have everything you need.
>
> > For those who are curious - the big picture:
> [...]
> > A Shaper applies an XSL file to the HTML pages to make them all look
> > equally formatted (this is where I am stuck now).
>
> If you have control over the source HTML files, why don't you just
> require that they are XHTML? That would simplify processing
> significantly.
>
> Regards,
> Martin
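
For the HTML-to-XML conversion discussed in the quoted message, here is a
rough sketch under stated assumptions: a current Python 3, standard library
only, and input with a single root element. It uses neither HtmlBuilder,
xml.dom.ext.reader.HtmlLib, nor HTMLTidy. The names HtmlToXml, html_to_xml
and VOID_TAGS are hypothetical. It closes whatever is left open and treats
void elements such as <br> as empty, so the output is well-formed, but it
does not apply HTML's implied-end-tag rules (a second <p> ends up nested
inside the first), so it is no substitute for a real tidy tool.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never have content or an end tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HtmlToXml(HTMLParser):
    """Hypothetical helper: feed lenient HTML, build an XML element tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # stack of elements not yet closed

    def handle_starttag(self, tag, attrs):
        # Bare attributes (<input disabled>) arrive with value None;
        # XML needs a value, so repeat the name as XHTML does.
        attrib = {k: (v if v is not None else k) for k, v in attrs}
        self.builder.start(tag, attrib)
        if tag in VOID_TAGS:
            self.builder.end(tag)  # close void elements immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag, or end tag of a void element
        # Close everything opened since the matching start tag.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.builder.end(open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.builder.data(data)

    def finish(self):
        # Close any elements the document left open, then return the root.
        while self.open_tags:
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def html_to_xml(html_text):
    """Return html_text re-serialized as a well-formed XML string."""
    parser = HtmlToXml()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return ET.tostring(parser.finish(), encoding="unicode")


if __name__ == "__main__":
    print(html_to_xml('<html><body><p>one<br>two<p>three</body></html>'))
```

Comments, the DOCTYPE and processing instructions are silently dropped in
this sketch; a production converter would have to decide what to do with
them.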




