SGMLParser problem

Martin v. Loewis martin at v.loewis.de
Fri Nov 8 13:11:48 EST 2002


sanjay2kind at yahoo.com (sanjay) writes:

> Does anyone have a suggestion for the following problem? Some Word
> documents have been converted to HTML pages in MS Word. I want to
> filter out HTML tags like
> <o:p></o:p>,
> <![if !supportEmptyParas]> <![endif]>, etc. I couldn't solve this
> using SGMLParser. It shows an error like..

I recommend using sgmlop for that; it is distributed with PyXML.
People also use HTML Tidy for this specific task.
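
For the two constructs quoted in the question, a minimal sketch using only
the standard library's re module might look like the following. This is a
swapped-in approach, not sgmlop or HTML Tidy. The function name
strip_word_markup and both patterns are assumptions made for illustration;
they cover only tags with the o: prefix and the <![if ...]> / <![endif]>
markers, not Word's HTML output in general.

```python
import re

# Assumed pattern: Office-namespace tags such as <o:p> and </o:p>.
# Only the "o:" prefix is handled here.
OFFICE_TAG = re.compile(r"</?o:[A-Za-z0-9]+[^>]*>")

# Assumed pattern: Word's conditional markers such as
# <![if !supportEmptyParas]> and <![endif]>.
CONDITIONAL = re.compile(r"<!\[(?:if\b[^\]]*|endif)\]>")


def strip_word_markup(html):
    """Remove the markers themselves; text between them is kept."""
    html = OFFICE_TAG.sub("", html)
    html = CONDITIONAL.sub("", html)
    return html


if __name__ == "__main__":
    sample = "<p>Hello<o:p></o:p></p><![if !supportEmptyParas]>&nbsp;<![endif]>"
    print(strip_word_markup(sample))
```

Regular expressions are adequate only for fixed markers like these; for
anything that needs real parsing, a proper parser or tidying tool is the
safer choice.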

Regards,
Martin


