[XML-SIG] HTML parsing on Python 2.1

Hans verschooten hansv@net4all.be
Wed, 23 May 2001 09:44:20 +0200


Hi,

I am using a freshly installed MacPython 2.1 and would like to know what else
I need to install to run the following script:

[uogbuji@borgia one-offs]$ cat html-to-xhtml-converter.py
import sys
from xml.dom.ext.reader import HtmlLib
import xml.dom.ext

#set up a re-usable reader object
reader = HtmlLib.Reader()

#parse HTML from file or URI given on command line.  Return the DOM document
doc = reader.fromUri(sys.argv[1])

#Just for kicks, write it out as XHTML, i.e. all lowercase, XML syntax for
#empty tags, all attributes with given value, etc.

xml.dom.ext.XHtmlPrettyPrint(doc)

Could anybody point me in the right direction? I tried installing PyXML
but keep getting end-of-line errors. After trying to correct these I keep
running into errors such as: ReleaseNode not found; HtmlLib has no module
named Reader.

Any help as to how and what should be installed on MacPython 2.1 would be
greatly appreciated.

Hans
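
The line-ending correction mentioned above can be sketched roughly as follows.
This is only an illustration. It assumes the end-of-line errors come from
source files unpacked with Unix (LF) or DOS (CRLF) line endings, whereas
classic Mac OS conventionally uses CR. The helper names normalize_newlines and
convert_tree are hypothetical, not part of PyXML or MacPython, and the code is
written for a current Python rather than 2.1.

```python
import os


def normalize_newlines(data, newline=b"\n"):
    """Return `data` (bytes) with CRLF, lone CR and LF all turned into `newline`."""
    # Collapse CRLF first so it is not counted as two separate line breaks.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


def convert_tree(root, newline=b"\n", suffix=".py"):
    """Rewrite every file under `root` ending in `suffix`; return changed paths."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Binary mode, so Python does no newline translation of its own.
            with open(path, "rb") as handle:
                original = handle.read()
            converted = normalize_newlines(original, newline)
            if converted != original:
                with open(path, "wb") as handle:
                    handle.write(converted)
                changed.append(path)
    return changed
```

Calling convert_tree on the unpacked source directory with newline=b"\r" would
produce classic Mac line endings. Whether that also resolves the ReleaseNode
and Reader errors is not something this sketch can tell.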


