syntax from diveintopython
iddwb
iddwb at imap1.asu.edu
Tue Apr 17 13:12:29 EDT 2001
I've been going through the diveintopython material. It is mostly
excellent. However, there is some syntax I don't understand, and I was
hoping someone could enlighten me.
Here's the code I've been working with. I borrowed the code to create my
own HTMLProc class. The difficulties begin with the unknown_starttag
function. I keep getting a syntax error on the strattrs assignment. I
thought I copied the code verbatim, but it still generates a syntax
error. If I understood the syntax, I could probably fix it myself, so I
am wondering if someone might explain the problem and the fix.

#!/usr/local/bin/python
# first test to open web pages using urlopen2
from sgmllib import SGMLParser
import sys

class HTMLProc(SGMLParser):
    def reset(self):
        # from diveintopython.org, extends SGMLParser
        SGMLParser.reset(self)
        self.parts = []

    def unknown_starttag(self, tag, attrs):
        strattrs = "".join([' %(key)s="%(value)s"' % locals() for key, value in attrs])
        self.parts.append("<%(tag)s%(strattrs)s>" % locals())

    def unknown_endtag(self, tag):
        self.parts.append("</%(tag)s>" % locals())

    def output(self):
        return "".join(self.parts)

def do_body(fd):
    try:
        gmlbuffer = HTMLProc()
        gmlbuffer.feed(fd.read())
        fd.close()
        gmlbuffer.close()
    except AttributeError:
        gmlbuffer.unknown_starttag("body", "bgcolor")
        print "Attribute Error"
        return -1
    print "done with body"
    return gmlbuffer

if __name__ == '__main__':
    # print sys.argv[1:]
    try:
        f = open("dean.html")
    except IOError:
        print "couldn't open ", sys.argv[1:]
        sys.exit(1)
    htmlbuff = do_body(f)
    print htmlbuff.parts

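
For reference, the strattrs assignment is a list comprehension, a construct that was added in Python 2.0. One possible explanation for the syntax error, assuming the script is being run by an older interpreter, is that the parser simply does not know the construct. The sketch below spells out what the line computes with an explicit loop. The function names format_attrs and format_attrs_comprehension and the sample data are made up for illustration and are not part of diveintopython or sgmllib.

```python
def format_attrs(attrs):
    """Build ' key="value"' text for each (key, value) pair with a plain loop."""
    pieces = []
    for key, value in attrs:
        # Same format string as the original line, but with an explicit
        # dictionary instead of locals().
        pieces.append(' %(key)s="%(value)s"' % {"key": key, "value": value})
    return "".join(pieces)


def format_attrs_comprehension(attrs):
    """The one-line form from the post; list comprehensions need Python 2.0+."""
    return "".join([' %s="%s"' % (key, value) for key, value in attrs])


# Hypothetical sample data: a list of (name, value) pairs.
sample = [("bgcolor", "white"), ("text", "black")]
assert format_attrs(sample) == format_attrs_comprehension(sample)
```

Both forms expect attrs to be a list of (key, value) pairs. A bare string such as "bgcolor" would not unpack into key and value.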
David Bear
College of Public Programs/ASU