syntax from diveintopython

iddwb iddwb at imap1.asu.edu
Tue Apr 17 13:12:29 EDT 2001


I've been going through the diveintopython stuff.  Mostly excellent
material.  However, there is syntax I don't understand, and I was hoping
someone could enlighten me.

Here's the code I've been working with.  I borrowed the code to create my
own HTMLProc class.  The difficulties begin with the unknown_starttag
method.  I keep getting a syntax error on the strattrs assignment.  I
thought I copied the code verbatim, but it still generates a syntax
error.  If I understood the syntax, I could probably fix it myself, but I
am wondering if someone might explain the problem and the fix.

#!/usr/local/bin/python
# first test to open web pages using urlopen2
from sgmllib import SGMLParser
import sys

class HTMLProc(SGMLParser):
    def reset(self):
        # from diveintopython.org, extends SGMLParser
        SGMLParser.reset(self)
        self.parts = []

    def unknown_starttag(self, tag, attrs):
        strattrs = "".join([' %(key)s="%(value)s"' % locals() for key, value in attrs])
        self.parts.append("<%(tag)s%(strattrs)s>" % locals())

    def unknown_endtag(self, tag):
        self.parts.append("</%(tag)s>" % locals())

    def output(self):
        return "".join(self.parts)

def do_body(fd):
    try:
        gmlbuffer = HTMLProc()
        gmlbuffer.feed(fd.read())
        fd.close()
        gmlbuffer.close()
    except AttributeError:
        gmlbuffer.unknown_starttag("body", "bgcolor")
        print "Attribute Error"
        return -1
    print "done with body"
    return gmlbuffer

if __name__ == '__main__':
    # print sys.argv[1:]
    try:
        f = open("dean.html")
    except IOError:
        print "couldn't open ", sys.argv[1:]
        sys.exit(1)
    htmlbuff = do_body(f)
    print htmlbuff.parts
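
A note on the strattrs line, offered as a guess rather than a diagnosis:
the bracketed "[... for key, value in attrs]" expression is a list
comprehension, which Python only accepts from version 2.0 onward.  If the
interpreter at /usr/local/bin/python is older than that, it would report a
syntax error on exactly that line.  The sketch below builds the same string
with a plain loop.  The names format_attrs and format_starttag are
hypothetical helpers, not part of sgmllib or diveintopython, and the code
assumes attrs is the list of (key, value) pairs that SGMLParser passes to
unknown_starttag.

```python
def format_attrs(attrs):
    # attrs is assumed to be a list of (key, value) pairs.
    # format_attrs is a hypothetical helper name used only for this sketch.
    strattrs = ""
    for key, value in attrs:
        # Append the same ' key="value"' text that the list
        # comprehension would produce for each pair.
        strattrs = strattrs + ' %s="%s"' % (key, value)
    return strattrs


def format_starttag(tag, attrs):
    # Rebuild the opening tag from explicit arguments instead of locals().
    return "<%s%s>" % (tag, format_attrs(attrs))
```

Inside unknown_starttag, the two original lines could then become
self.parts.append(format_starttag(tag, attrs)).  The loop avoids both the
list comprehension and the "".join string method, so it should not depend
on the newer syntax.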

David Bear
College of Public Programs/ASU



