Behaviour of htmllib's HTML parser and formatter
Morten W. Petersen
morphex at gmail.com
Fri Mar 11 03:14:36 CET 2005
I have an HTML page that displays some content, and part of that
content is HTML converted to plain text. The encoding of the page
Here's the code that makes the change (the HTML in self.contents is
file = cStringIO.StringIO()
parser = htmllib.HTMLParser(formatter.AbstractFormatter(
data = file.getvalue()[:size]
return data
This renders entities as black diamonds with a ? sign
in them in Firefox, so I guess something is going wrong along the way.
Any suggestions what it might be?
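
One possible cause, offered as a guess rather than a diagnosis: htmllib replaces named entities using the table in htmlentitydefs, whose replacement text is ISO Latin-1, so the resulting byte string may contain bytes that are not valid UTF-8. If the page is served as UTF-8 (an assumption, since the sentence about the encoding is cut off above), Firefox would show those bytes as replacement characters. Slicing the byte string with [:size] could also cut a multi-byte character in half. The sketch below reproduces the effect and shows one way around it. It uses Python 3's html.parser instead of htmllib, which exists only in Python 2, and the names TextExtractor and html_to_text are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text content of an HTML fragment, dropping the tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &nbsp; or &eacute;
        # into Unicode characters before handle_data() is called.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(contents, size):
    """Hypothetical helper: return at most `size` characters of plain text."""
    parser = TextExtractor()
    parser.feed(contents)
    parser.close()
    text = "".join(parser.chunks)
    # Slice the Unicode string, not the encoded bytes, so that a
    # multi-byte character is never cut in half.
    return text[:size]


text = html_to_text("<p>caf&eacute;&nbsp;au lait</p>", 6)

# Latin-1 bytes read as UTF-8 produce U+FFFD, the character that
# Firefox draws as a black diamond with a question mark.
latin1_bytes = text.encode("latin-1")
as_seen_by_browser = latin1_bytes.decode("utf-8", errors="replace")

# Encoding to the page's own encoding (assumed UTF-8) round-trips cleanly.
utf8_bytes = text.encode("utf-8")
```

In the Python 2 code above, the rough equivalent would be to decode the result of file.getvalue() from Latin-1, slice the resulting unicode object, and encode it to the page's encoding before returning it.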