sgmllib.SGMLParseError: unexpected ':' char in declaration
Alessio Pace
puccio_13 at yahoo.it
Sun Feb 23 08:25:36 EST 2003
Carl Banks wrote:
> Alessio Pace wrote:
>> I can't figure out how to solve this error raised in my application. I'm
>> trying to use htmllib.HTMLParser; it usually works fine, but in some
>> cases (I don't know why; I am processing hundreds of HTML texts and will
>> debug them case by case later) it raises this:
>>
>> Traceback (most recent call last):
>> [......... ]
>> File "Html2Txt.py", line 45, in convertToTxt # my class Html2Txt
>> parser.close() # close the htmllib.HTMLParser
>> File "/usr/lib/python2.2/sgmllib.py", line 99, in close
>> self.goahead(1)
>> File "/usr/lib/python2.2/sgmllib.py", line 161, in goahead
>> k = self.parse_declaration(i)
>> File "/usr/lib/python2.2/markupbase.py", line 96, in parse_declaration
>> self.error(
>> File "/usr/lib/python2.2/sgmllib.py", line 102, in error
>> raise SGMLParseError(message)
>> sgmllib.SGMLParseError: unexpected ':' char in declaration
>>
>> Thanks if someone can help me; I am a newbie to Python.
>
>
> I'm guessing there's a comment in your HTML files that is spelled like
> this:
>
> <! -- blah blah blah : blah blah blah -- >
>
> I'm not an expert in SGML, but I do know that it has an
> oft-misunderstood definition of a comment. I think the above is a valid
> comment, but sgmllib (of course) doesn't parse it right, resulting in
> your error.
>
> The other possibility is that your file has one of those silly <!DOCTYPE
> "blah blah blah"> things at the top whose purpose no one knows. Maybe it
> erroneously (in the opinion of sgmllib) has a colon in it.
>
>
Thanks, I'll check the HTML sources now.
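
One way to check the sources for the two cases suggested above is to scan each
file for "<!...>" spans that contain a colon. What follows is only a minimal
sketch and not code from this thread: the function name
find_suspect_declarations and the sample text are hypothetical, and the regular
expression merely approximates what sgmllib treats as a declaration. It assumes
that comments opening with "<!--" go through the parser's separate comment
handling and are not the cause, so they are skipped.

```python
import re

# Rough approximation of markup the parser might treat as a declaration:
# either a regular comment ("<!--" ... "-->") or any other "<!" ... ">" span.
# Assumption: this is a debugging aid, not a reimplementation of sgmllib.
DECLARATION_RE = re.compile(r"<!--.*?-->|<!.*?>", re.DOTALL)


def find_suspect_declarations(html):
    """Return (line_number, text) for each non-comment "<!...>" span with a colon."""
    suspects = []
    for match in DECLARATION_RE.finditer(html):
        text = match.group(0)
        if text.startswith("<!--"):
            # Assumed harmless: regular comments are skipped.
            continue
        if ":" in text:
            # Line numbers are 1-based, counted from the start of the text.
            line_number = html.count("\n", 0, match.start()) + 1
            suspects.append((line_number, text))
    return suspects


# Hypothetical sample input covering both cases guessed at in the reply.
sample = """<!DOCTYPE html>
<html>
<!-- a normal comment: skipped -->
<! -- note: spaced-out comment -- >
</html>"""

for line_number, text in find_suspect_declarations(sample):
    print(line_number, text)
```

Running this over each HTML text before handing it to the parser should narrow
down which files, and which lines, contain a declaration with a colon in it. A
DOCTYPE that includes a URL such as "http://..." would be reported as well,
since it also contains a colon.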