sgmllib.SGMLParseError: unexpected ':' char in declaration

Alessio Pace puccio_13 at yahoo.it
Sun Feb 23 08:25:36 EST 2003


Carl Banks wrote:

> Alessio Pace wrote:
>> I can't figure out how to fix the error below in my application. I'm
>> trying to use htmllib.HTMLParser. It usually works fine, but in some
>> cases (I don't know which yet; I am processing hundreds of HTML texts
>> and will debug them case by case later) it raises this:
>> 
>> Traceback (most recent call last):
>> [......... ]
>>  File "Html2Txt.py", line 45, in convertToTxt   # my class Html2Txt
>>    parser.close()                  # close the htmllib.HTMLParser
>>  File "/usr/lib/python2.2/sgmllib.py", line 99, in close
>>    self.goahead(1)
>>  File "/usr/lib/python2.2/sgmllib.py", line 161, in goahead
>>    k = self.parse_declaration(i)
>>  File "/usr/lib/python2.2/markupbase.py", line 96, in parse_declaration
>>    self.error(
>>  File "/usr/lib/python2.2/sgmllib.py", line 102, in error
>>    raise SGMLParseError(message)
>> sgmllib.SGMLParseError: unexpected ':' char in declaration
>> 
>> Thanks to anyone who can help me; I am new to Python.
> 
> 
> I'm guessing there's a comment in your HTML files that is spelled like
> this:
> 
> <! --  blah blah blah : blah blah blah -- >
> 
> I'm not an expert in SGML, but I do know that it has an
> oft-misunderstood definition of a comment.  I think the above is a
> valid comment, but sgmllib (of course) doesn't parse it right,
> resulting in your error.
> 
> The other possibility is that your file has one of those silly
> <!DOCTYPE "blah blah blah"> things at the top that no one really
> understands.  Maybe it erroneously (in the opinion of sgmllib) has a
> colon in it.
> 
> 

Thanks, I'll check the HTML sources now.
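
A minimal sketch of such a check, using only the re module. It is a rough
regular-expression scan, not a real SGML parser. It assumes that the
declarations of interest are the "<! ... >" constructs that do not start
with exactly "<!--", and that a colon anywhere inside one is worth a look
by hand. The names DECLARATION and find_suspect_declarations are made up
for this illustration, and the scan may over-report (for example, a colon
inside a quoted string in a DOCTYPE may turn out to be harmless).

```python
import re

# Matches any markup declaration "<! ... >", up to the first ">".
# Rough heuristic only: it does not understand quoting or nested ">".
DECLARATION = re.compile(r"<![^>]*>", re.DOTALL)


def find_suspect_declarations(html):
    """Return (line_number, text) for each declaration containing a colon.

    Declarations that start with exactly "<!--" are skipped, on the
    assumption that ordinary comments are not the ones causing trouble.
    """
    suspects = []
    for match in DECLARATION.finditer(html):
        text = match.group(0)
        if text.startswith("<!--"):
            continue
        if ":" in text:
            # Count newlines before the match to get a 1-based line number.
            line_number = html.count("\n", 0, match.start()) + 1
            suspects.append((line_number, text))
    return suspects


sample = (
    "<html>\n"
    "<! --  blah blah blah : blah blah blah -- >\n"
    "<!-- ordinary comment -->\n"
    "<p>text: with a colon outside any declaration</p>\n"
    "</html>\n"
)

for line_number, text in find_suspect_declarations(sample):
    print(line_number, text)
```

Running this over each HTML text before handing it to the parser should
narrow down which files, and which lines in them, need a closer look.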



