[Chicago] sgmlparser problem
Lukasz Szybalski
szybalski at gmail.com
Tue Dec 12 01:34:20 CET 2006
Yeah, that is one solution. It does work, but instead of skipping the bad HTML
I am fixing it and then trashing it. It seems kind of odd.
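
For comparison, here is a rough sketch of what "skipping" could look like as a
plain-text pre-pass before the data reaches the parser. This is only an
illustration under assumptions: the helper names fix_comment_openers and
drop_malformed_comments are made up for this example and are not part of
sgmllib or BeautifulSoup, and the regular expressions only target the
'<! --' pattern from the error message, not bad markup in general.

```python
import re

# Hypothetical helpers for illustration; not part of sgmllib or BeautifulSoup.

# A comment opener with stray whitespace after "<!", e.g. "<! --".
_BAD_COMMENT_OPEN = re.compile(r"<!\s+--")

# A whole malformed comment, from the bad opener through the next "-->".
_BAD_COMMENT = re.compile(r"<!\s+--.*?-->", re.DOTALL)


def fix_comment_openers(html):
    """Rewrite '<! --' style openers as '<!--' so they parse as comments."""
    return _BAD_COMMENT_OPEN.sub("<!--", html)


def drop_malformed_comments(html):
    """Remove malformed comments entirely instead of repairing them."""
    return _BAD_COMMENT.sub("", html)


if __name__ == "__main__":
    sample = "<p>a</p><! -- NEW PAGE --><p>b</p>"
    print(fix_comment_openers(sample))
    print(drop_malformed_comments(sample))
```

The cleaned string would then be passed to the parser's feed() in place of the
raw page. A malformed comment with no closing "-->" is left untouched by
drop_malformed_comments, so it would still need separate handling.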
Thanks,
Lucas
David Terrell wrote:
> Bad HTML markup should probably go through BeautifulSoup, which
> tries to deal with this kind of awfulness.
>
> http://www.crummy.com/software/BeautifulSoup/
>
>
> On Sun, Dec 10, 2006 at 05:35:12PM -0600, Lukasz Szybalski wrote:
>
>> Hello,
>> Would you guys know how to bypass this error I'm getting from the SGML parser?
>>
>> expected name token at '<! -- NEW PAGE -'
>>
>> Obviously the <! -- should be <!--
>>
>> How can I tell sgmlparser to move on and/or bypass invalid HTML?
>>
>>
>>   File "/usr/lib/python2.4/sgmllib.py", line 95, in feed
>>     self.goahead(0)
>>   File "/usr/lib/python2.4/sgmllib.py", line 165, in goahead
>>     k = self.parse_declaration(i)
>>   File "/usr/lib/python2.4/markupbase.py", line 95, in parse_declaration
>>     decltype, j = self._scan_name(j, i)
>>   File "/usr/lib/python2.4/markupbase.py", line 384, in _scan_name
>>     self.error("expected name token at %r"
>>   File "/usr/lib/python2.4/sgmllib.py", line 102, in error
>>     raise SGMLParseError(message)
>> sgmllib.SGMLParseError: expected name token at '<! -- NEW PAGE -
>>
>> thanks
>> Lucas
>> _______________________________________________
>> Chicago mailing list
>> Chicago at python.org
>> http://mail.python.org/mailman/listinfo/chicago
>>
>>
>
>