Found a parsing bug in HTMLParser

Wojtek Walczak gminick at hacker.pl
Sun Feb 9 15:56:14 EST 2003


On Sun, 9 Feb 2003 18:06:56 +0100, Grzegorz Adam Hankiewicz wrote:
> I've found a bug in HTMLParser parsing some of my webpages. The
The bug is triggered by this line, which has no whitespace between the
href value and the title attribute:
   
        <a href="http://ss"title="pe">P</a>

The code responsible for the complaint is the check_for_whole_start_tag()
method of the HTMLParser class, lines 308 to 312:

            if next in ("abcdefghijklmnopqrstuvwxyz=/"
                        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
                # end of input in or before attribute value, or we have the
                # '/' from a '/>' ending
                return -1
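
To illustrate why that branch is reached, here is a minimal sketch. It is
not the module's code: START_TAG_PREFIX is a simplified, hypothetical
stand-in for the real start-tag pattern, and classify_start_tag() is a name
made up for this example. The only assumption it models is that every
attribute must be preceded by whitespace, so matching stops right after the
closing quote of the href value and the next character, "t", is a letter.

```python
import re

# Simplified, hypothetical stand-in for the parser's start-tag pattern.
# Assumption: whitespace is required before every attribute.
START_TAG_PREFIX = re.compile(r"""
    <[a-zA-Z][-.a-zA-Z0-9:_]*              # tag name
    (?:\s+                                 # whitespace required before an attribute
       [a-zA-Z_][-.:a-zA-Z0-9_]*           # attribute name
       (?:\s*=\s*                          # optional "= value" part
          (?:'[^']*'|"[^"]*"|[^\s>'"]+)
       )?
    )*
    \s*                                    # trailing whitespace
""", re.VERBOSE)

# Same character set as in the quoted check.
LETTERS = ("abcdefghijklmnopqrstuvwxyz=/"
           "ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def classify_start_tag(rawdata):
    """Classify the start tag at index 0 as complete, incomplete or malformed."""
    m = START_TAG_PREFIX.match(rawdata)
    if m is None:
        return "malformed"
    # Character immediately after the part the pattern could consume.
    nxt = rawdata[m.end():m.end() + 1]
    if nxt == ">":
        return "complete"
    if nxt == "" or nxt in LETTERS:
        # Corresponds to the "return -1" branch: treated as unfinished input.
        return "incomplete"
    return "malformed"


print(classify_start_tag('<a href="http://ss"title="pe">P</a>'))
print(classify_start_tag('<a href="http://ss" title="pe">P</a>'))
```

In this model the tag without the space is reported as incomplete, as if
more input were still expected, while the same tag with a space before
title is accepted.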

I don't want to change this, since I'm sure I would make HTMLParser weaker
in some other cases.

PS. I'll inform the people on the python-dev mailing list.

-- 
[ ] gminick (at) underground.org.pl  http://gminick.linuxsecurity.pl/ [ ]
[ "I just like the morning solitude, because that's when coffee tastes best." ]



