Is sgmllib.py's BUG?
Sean 'Shaleh' Perry
shalehperry at home.com
Thu Oct 18 07:39:24 CEST 2001
On 18-Oct-2001 limodou wrote:
> Sometimes I use Python to analyse an HTML document. But I found that if
> there is a tag starting with '<!' rather than '<!--', sgmllib will treat
> it as a 'special' pattern. That is mostly fine, but it occasionally fails,
> because some people use a '<!' tag as a comment. I fixed it by treating
> all '<!' tags as comments, but this loses declarations like DOCTYPE. Does
> anyone have any ideas?
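sgmllib.py has this at the start:

    special = re.compile('<![^<>]*>')

and later, where it scans the raw data:

    match = special.match(rawdata, i)
    ...
    i = i+1
    ...
    i = match.end(0)

So anything matching '<!...>' is stepped over. If you want to handle
<!DOCTYPE>, it needs to be done in a data handler.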
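
For illustration only, here is a rough sketch of one way to tell the cases apart before deciding what to skip. It reuses the 'special' pattern quoted above. The 'declaration' pattern and the classify() helper are my own hypothetical names, not part of sgmllib, and the rule "a declaration is '<!' followed directly by a letter" is an assumption that may not suit every document.

```python
import re

# Same pattern as in the sgmllib.py excerpt quoted above.
special = re.compile('<![^<>]*>')

# Hypothetical pattern (not in sgmllib): assume a declaration is '<!'
# followed immediately by a letter, e.g. <!DOCTYPE ...>.
declaration = re.compile(r'<![A-Za-z][^<>]*>')


def classify(rawdata, i=0):
    """Return (kind, end) for a '<!' construct starting at index i.

    kind is 'comment', 'declaration', 'pseudo-comment', or None when
    nothing complete is found at that position.
    """
    # Real comments are checked first, since 'special' can match them too.
    if rawdata.startswith('<!--', i):
        end = rawdata.find('-->', i + 4)
        if end == -1:
            return None, i
        return 'comment', end + 3
    match = declaration.match(rawdata, i)
    if match:
        return 'declaration', match.end(0)
    # Anything else of the form '<!...>' is treated as a pseudo-comment.
    match = special.match(rawdata, i)
    if match:
        return 'pseudo-comment', match.end(0)
    return None, i


samples = ['<!DOCTYPE html>', '<! just a remark >', '<!-- real comment -->']
results = [classify(s) for s in samples]
for sample, (kind, end) in zip(samples, results):
    print(sample, kind, end)
```

With something like this, a parser could pass 'declaration' matches on to its own handling and drop only the 'pseudo-comment' ones, rather than treating every '<!' the same way.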
We have buried the putrid corpse of Liberty. -- Benito Mussolini