Sean 'Shaleh' Perry shalehperry at
Thu Oct 18 07:39:24 CEST 2001

On 18-Oct-2001 limodou wrote:
> Sometimes I use Python to analyse an HTML document. But I found that if
> a tag starts with '<!' rather than '<!--', sgmllib will treat it as a
> 'special' pattern. This is mostly OK, but it occasionally fails, because
> some people use '<!' tags as comments. I fixed it by treating all
> '<!' as comments, but this loses declarations like DOCTYPE. Does anyone
> have any ideas?

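In sgmllib, near the top of the module:
special = re.compile('<![^<>]*>')
and later, inside the parsing loop (abridged):
match = special.match(rawdata, i)
if match:
    if self.literal:
        # literal mode: pass the '<' through as plain data
        self.handle_data(rawdata[i])
        i = i+1
        continue
    # otherwise skip over the whole '<!...>' construct
    i = match.end(0)
    continue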

So sgmllib silently skips anything matching '<!...>'. If you want to handle <!DOCTYPE>, you have to do it yourself in a data handler.
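
One possible way to tell declarations apart from '<!' pseudo-comments is sketched below. It is standalone and does not touch sgmllib itself. The helper name classify_special and the declaration pattern are assumptions made up for illustration; only the special pattern comes from the code quoted above. Note that the special pattern stops at the first '>', so a comment containing '>' would be cut short.

```python
import re

# Same pattern as quoted above: "<!" up to the next ">" with no nested brackets.
special = re.compile(r'<![^<>]*>')

# Hypothetical pattern: "<!" followed directly by a declaration keyword.
declaration = re.compile(r'<!(DOCTYPE|ELEMENT|ATTLIST|ENTITY|NOTATION)\b',
                         re.IGNORECASE)


def classify_special(rawdata, i=0):
    """Classify the '<!...>' construct starting at rawdata[i].

    Returns (kind, end), where kind is 'comment', 'declaration' or
    'bogus-comment' and end is the index just past the closing '>'.
    Returns None if no '<!...>' construct starts at position i.
    """
    match = special.match(rawdata, i)
    if match is None:
        return None
    text = match.group(0)
    if text.startswith('<!--'):
        kind = 'comment'
    elif declaration.match(text):
        kind = 'declaration'
    else:
        # Anything else starting with '<!' is treated as a comment.
        kind = 'bogus-comment'
    return kind, match.end(0)
```

A parser subclass could call something like this where it currently skips the match, keeping the text when the kind is 'declaration' and dropping it otherwise.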

