HTML Parser chokes on WordHTML...
Steven Taschuk
staschuk at telusplanet.net
Sat May 3 00:05:46 EDT 2003
A couple of oversights in my previous comments:
Quoth I:
> Quoth Harald Massa:
[...]
> Strictly speaking, anything inside <!-- --> is a comment and the
> parser should ignore it.
This is mostly true in XML, excepting <![CDATA[ ... ]]> sections
(which are, however, rarely used in HTML).
And as Andrew Clover pointed out, in SGML it is possible for
elements to be implicitly CDATA, by virtue of a declaration to
that effect in the DTD.
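To illustrate the quoted rule about comments, here is a minimal sketch. It assumes Python 3's html.parser module (the successor to the HTMLParser module under discussion); the Recorder class and the sample markup are hypothetical, made up for this example. Markup inside the comment should reach the comment handler as plain text rather than being reported as a tag.

```python
from html.parser import HTMLParser


class Recorder(HTMLParser):
    """Hypothetical helper: records start tags and comment bodies as reported."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.comments = []

    def handle_starttag(self, tag, attrs):
        # Called only for real start tags, not for markup inside comments.
        self.tags.append(tag)

    def handle_comment(self, data):
        # Receives the text between <!-- and -->, unparsed.
        self.comments.append(data)


recorder = Recorder()
recorder.feed("<p>before<!-- <b>not a tag</b> -->after</p>")
recorder.close()
```

After this runs, recorder.tags should contain only the p element, and the b element should appear only inside the recorded comment text.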
[...]
> > again, <![if !suportLists]> does not look great, but should be legal
> > HTML - shouldn't it?
>
> No: <![if ...]> isn't legal HTML, so HTMLParser quite properly
> rejects it. The <! is legal only for starting a DOCTYPE
> declaration (and inside a DTD, which is not usually present in an
> HTML document).
... and for starting <![CDATA[ ... ]]>, and, in SGML, <![INCLUDE[
... ]]> and <![EXCLUDE[ ... ]]>, and perhaps other things I've
forgotten about. But again, these are rarely used in HTML.
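Since <![if ...]> is not legal HTML, one workaround is to strip those markers before handing the document to a parser. The following is only a sketch under stated assumptions: the function name strip_word_markers and the regular expression are hypothetical, not part of any library, and the pattern targets only the <![if ...]> and <![endif]> forms mentioned in this thread. It deliberately leaves <![CDATA[ ... ]]> and ordinary <!-- --> comments alone.

```python
import re

# Assumed pattern: matches Word-style markers such as <![if !supportLists]>
# and <![endif]>. It does not match <![CDATA[ or <!-- ... --> comments.
_WORD_MARKER = re.compile(r"<!\[(?:if\b[^\]]*|endif)\]>", re.IGNORECASE)


def strip_word_markers(html):
    """Remove the markers themselves, keeping the content between them."""
    return _WORD_MARKER.sub("", html)


sample = "<p><![if !supportLists]>1. <![endif]>First item</p>"
cleaned = strip_word_markers(sample)
print(cleaned)
```

A regular expression is a blunt tool for markup, so treat this as a preprocessing hack for one known kind of input rather than a general solution.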
--
Steven Taschuk Aral: "Confusion to the enemy, boy."
staschuk at telusplanet.net Mark: "Turn-about is fair play, sir."
-- _Mirror Dance_, Lois McMaster Bujold