DOCTYPE + SAX
Alain Ketterlin
alain at dpt-info.u-strasbg.fr
Sat Apr 9 11:47:15 EDT 2011
jdownie <jdownie at gmail.com> writes:
> I'm trying to get xml.sax to interpret a file that begins with…
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
>   "http://www.w3.org/TR/html4/loose.dtd">
>
> After a while I get...
>
> http://www.w3.org/TR/html4/loose.dtd:31:2: error in processing
> external entity reference
>
> …although…
>
> time curl http://www.w3.org/TR/html4/loose.dtd
> [works]
You're mistaken. There is no problem fetching the file; the problem
occurs while parsing it. At line 31 of the DTD there is a comment inside
an entity declaration, which SGML allows but XML does not.
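
To see the difference on a small scale, here is a minimal sketch. The two declarations are written in the style of the HTML 4.01 DTD and are illustrative only, not copied from loose.dtd, and `is_well_formed` is a helper name made up for this example.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# SGML style: the comment sits inside the declaration, between "--" pairs.
SGML_STYLE = (
    b'<!DOCTYPE doc [\n'
    b'<!ENTITY % ContentType "CDATA" -- media type -->\n'
    b']>\n'
    b'<doc/>'
)

# XML style: the declaration is closed first, the comment stands on its own.
XML_STYLE = (
    b'<!DOCTYPE doc [\n'
    b'<!ENTITY % ContentType "CDATA"> <!-- media type -->\n'
    b']>\n'
    b'<doc/>'
)


def is_well_formed(data):
    """Return True if xml.sax parses data without a parse error."""
    try:
        xml.sax.parseString(data, ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False
```

With the default expat-based parser, `is_well_formed(XML_STYLE)` should succeed and `is_well_formed(SGML_STYLE)` should fail, which is the same kind of failure you get from the external DTD.
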
You're trying to use HTML's SGML DTD in an XML document. Point your
doctype at XHTML's DTD instead, and everything will be fine (hopefully).
BTW, your installation will probably let you use a locally cached copy
of the DTD, instead of fetching the file at every parse. How this works
depends on the parser you use.
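
With xml.sax, one way to do this is an entity resolver. The following is only a sketch under assumptions: `LocalDTDResolver`, `TextCollector`, `parse_text` and the one-line `STUB_DTD` are names and data invented for the example (a real setup would read the full XHTML DTD from a local file), and the `feature_external_ges` line is there because recent Python versions do not load external entities unless asked to.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

XHTML_PUBLIC_ID = "-//W3C//DTD XHTML 1.0 Transitional//EN"

# Hypothetical stand-in for a locally cached DTD, trimmed to one entity.
STUB_DTD = b'<!ENTITY nbsp "&#160;">\n'

DOCUMENT = (
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    b'"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    b'<html><body>a&nbsp;b</body></html>'
)


class LocalDTDResolver(EntityResolver):
    """Answer DTD requests from a local catalog keyed by public ID."""

    def __init__(self, catalog):
        self.catalog = catalog   # public ID -> DTD content as bytes
        self.requests = []       # (publicId, systemId) pairs seen so far

    def resolveEntity(self, publicId, systemId):
        self.requests.append((publicId, systemId))
        data = self.catalog.get(publicId)
        if data is None:
            # Not in the catalog: let the parser open systemId itself.
            return systemId
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(data))
        return source


class TextCollector(ContentHandler):
    """Collect character data so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(document, resolver):
    """Parse document (bytes) and return its text content."""
    parser = xml.sax.make_parser()
    # Ask the parser to load external entities, including the DTD.
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(resolver)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(document))
    return "".join(handler.chunks)
```

Calling `parse_text(DOCUMENT, LocalDTDResolver({XHTML_PUBLIC_ID: STUB_DTD}))` goes through the resolver rather than the network for the DTD. Other parsers have their own mechanisms (XML catalogs, for instance), so take this as one option and not the only one.
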
-- Alain.