python & xml question

Martin v. Loewis martin at
Sat Aug 3 12:37:18 CEST 2002

"jano" <jnana4 at> writes:

> I am just getting started with using python for xml.  I have been reading
> 'Python & XML', but it doesn't explain the EntityResolver interface well
> enough for me to know how to use it.  I am using sax to parse an xml file,
> but the program halts when it gets to the first '&mdash;'

Do you have a DOCTYPE declaration in the document? That might be the
easiest approach: add a DOCTYPE that declares mdash; the parser should
then replace it automatically.
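
A minimal sketch of the DOCTYPE approach, assuming the xml.sax module
from today's Python 3 standard library rather than the PyXML of the
time. The document, the element name doc and the TextCollector class
are made up for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Illustrative document: the internal subset declares mdash, so the
# parser can expand the reference without any outside help.
DOC_WITH_DOCTYPE = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY mdash "&#8212;">
]>
<doc>before&mdash;after</doc>"""


class TextCollector(ContentHandler):
    """Collects all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser may deliver text in several pieces.
        self.chunks.append(content)


collector = TextCollector()
xml.sax.parseString(DOC_WITH_DOCTYPE, collector)
doctype_text = "".join(collector.chunks)
print(repr(doctype_text))
```

The collected text should contain the em dash (U+2014) where the
document had the entity reference.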

In fact, when you have control over the XML file, it might be even
easier to write &#8212;, then no resolution is needed at all.
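
A small sketch of the character-reference variant, again assuming the
Python 3 standard library xml.sax. The document and the class name
CharCollector are illustrative only.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# No DOCTYPE at all: a numeric character reference is always understood.
DOC_WITH_CHARREF = b"""<?xml version="1.0"?>
<doc>before&#8212;after</doc>"""


class CharCollector(ContentHandler):
    """Collects all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


char_collector = CharCollector()
xml.sax.parseString(DOC_WITH_CHARREF, char_collector)
charref_text = "".join(char_collector.chunks)
print(repr(charref_text))
```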

> How do I specify what that should resolve to? I thought that
> DeclHandler.internalEntityDecl(name, value) seemed like it might be what I
> am looking for, but I can't get that to work.

That won't work. The internalEntityDecl handler is only a notification:
the parser invokes it every time it finds an internal entity
declaration inside the document, such as

<!ENTITY mdash "&#8212;"> <!-- em dash, U+2014 ISOpub -->

(actually, it might not be invoked when the parser does not support
that event).

Instead, you need to provide an EntityResolver; the parser will then
invoke resolveEntity when it encounters an entity supposedly in the
external subset.

Notice that this didn't work until PyXML 0.8.
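
A hedged sketch of the EntityResolver approach, assuming the expat-based
xml.sax reader from the Python 3 standard library (not PyXML). The
system identifier "entities.dtd", the element name doc and the class
names are made up. In recent Python versions the external general
entities feature is off by default, so the sketch switches it on;
without that, the resolver is not consulted.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical document: "entities.dtd" is a placeholder that is never
# fetched, because the resolver below answers for it.
DOC_EXTERNAL = b"""<?xml version="1.0"?>
<!DOCTYPE doc SYSTEM "entities.dtd">
<doc>before&mdash;after</doc>"""

# Declarations served in place of the external subset.
LOCAL_DTD = b'<!ENTITY mdash "&#8212;">\n'


class LocalResolver(EntityResolver):
    """Answers every external entity request with LOCAL_DTD."""

    def __init__(self):
        self.requested = []

    def resolveEntity(self, publicId, systemId):
        self.requested.append(systemId)
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(LOCAL_DTD))
        return source


class ResolvedText(ContentHandler):
    """Collects all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


resolver = LocalResolver()
resolved = ResolvedText()

parser = xml.sax.make_parser()
parser.setFeature(feature_external_ges, True)  # let the parser ask the resolver
parser.setContentHandler(resolved)
parser.setEntityResolver(resolver)
parser.parse(io.BytesIO(DOC_EXTERNAL))

resolved_text = "".join(resolved.chunks)
print(resolver.requested, repr(resolved_text))
```

Returning an InputSource from resolveEntity keeps the lookup local; a
resolver could equally return a file name or URL as a string, in which
case the parser opens it itself.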

