nonstandard XML character entities?

Paul Rubin
Fri Apr 13 22:25:27 EDT 2007


I'm new to xml mongering so forgive me if there's an obvious
well-known answer to this.  It's not real obvious from the library
documentation I've looked at so far.  Basically I have to munch through a
bunch of xml files which contain character entities like &uacute;
which are apparently nonstandard.  They appear in w3.org tables but
xml.etree.cElementTree.ElementTree.parse barfs at them and xmllint
barfs at them.

Basically I want to know if there's a way to supply the regular parser
(preferably xml.etree but I guess I can switch to another one if
necessary) with some kind of entity table, and/or if the info is
supposed to be found in the DTD or someplace like that.  Right now I'm
ignoring the DTD and simply figuring out the doc structure by
eyeballing the xml files, maybe not a perfectly approved method but
it seems to be what most people do.
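
For illustration, one workaround is to skip the parser-level entity table
and rewrite the named entities as numeric character references before the
text reaches the parser.  The following is only a minimal sketch under some
assumptions: it targets Python 3, where the HTML entity table is available
as html.entities.name2codepoint (htmlentitydefs in Python 2); the helper
names expand_named_entities and parse_with_html_entities are made up for
this example; and the sample document is invented.  It does not touch the
DTD at all, and the regex would also rewrite matches inside CDATA sections
or comments.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser already understands; leave them alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named references such as &uacute; (numeric ones like &#250; do not match).
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_named_entities(xml_text):
    """Replace HTML-style named entities with numeric character references.

    Hypothetical helper: unknown names are left untouched, so the parser
    still reports them as errors.
    """
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _ENTITY_RE.sub(repl, xml_text)


def parse_with_html_entities(xml_text):
    """Parse XML text after expanding HTML named entities (hypothetical helper)."""
    return ET.fromstring(expand_named_entities(xml_text))


# Invented sample document for demonstration.
root = parse_with_html_entities("<doc><name>Per&uacute; &amp; Chile</name></doc>")
print(root.find("name").text)
```

For files on disk, the same idea would apply to the decoded file contents
before handing them to ET.fromstring.  Whether this is preferable to
declaring the entities in a DTD depends on how much of the original markup
needs to survive a round trip.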

Thanks



