spencer.c wrote:
> I am using lxml to process some xhtml files.  The files have html character
> codes embedded in them.  For instance: &#39; rather than a '.  When I parse
> the files, edit them, and then write them back out, I want my edits to be
> the only changes in the output files, but lxml is replacing the character
> codes with the actual characters they are supposed to represent as well.
> So if I have:
> It& #39;s an example. <-- Space inserted to help readability.
> It is writing out:
> It's an example.  
> I've tried setting resolve_entities to False, like so:
> tree = etree.parse(input, etree.XMLParser(resolve_entities=False))
> But this seems to have no effect.
> Is there a way to tell lxml to ignore these and leave them as is?
> Thanks.
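
A likely reason resolve_entities=False seems to have no effect: &#39; is a numeric character reference, not a named entity reference, and as far as I know the XML parser always expands those to the character itself. One possible workaround, sketched below under that assumption, is to hide the references behind placeholder characters before parsing and put them back after serializing. The helper names (protect_char_refs, restore_char_refs) and the private-use marker characters are made up for illustration and are not part of lxml. The import falls back to the standard library's ElementTree, which behaves the same way for this case.

```python
import re

try:
    from lxml import etree
except ImportError:  # the stdlib parser shows the same behaviour here
    import xml.etree.ElementTree as etree

# Matches decimal (&#39;) and hexadecimal (&#x27;) character references.
CHAR_REF = re.compile(r"&#([0-9]+|[xX][0-9A-Fa-f]+);")

# Hypothetical markers: private-use code points assumed not to occur in the input.
OPEN, CLOSE = "\ue000", "\ue001"
PROTECTED = re.compile(OPEN + r"([0-9]+|[xX][0-9A-Fa-f]+)" + CLOSE)


def protect_char_refs(text):
    """Replace each &#N; with OPEN + N + CLOSE so the parser leaves it alone."""
    return CHAR_REF.sub(lambda m: OPEN + m.group(1) + CLOSE, text)


def restore_char_refs(text):
    """Turn the placeholders back into the original &#N; references."""
    return PROTECTED.sub(lambda m: "&#" + m.group(1) + ";", text)


source = "<p>It&#39;s an example.</p>"

# Plain round trip: the reference comes back as a literal apostrophe.
plain = etree.tostring(etree.fromstring(source), encoding="unicode")

# Protected round trip: parse, make an edit, serialize, restore.
root = etree.fromstring(protect_char_refs(source))
root.set("class", "edited")
output = restore_char_refs(etree.tostring(root, encoding="unicode"))

print(plain)
print(output)
```

Note that this works on the raw text, so it also rewrites references inside comments and CDATA sections, and it needs the whole file read into a string first rather than passed to etree.parse().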
