[lxml-dev] Encoding again
Hi again! I'm unable to solve my encoding problems myself, so I'm asking here again in the hope that someone has a solution. Is there any way to force lxml to keep element.text and element.tail exactly the same as in the original text, without any encoding manipulation? Or to restore them to their original state? For example, maybe somewhere inside lxml there is a variable that contains the original encoding, so I could do element.text.encode('...')?
Hi, Max Ivanov wrote:
Is there any way to force lxml to keep element.text and element.tail exactly the same as in the original text, without any encoding manipulation? Or to restore them to their original state? For example, maybe somewhere inside lxml there is a variable that contains the original encoding, so I could do element.text.encode('...')?
I'm not sure I understand what you want, but in case you want lxml.etree to return the encoded byte string instead of the unicode string: no, there is no switch to do that. I have no idea why you would want to do that, though. The original encoding is stored in the docinfo property of the document's ElementTree, i.e. tree.docinfo.encoding.
Stefan
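For illustration, a minimal sketch of the docinfo approach, assuming Python 3 (where element.text and element.tail are always str) and a made-up ISO-8859-1 sample document. The variable names are hypothetical; only etree.parse, tree.docinfo.encoding and str.encode are relied on.

```python
from io import BytesIO

from lxml import etree

# Sample input (an assumption for illustration): a document declared as ISO-8859-1.
raw = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    "<root>caf\u00e9<child/>na\u00efve</root>"
).encode("iso-8859-1")

# Parse from bytes so lxml sees the XML declaration and records the encoding.
tree = etree.parse(BytesIO(raw))

# The encoding declared by the source document, as a string.
encoding = tree.docinfo.encoding

root = tree.getroot()

# lxml hands back decoded text; re-encode it with the original encoding
# to get byte strings matching the source document.
text_bytes = root.text.encode(encoding)
tail_bytes = root[0].tail.encode(encoding)

print(encoding, text_bytes, tail_bytes)
```

This only recovers the bytes of the character data. Entity and character references in the source are already resolved by the parser, so the result need not be byte-identical to what was written in the file.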
participants (2)
- Max Ivanov
- Stefan Behnel