[lxml-dev] [PATCH] etree.fromstring and encoding
Hi list,

I think ElementTree.fromstring expects a raw (byte) string as its argument, but the current lxml.etree.XML calls .encode("UTF-8") on it, which is a problem if the argument is already a UTF-8 encoded string. Hence the patch. I'm not sure what behaviour should be expected, though. If you have a clue, let me know :)

I can write more tests with various encoded strings and unicode objects as arguments, and compare ElementTree's behaviour with lxml's to better reproduce it in lxml. Just tell me if that would be useful or required.

Best,
-- Olivier
Hey Olivier,

Olivier Grisel wrote:
> I think ElementTree.fromstring expects a raw (byte) string as its argument, but the current lxml.etree.XML calls .encode("UTF-8") on it, which is a problem if the argument is already a UTF-8 encoded string. Hence the patch. I'm not sure what behaviour should be expected, though.
Thanks for the patch! I think this was actually a simple bug, and UTF-8 would always be expected. I've stuck with your patch for now, though, which also accepts unicode strings; I'm not sure whether I should remove that. It also depends on ElementTree's behavior.
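For readers of the archive, the idea under discussion (encode only when the input is a unicode string, and pass already-encoded bytes through untouched) might be sketched as follows. This is not the patch from the thread: `to_utf8_bytes` and `parse_xml` are hypothetical names, the code is written for Python 3, and it uses the standard library's ElementTree rather than lxml internals.

```python
import xml.etree.ElementTree as ET


def to_utf8_bytes(data):
    """Hypothetical helper: return bytes for either str or bytes input."""
    if isinstance(data, bytes):
        # Already encoded: pass through untouched instead of re-encoding.
        return data
    # Unicode text: encode exactly once.
    return data.encode("utf-8")


def parse_xml(data):
    """Hypothetical wrapper: parse XML given as bytes or as unicode text."""
    return ET.fromstring(to_utf8_bytes(data))


unicode_doc = "<root>caf\u00e9</root>"
byte_doc = unicode_doc.encode("utf-8")

# Both spellings of the same document should yield the same text.
assert parse_xml(byte_doc).text == parse_xml(unicode_doc).text
```

Whether unicode input should be accepted at all is the open question in this message; the sketch only shows the permissive variant.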
> I can write more tests with various encoded strings and unicode objects as arguments, and compare ElementTree's behaviour with lxml's to better reproduce it in lxml. Just tell me if that would be useful or required.
Such tests would still be appreciated! Regards, Martijn
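A comparison test of the kind proposed above could look roughly like this. It is only a sketch under stated assumptions: the sample inputs and the helper name `parsed_texts` are made up for illustration, the code targets Python 3, and lxml is treated as optional so the check against the standard library's ElementTree still runs without it.

```python
import xml.etree.ElementTree as ET

try:
    # lxml is optional here; the comparison is skipped if it is missing.
    from lxml import etree as lxml_etree
except ImportError:
    lxml_etree = None

# Illustrative inputs: the same document as UTF-8 bytes, as unicode text,
# and as Latin-1 bytes carrying an explicit encoding declaration.
SAMPLES = {
    "utf8_bytes": "<root>caf\u00e9</root>".encode("utf-8"),
    "unicode_text": "<root>caf\u00e9</root>",
    "latin1_bytes": (
        b"<?xml version='1.0' encoding='iso-8859-1'?>"
        b"<root>caf\xe9</root>"
    ),
}


def parsed_texts(fromstring):
    """Hypothetical helper: map each sample name to the root's text."""
    return {name: fromstring(doc).text for name, doc in SAMPLES.items()}


et_results = parsed_texts(ET.fromstring)

# Every input form should decode to the same text in ElementTree.
assert set(et_results.values()) == {"caf\u00e9"}

if lxml_etree is not None:
    # lxml should agree with ElementTree on each of these inputs.
    assert parsed_texts(lxml_etree.fromstring) == et_results
```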
Participants (2): Martijn Faassen, Olivier Grisel