Mailman 3 [lxml-dev] Unicode behaviour of Element.text - lxml - The Python XML Toolkit

March 18, 2010

I tried to figure out the Unicode behaviour of Element.text. The lxml
documentation describes how parsing unicode data and serializing to
unicode work, but I cannot find any information on what type of string
Element.text returns. From what I can see, Element.text returns either
a str or a unicode instance, depending on whether the text contains
non-ASCII characters. That behaviour feels inconsistent, and for
applications that use unicode throughout it means every use of
Element.text has to be written as unicode(node.text), which is not very
pretty. Would it be possible to add an option to make the text
attribute always return a unicode instance?

Wichert.
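
Until such an option exists, the unicode(node.text) workaround can be
wrapped in one place. The following is only a minimal sketch: text_of
and its encoding parameter are hypothetical names, not part of lxml,
and the standard library's xml.etree.ElementTree is used as a stand-in
on the assumption that lxml elements expose the same .text attribute.
It is written so that it also runs where .text is already a text
string, in which case the value is passed through unchanged.

```python
import xml.etree.ElementTree as ET  # stand-in; lxml.etree is assumed to behave alike here


def text_of(node, encoding="ascii"):
    """Return node.text as a text (unicode) string, or None if there is no text.

    Hypothetical helper: byte strings are decoded with `encoding`,
    text strings are returned unchanged.
    """
    value = node.text
    if value is None:
        return None
    if isinstance(value, bytes):
        # Plain ASCII content may come back as a byte string; normalise it.
        return value.decode(encoding)
    return value


# One ASCII-only element, one with a non-ASCII character, one empty.
root = ET.fromstring(u"<root><a>plain</a><b>caf\u00e9</b><c/></root>".encode("utf-8"))

for child in root:
    # The type name is the same for every element that has text.
    print(child.tag, type(text_of(child)).__name__)
```

Unlike a bare unicode(node.text), the helper leaves None as None for
elements without text instead of turning it into the string u'None'.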

Participants: Wichert Akkerman, Stefan Behnel