Hi,

Aloys Baillet wrote:
> I was planning on upgrading to a recent version of lxml but found that our code was failing in numerous places with None objects found in unexpected places. In lxml 2+ the findtext method will ignore the default and return None if the element is found but the text is empty.
Thanks. This change was introduced in ElementTree 1.3 and lxml 2.0. Note that _elementpath.py is mostly a copy of ElementPath.py in ET, except for some minor adaptations and Py3 fixes.
> In elementtree and lxml before 2 the findtext method would never return None, if the element is found but empty it would return the default.
This is not true. ET 1.2 (and thus lxml <= 1.3) returned an empty string instead, which wasn't necessarily the default either. So, for ET 1.2 compatibility, it should return an empty string if the text is empty, and the 'default' value (which is None if not passed!) when the element is not found.

I wonder why the default is None, though. If the function is supposed to avoid checks on the user side by always returning a string value, the default should be the empty string as well.

Plus, lxml.etree knows the difference between an empty string text value ('') and no text content (None). So this would blur things in one place while keeping them transparent in all others.

Fredrik, do you have any comments on this?

Stefan
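
To illustrate the ET 1.2-compatible semantics described above, here is a minimal sketch. It is not the code in _elementpath.py or ElementPath.py; the helper name findtext_compat is hypothetical, and the example uses the standard library's xml.etree.ElementTree only so that it runs without lxml installed. Since it relies only on find() and .text, it should also work on lxml.etree elements, though that is an assumption rather than something tested here.

```python
import xml.etree.ElementTree as ET


def findtext_compat(elem, path, default=None):
    """Hypothetical helper sketching ET 1.2-style findtext semantics.

    - element not found          -> return `default` (None if not passed)
    - element found, no text     -> return '' (never None)
    - element found, with text   -> return the text
    """
    child = elem.find(path)
    # Compare against None explicitly: an Element's truth value depends
    # on whether it has children, not on whether it was found.
    if child is None:
        return default
    return child.text or ""


root = ET.fromstring("<root><empty/><full>text</full></root>")

print(repr(findtext_compat(root, "empty", "dflt")))    # found, but no text
print(repr(findtext_compat(root, "missing", "dflt")))  # not found
print(repr(findtext_compat(root, "full", "dflt")))     # found, with text
```

Code that needs to tell an empty string ('') apart from no text content (None) can still call find() and inspect .text directly instead of going through such a wrapper.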