
I'm writing an XML-transforming script with lxml, and I've run into two unexpected behaviors:

First, ElementMaker doesn't accept the string results of xpath expressions:

>>> import io
>>> from lxml import etree
>>> from lxml.builder import E
>>> elem = etree.parse(io.StringIO('<root><node>text</node></root>'))
>>> E.b(elem.xpath('string(node)'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/builder.py", line 220, in __call__
    raise TypeError("bad argument type: %r" % item)
TypeError: bad argument type: 'text'
>>> type(elem.xpath('string(node)'))
<class 'lxml.etree._ElementUnicodeResult'>

This happens because https://github.com/lxml/lxml/blob/master/src/lxml/builder.py#L215 looks up the exact type of the argument in a dict, rather than doing something that respects _ElementUnicodeResult's inheritance from str. Is this the right behavior for some reason I haven't thought of, or is it just an oversight? The workaround is straightforward (just pass the xpath result through str()), but it seems more verbose than should be necessary.

Second, ElementMaker doesn't accept lists at all:

>>> etree.tostring(E.b(elem.xpath('node')))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/builder.py", line 220, in __call__
    raise TypeError("bad argument type: %r" % item)
TypeError: bad argument type: [<Element node at 0x106eb8d70>]

Accepting lists would also be useful for building HTML documents as E.section("Header text", function_returning_element_sequence(), "Footer text"). The workarounds for this are straightforward too:

>>> etree.tostring(E.b(elem.xpath('node')[0]))
b'<b><node>text</node></b>'
>>> etree.tostring(E.b(*elem.xpath('node')))
b'<b/>'

... oops. Looks like ElementMaker re-parents nodes passed to it rather than copying them when they already have a parent. That's a third surprise, although it makes sense if ElementMaker was only intended for building entirely new documents, rather than copying bits from existing documents.

This is with python-3.2.2, lxml-2.3.0, and libxml-2.7.8. Are these worth filing bugs/feature requests about?

Thanks,
Jeffrey
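
P.S. To make the first point concrete, here is a minimal toy sketch of the difference between an exact-type dict lookup and one that respects inheritance. It is not lxml's code: TextResult, TYPEMAP, handle_text, lookup_exact and lookup_with_inheritance are all names I made up for illustration, with TextResult standing in for _ElementUnicodeResult.

```python
class TextResult(str):
    """Hypothetical stand-in for a str subclass such as _ElementUnicodeResult."""


def handle_text(item):
    # Placeholder handler; a real builder would attach the text to an element.
    return "text:" + item


# Handlers keyed by exact type (an assumed simplification of the real typemap).
TYPEMAP = {str: handle_text}


def lookup_exact(item):
    """Find a handler by exact type only; subclasses of str are missed."""
    return TYPEMAP.get(type(item))


def lookup_with_inheritance(item):
    """Walk the MRO so that a subclass falls back to its base-class handler."""
    for cls in type(item).__mro__:
        handler = TYPEMAP.get(cls)
        if handler is not None:
            return handler
    return None


value = TextResult("text")
print(lookup_exact(value))                     # None: exact lookup misses the subclass
print(lookup_with_inheritance(value)(value))   # text:text
print(lookup_exact(str(value))(str(value)))    # text:text (the str() workaround)
```

Walking the MRO is only one possible approach; an isinstance() check against each registered type would presumably work as well.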