Xpath extension functions and return values
Hi all

I am starting to use Xpath Extension Functions. They work, and they are great, but the results are not quite what I was expecting.

If the return value is a string, it returns a <class 'lxml.etree._ElementUnicodeResult'>, which I assume is a smart string. I am happy with this, and I do realise that I can disable this and get back a normal string if desired.

If the return value is an integer, it returns a float. Not great, but I can live with that.

If the return value is None, it returns an empty list, which seems odd.

Just curious if there is an underlying reason for the last two.

What follows is the code I used to produce these results.

Thanks for any insight.

Frank Millman

P.S. I just experimented with wrapping the return value with json.dumps(), and unwrapping it with json.loads() on return. It works, and I get the results I expect, provided I disable smart strings. So this is an option if I felt strongly about it - not sure yet.
>>> from lxml import etree
>>>
>>> D = dict()
>>> D['a'] = 'hallo'
>>> D['b'] = 99
>>> D['c'] = None
>>>
>>> def get_val(context, key):
...     return D[key]
...
>>> ns = etree.FunctionNamespace(None)
>>> ns['get_val'] = get_val
>>>
>>> xml = etree.fromstring('<w><x>a</x><y>b</y><z>c</z></w>')
>>>
>>> ans = xml.xpath('get_val(string(x))')
>>> print(ans)
hallo
>>> print(type(ans))
<class 'lxml.etree._ElementUnicodeResult'>
>>>
>>> ans = xml.xpath('get_val(string(y))')
>>> print(ans)
99.0
>>> print(type(ans))
<class 'float'>
>>>
>>> ans = xml.xpath('get_val(string(z))')
>>> print(ans)
[]
>>> print(type(ans))
<class 'list'>
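For reference, the smart-string behaviour mentioned in the post can be switched off per call. The following is a minimal sketch, assuming lxml is installed; the `VALUES` dictionary is a cut-down, hypothetical stand-in for `D` above, and the function is registered in the default (empty) function namespace exactly as in the original code.

```python
from lxml import etree

# Hypothetical, reduced version of the lookup table from the post.
VALUES = {'a': 'hallo'}

def get_val(context, key):
    # Extension function: look up the key passed in from the XPath expression.
    return VALUES[key]

ns = etree.FunctionNamespace(None)
ns['get_val'] = get_val

xml = etree.fromstring('<w><x>a</x></w>')

# Default: string results come back as "smart strings" (a str subclass).
smart = xml.xpath('get_val(string(x))')

# With smart_strings=False the result is a plain Python str.
plain = xml.xpath('get_val(string(x))', smart_strings=False)

print(type(smart).__name__, type(plain).__name__)
```

Because a smart string is a `str` subclass, most code can use either form interchangeably; disabling it mainly matters when the exact type is checked or when the extra parent reference is unwanted.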
Frank Millman wrote on 23.03.2017 at 11:48:
I am starting to use Xpath Extension Functions. They work, and they are great, but the results are not quite what I was expecting.
If the return value is a string, it returns a <class 'lxml.etree._ElementUnicodeResult'>, which I assume is a smart string. I am happy with this, and I do realise that I can disable this and get back a normal string if desired.
If the return value is an integer, it returns a float. Not great, but I can live with that.
That's because XPath 1.0 actually defines numbers as floating point numbers.

https://www.w3.org/TR/xpath/#numbers

Obviously, lxml could check if a result value is exactly an integer and return it as an int in that case, but that would a) counter the definition in XPath, b) be an unsafe assumption in case the user code really meant to deal with floating point values, and c) suggest an incorrect accuracy for large numbers, e.g.

>>> 3**50
717897987691852588770249
>>> int(3.0**50)
717897987691852578422784

So, while this might seem handy from an API perspective in many cases, it has its drawbacks.
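If integer results are wanted on the calling side anyway, the conversion can be done in user code after the XPath call, where the caller knows what the value means. The sketch below is only an illustration: `as_int_if_exact` is a hypothetical helper, not part of lxml, and it assumes that whole-number floats below 2**53 (the range in which a double represents every integer exactly) are safe to convert.

```python
def as_int_if_exact(value):
    """Hypothetical helper: turn a whole-number float into an int.

    Only floats with magnitude below 2**53 are converted, because beyond
    that point a double can no longer represent every integer exactly.
    """
    if isinstance(value, float) and value.is_integer() and abs(value) < 2**53:
        return int(value)
    return value

small = as_int_if_exact(99.0)      # whole number in the safe range -> int
large = as_int_if_exact(3.0**50)   # too large to trust -> left as float
fraction = as_int_if_exact(1.5)    # not a whole number -> left as float

# The precision loss from the example above: the two values differ.
exact = 3**50
via_float = int(3.0**50)
print(small, type(large).__name__, fraction, exact == via_float)
```

This keeps the XPath semantics untouched and moves the "is this really an integer?" decision to the one place that can answer it.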
If the return value is None, it returns an empty list, which seems odd.
I'm actually not sure right now if anything can be done about this, but lxml represents None as an empty node set in XPath, and then converts that to an empty Python list on the way out.
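On the calling side, the empty list can be mapped back to None after the XPath call. This is a sketch under a clear assumption: the caller knows the expression is a plain extension-function lookup, since a genuinely empty node set from an ordinary XPath expression looks exactly the same. `none_if_empty` is a hypothetical helper, not an lxml function.

```python
def none_if_empty(result):
    """Hypothetical helper: treat an empty XPath result list as None.

    Assumption: the caller only applies this to expressions where an
    empty node set can only have come from a None return value.
    """
    if isinstance(result, list) and not result:
        return None
    return result

missing = none_if_empty([])       # what a None return value arrives as
text = none_if_empty('hallo')     # other result types pass through unchanged
number = none_if_empty(99.0)
print(missing, text, number)
```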
I just experimented with wrapping the return value with json.dumps(), and unwrapping it with json.loads() on return. It works, and I get the results I expect, provided I disable smart strings. So this is an option if I felt strongly about it – not sure yet.
That obviously works in this specific case, because it only ever passes strings. But it makes it impossible to use the values in any meaningful way inside of XPath and the necessary conversions can also slow down the overall processing considerably.

Stefan
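To illustrate the point about JSON wrapping, the sketch below shows what the XPath engine would actually receive for each value in the original dictionary. It uses only the standard `json` module; the comments about XPath behaviour follow from the XPath 1.0 rules for strings and are not output of the code itself.

```python
import json

# The lookup table from the original post.
D = {'a': 'hallo', 'b': 99, 'c': None}

# What the extension function would return if it wrapped values in JSON:
# every value becomes a string as far as XPath is concerned.
encoded = {key: json.dumps(value) for key, value in D.items()}

# 'hallo' is now the 7-character string '"hallo"' (quotes included), so an
# XPath comparison against 'hallo' would no longer match.
# None is now the non-empty string 'null', which is true in an XPath
# boolean context, whereas an empty node set is false.
print(encoded)

# The round trip on the Python side restores the original values and types.
decoded = {key: json.loads(text) for key, text in encoded.items()}
print(decoded == D, type(decoded['b']).__name__)
```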
From: Stefan Behnel
Sent: Friday, March 24, 2017 8:57 AM
To: lxml@lxml.de
Subject: Re: [lxml] Xpath extension functions and return values

Frank Millman wrote on 23.03.2017 at 11:48:
I just experimented with wrapping the return value with json.dumps(), and unwrapping it with json.loads() on return. It works, and I get the results I expect, provided I disable smart strings. So this is an option if I felt strongly about it – not sure yet.
That obviously works in this specific case, because it only ever passes strings. But it makes it impossible to use the values in any meaningful way inside of XPath and the necessary conversions can also slow down the overall processing considerably.
Valuable info, thanks. My current usage is very simplistic, but clearly there is a wider context that I was unaware of.

I will proceed with caution :-)

Frank
Participants (2): Frank Millman, Stefan Behnel