Hello,

I need to calculate the SHA1 digest of an XML node. In order to do this correctly, I need to do C14N on the node. Specifically, http://www.w3.org/2001/10/xml-exc-c14n

Here is a minimal working example:

import io
import copy
from lxml import etree

ORIGINAL_XML = b'''<soap:Envelope xmlns:ns="http://docs.oasis-open.org/ws-sx/ws-trust/200512" xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
<soap:Header>
<wsse:Security xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1...." xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">
<wsu:Timestamp wsu:Id="TS-1a8236fc-8e3a-9b71-495f-20f52709e893">
<wsu:Created>2017-09-19T07:38:45Z</wsu:Created><wsu:Expires>2017-09-19T08:38:45Z</wsu:Expires>
</wsu:Timestamp>
</wsse:Security>
</soap:Header>
<soap:Body xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1..." wsu:Id="id-1ea12929-0b1a-54d4-08db-b049cee527b5">
<ns:RequestSecurityToken>
<ns:RequestType>http://docs.oasis-open.org/ws-sx/ws-trust/200512/Issue</ns:RequestType>
<ns:TokenType>http://docs.oasis-open.org/wss/oasis-wss-saml-token-profile-1.1#SAMLV2.0</ns:TokenType>
</ns:RequestSecurityToken>
</soap:Body>
</soap:Envelope>
'''.replace(b'\n', b'')

PREFIXES = ["soap", "wsse", "wsu"]

# This is known to be the "good" C14N version of the wsu:Timestamp node
GOOD_C14N_XML = b'''<wsu:Timestamp xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1...." xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1..." wsu:Id="TS-1a8236fc-8e3a-9b71-495f-20f52709e893"><wsu:Created>2017-09-19T07:38:45Z</wsu:Created><wsu:Expires>2017-09-19T08:38:45Z</wsu:Expires></wsu:Timestamp>'''

root_node = etree.fromstring(ORIGINAL_XML)
test_node = root_node.find(".//{*}Timestamp")  # This is the node to calculate digest for
output = io.BytesIO()
node = copy.copy(test_node)
node.getroottree().write(output, method="c14n", inclusive_ns_prefixes=PREFIXES,
                         exclusive=True, with_comments=False, pretty_print=False)
output.seek(0)
TEST_RESULT = output.read()
assert TEST_RESULT == GOOD_C14N_XML

The GOOD_C14N_XML contains the "good" canonical version of the Timestamp element. I say that it is "good" because for the ORIGINAL_XML, the C14N version created by a java program looks like this, and I must replicate this exact format. So by "good" I mean: this is the one that I need to replicate.

In this example program, the assertion fails. The GOOD_C14N_XML contains all namespace declarations that were given in the PREFIXES, *in that specific order*. The TEST_RESULT only contains the wsu namespace.

Questions:

* Is it possible to achieve my goal with lxml? E.g. create a C14N format that matches the GOOD_C14N_XML in every bit.
* I have noticed that the write method accepts pretty_print=True when write method is c14n. It has no effect. But shouldn't it throw an exception? (The same way it throws an exception when encoding is specified for c14n method.)

Thanks,

Laszlo
Nagy László Zsolt wrote on 22.09.2017 at 15:57:
> node = copy.copy(test_node)

This is your problem. You are copying a single node out of a document, which loses the unrelated namespace declarations of other nodes.
Instead of (deep-)copying and asking for the root-tree, just wrap the node in a new ElementTree() and call .write() on that.
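
The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: the tiny document, the "urn:a" / "urn:b" namespaces and the variable names are hypothetical stand-ins for the SOAP envelope from the original post, and the sketch does not claim byte-for-byte agreement with the Java output.

```python
import hashlib
import io

from lxml import etree

# Hypothetical stand-in document: prefix "b" is declared on the root but
# never used inside the <a:child> subtree.
XML = b'<root xmlns:a="urn:a" xmlns:b="urn:b"><a:child><a:leaf>x</a:leaf></a:child></root>'

root = etree.fromstring(XML)
node = root.find(".//{urn:a}child")

# Wrap the node itself (no copy), so it keeps its place in the original
# document and the namespace declarations of its ancestors stay in scope.
output = io.BytesIO()
etree.ElementTree(node).write(
    output,
    method="c14n",
    exclusive=True,
    inclusive_ns_prefixes=["a", "b"],
    with_comments=False,
)
result = output.getvalue()

# SHA1 digest of the canonical bytes, which was the goal of the original post.
digest = hashlib.sha1(result).hexdigest()
print(result)
print(digest)
```

Because the node is not copied, the prefixes listed in inclusive_ns_prefixes can still be resolved against the enclosing document when the subtree is serialised.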
> output.seek(0)
> TEST_RESULT = output.read()
This is just a complicated way of writing "output.getvalue()".
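
A small standard-library sketch of that equivalence (the buffer contents are an arbitrary placeholder, not taken from the original post):

```python
import io

buf = io.BytesIO()
buf.write(b"<a></a>")

# getvalue() returns the whole buffer regardless of the current position.
via_getvalue = buf.getvalue()

# The longer form: rewind to the start first, then read to the end.
buf.seek(0)
via_read = buf.read()

print(via_getvalue == via_read)
```

getvalue() also leaves the stream position untouched, so it is safe to call in the middle of a sequence of writes.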
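> I have noticed that the write method accepts pretty_print=True when write method is c14n. It has no effect. But shouldn't it throw an exception?

Ah, yes, it's ignored. I'll make it raise a warning for now. Thanks.

Stefan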
participants (2)
- Nagy László Zsolt
- Stefan Behnel