Custom structured types with objectify.PyType?
Hi folks,

lxml.objectify lets us define custom types by means of objectify.PyType and ObjectifiedDataElement subclasses, e.g. as in [0]. It's nice how they map to XSD and automatically convert between the Python type and the XML representation.

I understand how it works for scalar leaf nodes. But does objectify.PyType work for structured/nested types as well? Or is PyType the wrong tool, and should one rather head over to "Generating XML with custom classes" [1] and "Custom element class lookup" [2]? Or is it not possible at all?

Simple imaginary code:

---snip---
from collections import namedtuple
from lxml import etree, objectify

# A structured python type
MyStructuredThing = namedtuple('MyStructuredThing', 'a b')

# some custom code and registration like with objectify.PyType here
# ...

root = objectify.Element("root")
# magically take python type and construct tree + leaf elements automagically
root.mystructuredthing = MyStructuredThing(1, 2)
etree.tostring(root, pretty_print=True)
---snap---

Expected output:

<root>
  <mystructuredthing>
    <a>1</a>
    <b>2</b>
  </mystructuredthing>
</root>

etree.fromstring should in turn deserialize this XML snippet, so that isinstance(root.mystructuredthing.pyval, MyStructuredThing) == True.

Thanks
Tobias

[0] https://lxml.de/api/lxml.tests.test_objectify-pysrc.html#ObjectifyTestCase.t...
[1] https://lxml.de/element_classes.html#generating-xml-with-custom-classes
[2] https://lxml.de/element_classes.html#custom-element-class-lookup
Hey,

On 2/9/22 17:51, Tobias Deiminger wrote:
But does objectify.PyType work for structured/nested types as well?
lxml.objectify is fast but limited. My library, spyne, could help. See: https://github.com/arskom/spyne/blob/master/examples/xml/polymorphism.py

It's actually a SOAP implementation, but it's quite modular, so you can use only what you need.

Hope it helps.

Best,
Burak
Hi Tobias,
lxml.objectify lets us define custom types by means of objectify.PyType and ObjectifiedDataElement subclasses, e.g. as in [0]. It's nice how they map to XSD and automatically convert between the Python type and the XML representation.
I understand how it works for scalar leaf nodes. But does objectify.PyType work for structured/nested types as well? Or is PyType the wrong tool and one should better head over to "Generating XML with custom classes" [1] and "Custom element class lookup" [2]? Or is it not possible at all?
It's not really suitable for structured types, since the __setattr__ mechanics and the PyType registry/lookup mechanisms basically put a text representation of the assigned value plus a type annotation attribute into the "underlying" XML tree. See https://lxml.de/objectify.html#how-data-types-are-matched and https://github.com/lxml/lxml/blob/ac829d561c0bf71fb8cc704305ffc18bd26c6abb/src/lxml/objectify.pyx#L491 for most of what there is to know about this.

That said, your options depend on how the parsing-from-XML and setting-objects-in-python-then-serialize-to-XML behaviour should "mirror" each other for your use case.
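To make the "text representation plus type annotation" point concrete, here is a minimal sketch of what a scalar assignment leaves in the tree. The element names `number` and `label` are made up for illustration; the attribute name is the py:pytype annotation in the objectify namespace that also shows up in the serialized output further down this thread.

```python
from lxml import etree, objectify

# Clark-notation name of the py:pytype annotation attribute.
PYTYPE = "{http://codespeak.net/lxml/objectify/pytype}pytype"

root = objectify.Element("root")
root.number = 5       # stored as the text "5", annotated as int
root.label = "spam"   # stored as the text "spam", annotated as str

# Each leaf holds only text; the Python type lives in the annotation.
for leaf in (root.number, root.label):
    print(leaf.tag, repr(leaf.text), leaf.get(PYTYPE))

print(etree.tostring(root, pretty_print=True).decode())
```

Since every leaf is just text plus one annotation, there is no obvious place for a nested Python object to go, which is why a structured value has to be turned into child elements some other way.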
Simple imaginary code:
---snip---
from collections import namedtuple
from lxml import etree, objectify

# A structured python type
MyStructuredThing = namedtuple('MyStructuredThing', 'a b')

# some custom code and registration like with objectify.PyType here
# ...

root = objectify.Element("root")
# magically take python type and construct tree + leaf elements automagically
root.mystructuredthing = MyStructuredThing(1, 2)
etree.tostring(root, pretty_print=True)
Since a namedtuple is still a tuple, this would trigger special-cased sequence assignment (details here: https://github.com/lxml/lxml/blob/ac829d561c0bf71fb8cc704305ffc18bd26c6abb/src/lxml/objectify.pyx#L474 ):

>>> root = objectify.Element('root')
>>> root.x = (1, 2, 3)
>>> print(etree.tostring(root))
b'<root xmlns:py="http://codespeak.net/lxml/objectify/pytype" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" py:pytype="TREE"><x py:pytype="int">1</x><x py:pytype="int">2</x><x py:pytype="int">3</x></root>'

Of course, you could always override __setattr__ in a custom subclass and special-case your structured datatype:

>>> E = objectify.E
>>> class MyElement(objectify.ObjectifiedElement):
...     def __setattr__(self, name, value):
...         if isinstance(value, Structured):
...             value = E.structured(E.a(value.a), E.b(value.b))
...         objectify.ObjectifiedElement.__setattr__(self, name, value)
...
>>> root = MyElement('root')
>>> root.structured = Structured(a=1, b=2)
>>> print(etree.tostring(root))
b'<MyElement>root<structured xmlns:py="http://codespeak.net/lxml/objectify/pytype" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><a py:pytype="int">1</a><b py:pytype="int">2</b></structured></MyElement>'

That won't get you creation of Structured objects when parsing - you'd need custom element class lookup for such stuff, and basically an Element class (distinct from your namedtuple) that represents your structured datatype. Note how parsing from XML doesn't give you built-in Python datatypes but objectified representatives that behave (very much) like the built-ins. (See Advanced element class lookup here: https://lxml.de/objectify.html#how-data-types-are-matched)

I'd probably forgo all this and simply use the glorious E-Factory to create structured data in assignments where needed:
>>> root = objectify.Element('root')
>>> root.structured = E._(E.a(1), E.b(2))
>>> print(etree.tostring(root))
b'<root xmlns:py="http://codespeak.net/lxml/objectify/pytype" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" py:pytype="TREE"><structured><a py:pytype="int">1</a><b py:pytype="int">2</b></structured></root>'
(You can even build a vocabulary if you wish, like some mini-DSL, see: https://lxml.de/tutorial.html#the-e-factory)

I.e. create the structured ObjectifiedElement "from the outside", not implicitly in the assignment.

Best,
Holger
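The two halves discussed in this reply (building the subtree "from the outside" with the E-Factory, and getting a structured object back when parsing via a custom element class lookup) can be combined into a rough round-trip sketch. Everything named here is hypothetical and not part of lxml: the `Structured` namedtuple, the `to_element` helper, the `StructuredElement` class with its `to_python` method, and `StructuredLookup`. The lookup matches on the tag name "structured" only, which is an assumption made to keep the example short.

```python
from collections import namedtuple
from lxml import etree, objectify

E = objectify.E

# Hypothetical structured type, mirroring the example in this thread.
Structured = namedtuple("Structured", "a b")


def to_element(value):
    """Build the <structured> subtree explicitly, outside of __setattr__."""
    return E.structured(E.a(value.a), E.b(value.b))


class StructuredElement(objectify.ObjectifiedElement):
    """Element class handed out for <structured> nodes when parsing."""

    def to_python(self):
        # The children are objectified leaves, so convert them explicitly.
        return Structured(int(self.a), int(self.b))


class StructuredLookup(etree.CustomElementClassLookup):
    def lookup(self, node_type, document, namespace, name):
        if node_type == "element" and name == "structured":
            return StructuredElement
        return None  # defer to the fallback lookup


# Python -> XML: convert first, then assign the resulting element.
root = objectify.Element("root")
root.structured = to_element(Structured(a=1, b=2))
xml = etree.tostring(root)

# XML -> Python: objectify parser with the custom lookup in front of
# the regular objectify lookup.
parser = objectify.makeparser()
parser.set_element_class_lookup(
    StructuredLookup(objectify.ObjectifyElementClassLookup())
)
parsed = objectify.fromstring(xml, parser)
thing = parsed.structured.to_python()
print(thing)
```

The conversion back to the namedtuple is an explicit method call here rather than something objectify does on its own, which matches the point above that parsing yields Element classes, not the original Python types.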
Hi Burak and Holger,

thank you both a lot for the good pointers and explanations. It definitely helps, and I'll need some time to dig in.

On 09.02.2022 at 19:28, jholg@gmx.de wrote:
Since a namedtuple is still a tuple this would trigger special-cased sequence assignment
The namedtuple was contrived; in practice it's more about nested classes.
I'd probably forgo all this and simply use the glorious E-Factory to create structured data in assignments where needed
Tried E and am quite happy. I just discovered that it automatically ignores None arguments - that's very useful when Python types have lots of optional attributes. Now I can simply write something like this:

element = E.structured(
    E.a(mytype.a) if mytype.a else None,
    E.b(mytype.b) if mytype.b else None,
)

One new question: is it possible to enforce an XSD type when using E-Factories? Like in this, again imaginary, mini-DSL approach:

ITEM_DATE = E(xsi_type="dateTime").item_date
ITEM_DATE(datetime.datetime.now())  # works, uses previously registered custom objectify.PyType without guessing
ITEM_DATE("foo")  # raises Exception because value can't be converted to xsd:dateTime

Cheers
Tobias
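A small sketch of the None-skipping pattern described above, assuming objectify.E is the factory in use. `MyType` and `build` are hypothetical names for illustration. The one change from the snippet in the message is testing `is not None` instead of truthiness, since a plain `if mytype.a` would also drop legitimate falsy values such as 0 or an empty string.

```python
from dataclasses import dataclass
from typing import Optional

from lxml import etree, objectify

E = objectify.E


@dataclass
class MyType:
    # Hypothetical type with optional attributes.
    a: Optional[int] = None
    b: Optional[int] = None


def build(mytype):
    # None arguments are skipped by the factory when other children are
    # present, so unset attributes simply produce no child element.
    return E.structured(
        E.a(mytype.a) if mytype.a is not None else None,
        E.b(mytype.b) if mytype.b is not None else None,
    )


element = build(MyType(a=0))  # a is falsy but set, b is unset
tags = [child.tag for child in element.iterchildren()]
print(tags)
print(etree.tostring(element).decode())
```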