lxml.objectify2: More fun, namespaces, pythonic
Dear lxml users!

I am developing lxml.objectify2 (lxml.o2). lxml.o2 has three objectives:

* making lxml more pythonic
* introducing robust namespaced properties
* making lxml more fun

*Following the old ways*

Imagine the following XML file.

xml_str = '''\
<obj:root xmlns:obj="objectified" xmlns:other="otherNS">
  <obj:c1 a1="A1" a2="A2" other:a3="A3">
    <obj:c2>0</obj:c2>
    <obj:c2>1</obj:c2>
    <obj:c2>2</obj:c2>
  </obj:c1>
  <obj:c1>
    <other:c2>3</other:c2>
    <other:c2>5</other:c2>
    <obj:c2>2</obj:c2>
  </obj:c1>
  <obj:c1>
    <other:c2>42</other:c2>
  </obj:c1>
</obj:root>'''

Please notice that the tags obj:c1, obj:c2 and other:c2 each occur as multiple children with the same {ns}name.

Here is a glance at the data processed by lxml.o (standard lxml.objectify) from the PyCharm IDE perspective.

https://backend.datenadler.de/kram/bildschirmfoto-vom-2022-03-07-23-02-33.pn...

You may notice that there is no multiplicity at all. lxml.o is quite limited and not really pythonic. Therefore any Python IDE will struggle to represent data processed by lxml.

*Following the new ways*

Let's use lxml.objectify2 instead.

from lxml import etree
# Assumption: the original post does not show where the lookup class is
# imported from; the standard lxml.objectify class is used here.
from lxml.objectify import ObjectifyElementClassLookup
from lxml.objectify2 import ObjectifiedElement2

# Make the parser build ObjectifiedElement2 nodes instead of plain elements.
obj2_lookup = ObjectifyElementClassLookup(tree_class=ObjectifiedElement2)
parser = etree.XMLParser()
parser.set_element_class_lookup(obj2_lookup)
node = etree.XML(xml_str, parser=parser)

A look from the PyCharm debugger into the data structure processed by lxml.o2:

https://backend.datenadler.de/kram/bildschirmfoto-vom-2022-03-07-22-34-10.pn...

As you can see, lxml.o2 handles multiple children with the same qualified tag by assigning an "[index]" to them.

*<rant>Yeah, that is nice screenwork, but this will never work in code?</rant>*
node.obj_c1[2].obj_c2
[3]
Here the call to node.obj_c1 returns a list. Python then takes over and picks the requested element by its index.

*<rant>OK, but this will not work with getattr</rant>*
getattr(node, 'obj_c1[0]').obj_c2
[0, 1, 2]

Here lxml.o2 does the selection of element [0] really fast in C space.

*<rant>OK, and where is the catch?</rant>*

To implement this functionality we need to ensure that the user follows two rules.

1) If there are elements without a namespace, a default namespace has to be defined.
2) Any access to a "tag" has to be qualified, with the exception of the default namespace.

node.<namespace>_<name>
with a default namespace: node.<name>

If these rules are too much for you, go back to lxml.objectify and be happy.

*<rant>Ah, go away. Where do you find such nice XML?</rant>*

Mh. In the real world I have never seen XML documents as simple as those in the lxml.objectify tests. But I am aware that lxml.o2 will have to be tested thoroughly.

*<rant>You will never convince all the users of lxml to change to lxml.o2</rant>*

That is true. But I am not even trying. lxml.o2 is an alternative to lxml.o for certain use cases.

You are welcome to rant at me :-)

You are also welcome to help with the development of lxml.o2. This is a spare-time job for me. If you do not have the time to help, you may express your liking of lxml.o2 here.

lxml.o2 lives at https://github.com/Inqbus/lxml in the branch https://github.com/Inqbus/lxml/tree/objectify_prefix

Cheers,
Volker

--
=========================================================
inqbus Scientific Computing    Dr. Volker Jaenisch
Hungerbichlweg 3               +49 (8860) 9222 7 92
86977 Burggen                  https://inqbus.de
=========================================================
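For readers who want to try the access rules from the message above without building the branch, here is a minimal pure-Python sketch. It is not the lxml.o2 implementation (which, per the post, does the selection in C); it only imitates the described behaviour on top of the standard library's xml.etree.ElementTree. The names Node, NSMAP and _ACCESS are hypothetical and exist only for this illustration, the prefix map is supplied by hand, and the default-namespace rule (node.<name>) is left out.

```python
import re
import xml.etree.ElementTree as ET

XML = """<obj:root xmlns:obj="objectified" xmlns:other="otherNS">
  <obj:c1 a1="A1" a2="A2" other:a3="A3">
    <obj:c2>0</obj:c2>
    <obj:c2>1</obj:c2>
    <obj:c2>2</obj:c2>
  </obj:c1>
  <obj:c1>
    <other:c2>3</other:c2>
    <other:c2>5</other:c2>
    <obj:c2>2</obj:c2>
  </obj:c1>
  <obj:c1>
    <other:c2>42</other:c2>
  </obj:c1>
</obj:root>"""

# Hypothetical prefix-to-URI map, written out by hand for this sketch.
NSMAP = {"obj": "objectified", "other": "otherNS"}

# Matches "<prefix>_<name>" with an optional "[index]" suffix.
_ACCESS = re.compile(
    r"^(?P<prefix>[A-Za-z][A-Za-z0-9]*)_(?P<name>\w+)(?:\[(?P<index>\d+)\])?$"
)


class Node:
    """Hypothetical wrapper imitating the described access rules."""

    def __init__(self, element):
        self._element = element

    @property
    def text(self):
        return self._element.text

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        match = _ACCESS.match(key)
        if match is None or match["prefix"] not in NSMAP:
            raise AttributeError(key)
        # Build the qualified tag, e.g. "{objectified}c1".
        tag = "{%s}%s" % (NSMAP[match["prefix"]], match["name"])
        children = [Node(child) for child in self._element.findall(tag)]
        if not children:
            raise AttributeError(key)
        if match["index"] is None:
            return children  # plain access always yields a list
        try:
            return children[int(match["index"])]  # "name[index]" form
        except IndexError:
            raise AttributeError(key) from None


root = Node(ET.fromstring(XML))

# Indexed form through getattr, as in the post.
first = getattr(root, "obj_c1[0]")
values = [int(child.text) for child in first.obj_c2]

# List form, indexed by Python, with a tag from the other namespace.
other_values = [int(child.text) for child in root.obj_c1[1].other_c2]

print(values, other_values)
```

Returning a list for every qualified name keeps the result type independent of how many children happen to be present, which is the multiplicity point made in the message.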
Dear lxml people!

I learned that some members of the list were annoyed by my posts. My aim was never to disrespect any person or their work for lxml. If you feel disrespected, I am truly sorry for that.

But I am still sure that lxml.objectify is not the perfect solution. lxml.objectify has its advantages, for instance its simplicity and its handling of use cases without namespaces. And I think we can agree that no solution is perfect for every use case. lxml.o has its use cases and lxml.o2 will have its use cases.

In all of my posts I have pointed out that lxml.o should not be replaced by lxml.o2, and that I opt for coexistence. I also pointed out that I respect your code; I did not even touch it.

If something is not perfect, it is limited in some way. To motivate my work I find it quite legitimate to point out the limitations of lxml.objectify. I already addressed the limitations of lxml.o2 in my last post, and I am sure more will surface before I have finalized it.

And a few last words on how I perceived the reception on the mailing list. Right from the start I had the feeling that my ideas were not really taken seriously. Many negative arguments like "it was so since 2006" or "it cannot work" were brought up. Even the "but I am strongly biased" club, aka "I am one of the developers/maintainers", was waved.

Now I would like to come back onto the rational plane and do constructive work together.

Cheers,
Volker