Hi Holger!

Thank you very much for the fast response.

On 28.02.22 at 08:41, Holger.Joukl@LBBW.de wrote:
The reason for this is that obviously {http://www.isotc211.org/2005/gco}CharacterString is not a valid Python identifier and it makes sense
to restrict unqualified lookup to children from the same namespace.

I would like to disagree with this part:

and it makes sense
to restrict unqualified lookup to children from the same namespace

What does the namespace of a node have in common with the namespace of one of its subnodes? Nothing. It is quite common in XML to borrow elements from other namespaces.

Other namespace-based Python libraries, RDFlib for instance, solve this problem generically by adding the namespace to the Python property name.

{http://www.isotc211.org/2005/gco}CharacterString   -> gco_CharacterString
This works like a charm. I have never once run into a corner case.
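
To make the mapping above concrete, here is a minimal pure-Python sketch. It is not how RDFlib implements it; the function name clark_to_identifier and the prefix table are hypothetical, and the gmd URI is my assumption (only the gco URI appears in this thread).

```python
# Hypothetical prefix table; in practice it would be taken from the
# namespace declarations of the document. The gmd URI is an assumption.
PREFIXES = {
    "http://www.isotc211.org/2005/gco": "gco",
    "http://www.isotc211.org/2005/gmd": "gmd",
}


def clark_to_identifier(tag, prefixes=PREFIXES):
    """Turn '{uri}local' into 'prefix_local'; leave unqualified tags alone."""
    if not tag.startswith("{"):
        return tag
    uri, _, local = tag[1:].partition("}")
    # A KeyError here means the namespace has no registered prefix.
    return f"{prefixes[uri]}_{local}"


print(clark_to_identifier("{http://www.isotc211.org/2005/gco}CharacterString"))
```

The result is a valid Python identifier as long as the local name itself is one.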

The problem is buried deep in the nature of the lxml objectify implementation. Objectify does not really transform the XML into a real Python instance hierarchy (as RDFlib does); instead it routes all attribute access through function calls into the libxml2 C core. On the one hand this is desirable, because you can change the XML on the fly and at least some of the changes are visible both in the XML and in the objectified representation.
On the other hand, the information about which namespace a node belongs to is not persistent in the node and therefore cannot be used for lookup.

This can easily be seen in lxml/objectify.pyx, line 414 ff.:

cdef tree.xmlNode* _findFollowingSibling(tree.xmlNode* c_node,
                                         const_xmlChar* href, const_xmlChar* name,
                                         Py_ssize_t index):
    cdef tree.xmlNode* (*next)(tree.xmlNode*)
    if index >= 0:
        next = cetree.nextElement
    else:
        index = -1 - index
        next = cetree.previousElement
    while c_node is not NULL:
        if c_node.type == tree.XML_ELEMENT_NODE and \
               _tagMatches(c_node, href, name):
            index = index - 1
            if index < 0:
                return c_node
        c_node = next(c_node)
    return NULL

To find the desired sibling, the code loops over all children and matches (parentNamespace, propertyName) against each of them.
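
For readers who do not want to parse the Cython, here is a rough pure-Python model of that loop. It is only a sketch: the siblings are modelled as a list of (href, name) tuples and positions stand in for node pointers, neither of which exists in lxml.

```python
def find_following_sibling(siblings, start, href, name, index):
    """Model of _findFollowingSibling over a list of (href, name) tuples.

    Returns the position of the match, or None where the C code returns NULL.
    """
    if index >= 0:
        step = 1              # walk forward, like cetree.nextElement
    else:
        index = -1 - index    # -1 means "first match when walking backwards"
        step = -1             # walk backward, like cetree.previousElement
    pos = start
    while 0 <= pos < len(siblings):
        # Both namespace and name must match, as in _tagMatches.
        if siblings[pos] == (href, name):
            index -= 1
            if index < 0:
                return pos
        pos += step
    return None


siblings = [("gmd", "a"), ("gco", "a"), ("gmd", "a")]
print(find_following_sibling(siblings, 0, "gmd", "a", 1))
```

A child with the right name but a different namespace, such as ("gco", "a") here, is simply skipped.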

IMHO the correct behavior of _findFollowingSibling should be:

Look up all children by the Python property name only. If exactly one match is found, return it. If no match or more than one match is found, no answer is possible.

I extended _findFollowingSibling as follows:
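
The rule can be sketched in pure Python with the standard library's xml.etree.ElementTree. This is an illustration only, not lxml code: lookup_child and split_tag are hypothetical helpers, the gmd URI is my assumption, and the sketch also keeps the lookup in the parent namespace as a first step.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split a Clark-notation tag '{uri}local' into (uri, local)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return None, tag


def lookup_child(parent, name):
    """Resolve attribute-style access `parent.name`; None if no answer."""
    parent_ns, _ = split_tag(parent.tag)
    children = [(child, *split_tag(child.tag)) for child in parent]
    # Step 1: first child with that name in the parent's namespace.
    for child, ns, local in children:
        if ns == parent_ns and local == name:
            return child
    # Step 2: match on the local name only, accept only a unique hit.
    matches = [child for child, ns, local in children if local == name]
    return matches[0] if len(matches) == 1 else None


GMD = "http://www.isotc211.org/2005/gmd"  # assumed namespace URI
GCO = "http://www.isotc211.org/2005/gco"
doc = ET.fromstring(
    f'<gmd:fileIdentifier xmlns:gmd="{GMD}" xmlns:gco="{GCO}">'
    "<gco:CharacterString>4157d397-e2c3-4e6e-8a84-0712aa9c1162"
    "</gco:CharacterString></gmd:fileIdentifier>"
)
found = lookup_child(doc, "CharacterString")
print(found.text)
```

If two children from different foreign namespaces share the local name, step 2 returns None rather than guessing.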

cdef tree.xmlNode* _findFollowingSibling(tree.xmlNode* c_node,
                                         const_xmlChar* href, const_xmlChar* name,
                                         Py_ssize_t index):
    cdef tree.xmlNode* (*next)(tree.xmlNode*)
    cdef tree.xmlNode* start_node
    cdef tree.xmlNode* result_node
    cdef int found = 0

    start_node = c_node
    if index >= 0:
        next = cetree.nextElement
    else:
        index = -1 - index
        next = cetree.previousElement
    # first pass: match namespace and name (original behavior)
    while c_node is not NULL:
        if c_node.type == tree.XML_ELEMENT_NODE and \
               _tagMatches(c_node, href, name):
            index = index - 1
            if index < 0:
                return c_node
        c_node = next(c_node)
    # second pass: match on the name only, ignoring the namespace
    c_node = start_node
    while c_node is not NULL:
        if c_node.type == tree.XML_ELEMENT_NODE and c_node.name == name:
            index = index - 1
            if index < 0:
                result_node = c_node
                found += 1
        c_node = next(c_node)
    # return the name-only match only if it is unambiguous
    if found == 1:
        return result_node
    return NULL

Sorry for my clumsy Cython, but it works perfectly well. I also kept the existing behavior of looking in the parent namespace first.

>>> node.fileIdentifier.CharacterString
'4157d397-e2c3-4e6e-8a84-0712aa9c1162'

I would really appreciate it if someone could test this proof of concept: https://github.com/Inqbus/lxml, branch better-objectify-attributes.
If the feedback is positive, I will come up with a pull request.

Cheers,
Volker


-- 
=========================================================
   inqbus Scientific Computing    Dr.  Volker Jaenisch
   Hungerbichlweg 3               +49 (8860) 9222 7 92
   86977 Burggen                     https://inqbus.de
=========================================================