On 22.11.2017, 13:14 PM, Charlie Clark wrote:
 
> On .11.2017, 12:02, Frank Millman <frank@chagford.com> wrote:
>
>> lxml.etree.DocumentInvalid: Element
>> '{http://www.omg.org/spec/BPMN/20100524/MODEL}dataOutput', attribute
>> 'id': 'user_row_id' is not a valid value of the atomic type 'xs:ID'.,
>> line 38
>>
>> Is this a bug? Should I always use the second method
>
> It would help to provide an example of the schema and the XML you're
> trying to validate and the code you're using to do this. The behaviour of
> the various validation methods is subtly different, with assertValid() the
> only one that tells you why something is not valid. Validating when
> parsing might write the errors to the error log.
>
> Internally I think lxml hands the work to libXML2 so that if there is a
> bug, it's most likely to be there. But we really need more information
> about what exactly you're doing.
 
I tried to reduce this to a simple example, but in that case the parsing validator correctly picked up the duplicate id, so I see what you mean about ‘subtly different’.
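 
For anyone who wants to try the comparison themselves, a reduced test could look something like the sketch below. The schema and the element names (root, item) are made up for illustration and are not taken from the BPMN files; the only thing they share with the real schema is an optional "id" attribute of type xsd:ID. Which of the two methods reports the duplicate may well depend on the lxml and libxml2 versions in use, so no particular output is claimed here.

```python
from lxml import etree

# Hypothetical minimal schema: <root> holds <item> elements whose optional
# "id" attribute is typed xsd:ID, as on tBaseElement in the BPMN schema.
XSD = b"""<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="root">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="item" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:attribute name="id" type="xsd:ID" use="optional"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))


def check_while_parsing(xml_bytes):
    """Method 1: validate during parsing; return the error text or None."""
    parser = etree.XMLParser(schema=schema)
    try:
        etree.fromstring(xml_bytes, parser=parser)
    except etree.XMLSyntaxError as exc:
        return str(exc)
    return None


def check_after_parsing(xml_bytes):
    """Method 2: parse first, then assertValid(); return the error text or None."""
    root = etree.fromstring(xml_bytes)
    try:
        schema.assertValid(root)
    except etree.DocumentInvalid as exc:
        return str(exc)
    return None


# Two items deliberately share the same id value.
DUPLICATE = b'<root><item id="a"/><item id="a"/></root>'

print('while parsing:', check_while_parsing(DUPLICATE))
print('after parsing:', check_after_parsing(DUPLICATE))
```

A result of None from one function and an error message from the other would reproduce the discrepancy in a form small enough to report.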
 
I don’t want to waste anyone’s time on this, as this is not critical to me. However, for interest, here is a bit of info.
 
The schema is large and complex (to me, anyway). There are a number of xsd files. The links can be found here - http://www.omg.org/spec/BPMN/2.0/About-BPMN/
 
The example that I used for testing is as follows -
 
<xsd:element name="scriptTask" type="tScriptTask" substitutionGroup="flowElement"/>
<xsd:complexType name="tScriptTask">
    <xsd:complexContent>
        <xsd:extension base="tTask">
            <xsd:sequence>
                <xsd:element ref="script" minOccurs="0" maxOccurs="1"/>
            </xsd:sequence>
            <xsd:attribute name="scriptFormat" type="xsd:string"/>
        </xsd:extension>
    </xsd:complexContent>
</xsd:complexType>       
 
As you can see, it is in the substitution group of flowElement -
 
<xsd:element name="flowElement" type="tFlowElement"/>
<xsd:complexType name="tFlowElement" abstract="true">
    <xsd:complexContent>
        <xsd:extension base="tBaseElement">
            <xsd:sequence>
                <xsd:element ref="auditing" minOccurs="0" maxOccurs="1"/>
                <xsd:element ref="monitoring" minOccurs="0" maxOccurs="1"/>
                <xsd:element name="categoryValueRef" type="xsd:QName" minOccurs="0" maxOccurs="unbounded"/>
            </xsd:sequence>
            <xsd:attribute name="name" type="xsd:string"/>
        </xsd:extension>
    </xsd:complexContent>
</xsd:complexType>
 
This one extends tBaseElement, the type of baseElement -
 
<xsd:element name="baseElement" type="tBaseElement"/>
<xsd:complexType name="tBaseElement" abstract="true">
    <xsd:sequence>
        <xsd:element ref="documentation" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="extensionElements" minOccurs="0" maxOccurs="1" />
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:ID" use="optional"/>
    <xsd:anyAttribute namespace="##other" processContents="lax"/>
</xsd:complexType>
 
tBaseElement defines the “id” attribute (of type xsd:ID) that is causing the problem.
 
In my xml file, I have this -
 
<semantic:scriptTask id="task_AfterLogin" name="AfterLogin task">
    [...]
</semantic:scriptTask>
 
<semantic:scriptTask id="task_CancelLogin" name="CancelLogin task">
    [...]
</semantic:scriptTask>
 
To test, I simply changed the second id to be the same as the first one.
 
This is the code that I used -
 
1. Validate while parsing
 
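    from lxml import etree

    # Attach the schema to the parser so the document is validated as it is parsed
    parser = etree.XMLParser(
        schema=etree.XMLSchema(file='bpmn20/BPMN20.xsd'),
        attribute_defaults=True, remove_comments=True, remove_blank_text=True)

    # Read as bytes so lxml can honour any encoding declaration in the file
    with open('login_proc.xml', 'rb') as f:
        xml = f.read()
    elem = etree.fromstring(xml, parser=parser)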
 
This did not pick up the error.
 
2. Parse, then validate
 
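    from lxml import etree

    # Parse without a schema, then validate the resulting tree separately
    parser = etree.XMLParser(
        attribute_defaults=True, remove_comments=True, remove_blank_text=True)
    schema = etree.XMLSchema(file='bpmn20/BPMN20.xsd')

    with open('login_proc.xml', 'rb') as f:
        xml = f.read()
    elem = etree.fromstring(xml, parser=parser)
    # Raises etree.DocumentInvalid with the reason if the tree is not valid
    schema.assertValid(elem)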
 
This did pick up the error.
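 
To see what the post-parse validator reports without relying on the exception, the schema's error log can be read after calling validate(). The helper below is only a sketch: validation_messages is a name invented here, and the one-element schema exists purely to exercise it; validate() and error_log are the lxml calls it relies on.

```python
from lxml import etree


def validation_messages(root, schema):
    """Return (line, message) pairs from the schema's error log; [] if valid."""
    if schema.validate(root):
        return []
    return [(entry.line, entry.message) for entry in schema.error_log]


# Hypothetical one-element schema, used only to exercise the helper.
schema = etree.XMLSchema(etree.fromstring(
    b'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    b'<xsd:element name="a" type="xsd:integer"/>'
    b'</xsd:schema>'))

for line, message in validation_messages(etree.fromstring(b'<a>no int</a>'), schema):
    print(line, message)
```

With the BPMN schema and the tree from method 2, the same helper would list each violation with its line number, which makes it easier to compare against whatever the parse-time validator did or did not report.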
 
Comments welcome, but as I said, this is not critical to me.
 
Thanks
 
Frank