[lxml-dev] XPath id function on newly created elements

Hi,

I have a document which, after parsing into an ElementTree, I validate against a RELAX NG schema. At this point I can use the XPath id function to find the element with a particular id. However, after adding a new element with an attribute that the schema defines as holding an ID, that element is not found by the id function.

If I validate the document against the schema again, nothing changes (not that I think it should be necessary to do this): the document is valid, and the id function still doesn't find the element with the appropriate id.

If this is the correct behaviour, how do I get the new ids to be recognised?

Jamie

--
Artefact Publishing: http://www.artefact.org.nz/
GnuPG Public Key: http://www.artefact.org.nz/people/jamie.html

Hi there, Jamie Norrish wrote:
I'm confused about basic principles here: how do you define an id in a Relax NG schema? I thought that Relax NG does not contain information about this at all. DTDs can have this information, and I thought the new way to do this (as far as I know not supported by lxml) is the xml:id specification:

http://www.w3.org/TR/2005/PR-xml-id-20050712/

I thought Relax NG explicitly didn't do this kind of thing and kept itself to validation only.

I'm still at a loss as to how you got your original ids recognized in the first place. Perhaps you can provide a small sample XML document, a small sample XPath statement and, if possible, a small sample Relax NG schema that declares an id.

Regards,

Martijn

Martijn Faassen writes:
> I'm confused about basic principles here: how do you define an id in a Relax NG schema?

    <grammar [...] datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
      [...]
      <define name="attlist.mads" combine="interleave">
        <attribute name="id">
          <data type="ID"/>
        </attribute>
      </define>
      [...]
    </grammar>

seems to be how. (Note, I didn't write the schema, and I'm consequently not particularly in a position to change to using xml:id. However, I will if there's no other way, using lxml, to achieve what I need with new ids being recognised as such - and I can use an XML Schema or a DTD rather than RELAX NG if that makes a difference.)

This is certainly what's allowing the XPath id function to work - before validating against the schema it doesn't work, and subsequently it does.

Sample XML:

    <?xml version="1.0" encoding="utf-8"?>
    <madsCollection xmlns="http://www.loc.gov/mads/">
      <mads id="name-000001">
        <authority>Some text</authority>
      </mads>
    </madsCollection>

Sample XPath:

    id('name-000001')

Sample RELAX NG schema (this should work, but I have just now manually cut out stuff that isn't used in the document above):

    <?xml version="1.0" encoding="UTF-8"?>
    <grammar ns="http://www.loc.gov/mads/"
             xmlns="http://relaxng.org/ns/structure/1.0"
             datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
      <define name="madsCollection">
        <element name="madsCollection">
          <ref name="attlist.madsCollection"/>
          <oneOrMore>
            <ref name="mads"/>
          </oneOrMore>
        </element>
      </define>
      <define name="attlist.madsCollection" combine="interleave">
        <empty/>
      </define>
      <define name="mads">
        <element name="mads">
          <ref name="attlist.mads"/>
          <ref name="authority"/>
        </element>
      </define>
      <define name="attlist.mads" combine="interleave">
        <attribute name="id">
          <data type="ID"/>
        </attribute>
      </define>
      <define name="authority">
        <element name="authority">
          <ref name="attlist.authority"/>
          <text/>
        </element>
      </define>
      <define name="attlist.authority" combine="interleave">
        <empty/>
      </define>
      <start>
        <ref name="madsCollection"/>
      </start>
    </grammar>

Jamie
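
For anyone wanting to reproduce this, a minimal sketch along the following lines may help. It assumes lxml's `etree.RelaxNG` validator and the `xpath()` method, and it reuses the sample document and schema from this message. The second id (`name-000002`) and the text "More text" are made up for illustration. What the `id()` calls return may well depend on the lxml and libxml2 versions installed, so no expected output is given.

```python
import io

from lxml import etree

MADS_NS = "http://www.loc.gov/mads/"

SCHEMA = b"""<?xml version="1.0" encoding="UTF-8"?>
<grammar ns="http://www.loc.gov/mads/"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <define name="madsCollection">
    <element name="madsCollection">
      <oneOrMore><ref name="mads"/></oneOrMore>
    </element>
  </define>
  <define name="mads">
    <element name="mads">
      <attribute name="id"><data type="ID"/></attribute>
      <ref name="authority"/>
    </element>
  </define>
  <define name="authority">
    <element name="authority"><text/></element>
  </define>
  <start><ref name="madsCollection"/></start>
</grammar>
"""

DOCUMENT = b"""<?xml version="1.0" encoding="utf-8"?>
<madsCollection xmlns="http://www.loc.gov/mads/">
  <mads id="name-000001">
    <authority>Some text</authority>
  </mads>
</madsCollection>
"""

relaxng = etree.RelaxNG(etree.parse(io.BytesIO(SCHEMA)))
tree = etree.parse(io.BytesIO(DOCUMENT))
root = tree.getroot()

# Step 1: validate, then look up the id that was present at parse time.
valid_before = relaxng.validate(tree)
found_existing = tree.xpath("id('name-000001')")

# Step 2: add a new element carrying an id (illustrative values).
new_mads = etree.SubElement(root, "{%s}mads" % MADS_NS, {"id": "name-000002"})
etree.SubElement(new_mads, "{%s}authority" % MADS_NS).text = "More text"
found_new_before = tree.xpath("id('name-000002')")

# Step 3: validate again and repeat the lookup for the new id.
valid_after = relaxng.validate(tree)
found_new_after = tree.xpath("id('name-000002')")

print("valid before adding:", valid_before)
print("existing id found:", len(found_existing))
print("new id found before revalidation:", len(found_new_before))
print("valid after adding:", valid_after)
print("new id found after revalidation:", len(found_new_after))
```

The schema here is a condensed form of the sample above (the empty attlist defines are folded away), which should not change what it accepts.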

Martijn Faassen writes:
> I thought the new way to do this (as far as I know not supported by lxml..) is the xml:id specification:

Well, just for fun I tried using xml:id, in case that was what it came down to. Sadly that didn't seem to work either, when adding new elements.

I then tried using the Python bindings that come with libxml2, and ran into a problem there as well - Daniel Veillard asked me to file a bug on the matter, which you can find at

http://bugzilla.gnome.org/show_bug.cgi?id=314358

I'm at a loss as to where to proceed from here, unless using normal IDs will work in lxml with one of RELAX NG/XSD/DTD. I don't know if the problem in the aforementioned bug affects all IDs or not.

Jamie
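
Until the ID question is settled, one possible stopgap is to look elements up by the plain value of their `id` attribute, which does not depend on the attribute being typed as an ID at all. This is not something proposed in the thread, only a sketch under that assumption. The helper name `find_by_id_attribute` is hypothetical, as are the second id and its text. Because it uses only `iter()` and `get()`, it should behave the same with lxml.etree and with the standard library's ElementTree.

```python
try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    import xml.etree.ElementTree as etree  # same iter()/get() API

MADS_NS = "http://www.loc.gov/mads/"


def find_by_id_attribute(root, local_name, id_value):
    """Return the first MADS element named local_name whose 'id'
    attribute equals id_value, or None if there is no such element.

    Hypothetical helper: it compares attribute values directly and
    ignores whether the attribute has been registered as an ID.
    """
    for element in root.iter("{%s}%s" % (MADS_NS, local_name)):
        if element.get("id") == id_value:
            return element
    return None


root = etree.fromstring(
    b'<madsCollection xmlns="http://www.loc.gov/mads/">'
    b'<mads id="name-000001"><authority>Some text</authority></mads>'
    b"</madsCollection>"
)

# Add a new element after parsing (illustrative id and text).
new_mads = etree.SubElement(root, "{%s}mads" % MADS_NS, {"id": "name-000002"})
etree.SubElement(new_mads, "{%s}authority" % MADS_NS).text = "More text"

found = find_by_id_attribute(root, "mads", "name-000002")
missing = find_by_id_attribute(root, "mads", "name-999999")
```

The trade-off is that this walks the tree on every call and gives no uniqueness guarantee, so it stands in for `id()` only where those properties don't matter.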

Participants (2): Jamie Norrish, Martijn Faassen