[lxml-dev] Relax NG validation question

I've been using jing to validate documents against a schema written in the RelaxNG compact syntax. I'd like to use lxml to validate instead of using jing. After converting the schema to the XML syntax using trang, I'm loading the schema like so:
I suspect something about the include structure of my schema is causing lxml this heartburn. I've reproduced this with a cut-down version of the schema, but I'm using the same include structure as the original, since I suspect that's essential (and I'd like to retain that instead of restructuring the actual schema).

My schema is written as three files (attached):

base.rng
    This is really just a very slightly modified copy of the Atom schema. It defines an "anyElement" pattern. This can be loaded on its own.

middle.rng
    This includes and extends base.rng. It makes no specific mention of the "anyElement" pattern. This can be loaded on its own.

top.rng
    This is the top-level schema that I actually care about validating; it overrides the "anyElement" pattern as part of the include of middle.rng. Building the RelaxNG object from this document causes the exception shown above.

Any ideas?

I'm using Python 2.6.5 and lxml 2.2.6, libxml2 2.7.7, and libxslt 1.1.26.

-Fred

--
Fred L. Drake, Jr. <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

On Tue, May 4, 2010 at 12:09 PM, Fred Drake <fdrake@acm.org> wrote:
I've not seen any responses to this; has anyone else had similar experiences with RelaxNG includes? I'd really like to be able to use lxml in my application, instead of requiring a Java runtime, but this approaches being a blocker.

-Fred

--
Fred L. Drake, Jr. <fdrake at gmail.com>
"A storm broke loose in my mind." --Albert Einstein

Fred Drake, 04.05.2010 18:09:
I get the same error with xmllint, which may indicate a problem in libxml2. I have never used the override feature that you use in the top.rng file, but the error indicates that libxml2 expects "anyElement" to be a grammar rule defined in the included RNG file that the including file overrides. My guess is that the existence test happens before the inner include in middle.rng is processed, so that the overridden definition isn't available yet.

The RNG spec isn't particularly clear here and does not even mention this specific case.

http://www.relaxng.org/spec-20011203.html#IDAG3YR

This is a very special case that is worth bringing to the attention of the libxml2 mailing list.
Any ideas?
You can try running the include in a different tool and loading the resulting RNG instead. I have no idea what to use for this, though; trang doesn't do it, for example.

You can also try applying the includes yourself after parsing the document and before handing it to RelaxNG(). That shouldn't be too hard now that it's clear what you have to take care of. A recursive depth-first include processor would do the job just fine.

Stefan
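A minimal sketch of the recursive depth-first include processor suggested above, not a tested fix for the attached schema. It assumes that every href is a relative path to a local file, and that expanding nested includes before applying the overrides is the intended behaviour. The names `rng`, `components`, `expand_includes`, and `load_grammar` are made up for this illustration, and the sketch ignores externalRef, xml:base, and namespace prefixes used inside attribute values.

```python
import os

try:
    from lxml import etree
except ImportError:
    # Fallback so the sketch runs without lxml; RelaxNG() itself needs lxml.
    import xml.etree.ElementTree as etree

RNG_NS = "http://relaxng.org/ns/structure/1.0"


def rng(name):
    """Return the namespace-qualified tag for a RELAX NG element name."""
    return "{%s}%s" % (RNG_NS, name)


def components(container):
    """Yield (parent, element) for each start/define, looking inside divs."""
    for child in list(container):
        if child.tag == rng("div"):
            for pair in components(child):
                yield pair
        elif child.tag in (rng("start"), rng("define")):
            yield container, child


def expand_includes(container, base_dir):
    """Replace every include below container with the grammar it refers to."""
    for child in list(container):
        if child.tag == rng("div"):
            expand_includes(child, base_dir)
        elif child.tag == rng("include"):
            href = child.get("href")  # assumed: relative path to a local file
            del child.attrib["href"]
            # Depth first: the included grammar is fully expanded before
            # the overrides of this include are applied to it.
            included = load_grammar(os.path.join(base_dir, href))
            overrides = set(
                (c.tag, c.get("name")) for _, c in components(child))
            for parent, comp in list(components(included)):
                if (comp.tag, comp.get("name")) in overrides:
                    parent.remove(comp)
            # The include becomes a div holding the included grammar
            # (also renamed to div) followed by the overriding definitions.
            included.tag = rng("div")
            child.tag = rng("div")
            child.insert(0, included)


def load_grammar(path):
    """Parse an RNG file and return its root with all includes expanded."""
    root = etree.parse(path).getroot()
    expand_includes(root, os.path.dirname(os.path.abspath(path)))
    return root
```

With lxml, the merged tree would then be handed over as `etree.RelaxNG(load_grammar("top.rng"))`. Whether libxml2 accepts the result for the schema discussed in this thread has not been checked here.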

participants (2): Fred Drake, Stefan Behnel