relaxng validation error

hi,

python 2.7, lxml2.3, freebsd amd64

I'm using lxml to validate against the DocBook v5 relaxng schema. I get an error that says "Expecting element set, got chapter, line 2", but the actual schema seems to allow chapters to be starting elements; I've appended the relevant snippets below.

First, my test code:
---------------------------
from lxml import etree
tree = etree.parse('dbtest.xml')
relaxng = etree.RelaxNG(file='/path_to/DocBook/V5.0/rng/docbookxi.rng')
relaxng.assertValid(tree)
---------------------------
results in:
  File "lxml.etree.pyx", line 3006, in lxml.etree._Validator.assertValid (src/lxml/lxml.etree.c:125415)
lxml.etree.DocumentInvalid: Expecting element set, got chapter, line 2.

the dbtest.xml starts out with this line:
---------------------------
<chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0">
---------------------------

As far as I can tell, the rng says chapters are okay. The first bit of the rng:
<start>
  <choice>
    <choice>
      <ref name="db.set"/>
      <ref name="db.book"/>
      <ref name="db.divisions"/>
      <ref name="db.components"/>
etc.

and later on db.components is defined:
<define name="db.components">
  <choice>
    <choice>
      <ref name="db.dedication"/>
      <ref name="db.acknowledgements"/>
      <ref name="db.preface"/>
      <ref name="db.chapter"/>
etc.

thanks for reading. Is this a bug?
--Tim Arnold

On Mon, Mar 28, 2011 at 1:18 PM, Tim Arnold <Tim.Arnold@sas.com> wrote:
I was able to create a simple DocBook 5.0 file with 'chapter' as the top-level element that I could validate with both lxml/libxml2 (as per your example) and jing, so I don't think that the problem is really 'chapter'.

My impression is that RNG's non-determinism often leads validators to give misleading error messages. Sometimes an error at some level of nesting will cause an ancestor element to be reported as unexpected rather than the actual problem element or attribute.

In my experience, jing [1] gives more useful RNG validation error messages than libxml2, so you might try validating your document with jing. Or start commenting out descendants of 'chapter' one at a time to see whether the error goes away; the descendant whose removal silences it is the likely culprit.

[1] <http://code.google.com/p/jing-trang/>

Chuck
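The idea of commenting out descendants of 'chapter' one at a time can be scripted. The following is a minimal sketch, not code from this thread: `find_offending_children` and `toy_is_valid` are hypothetical names, and the toy validator is only a stand-in for a real schema check. It removes each direct child in turn from a copy of the element and reports the children whose removal makes validation pass.

```python
import copy
import xml.etree.ElementTree as ET


def find_offending_children(root, is_valid):
    """Return the tags of direct children whose removal makes `root` valid.

    `is_valid` is any callable that takes an element and returns True or
    False (hypothetical hook; plug a real schema validator in here).
    The original element is never modified; each trial works on a deep copy.
    """
    culprits = []
    for index, child in enumerate(list(root)):
        trial = copy.deepcopy(root)
        # Drop the child at the same position in the copy.
        trial.remove(list(trial)[index])
        if is_valid(trial):
            culprits.append(child.tag)
    return culprits


def toy_is_valid(element):
    # Stand-in validator (assumption for this demo only): a document is
    # "valid" when it contains no <bogus> child.
    return element.find('bogus') is None


doc = ET.fromstring(
    '<chapter><title>T</title><para>ok</para><bogus/></chapter>'
)
print(find_offending_children(doc, toy_is_valid))  # ['bogus']
```

With lxml, one would presumably parse the file with lxml.etree and pass the `validate` method of the `etree.RelaxNG` object as `is_valid`. A child that the schema requires will not be reported, since removing it creates a new error; calling the function again on a reported child narrows the search further.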

Thanks Chuck, I believe you're right. I can validate the same file using Xerces and the DocBook nvdl.jar, so I guess the problem with using lxml is the mixture of relaxng and schematron rules. I wanted to stay in Python for everything, but using Java seems like the way forward for now.

thanks for your info!
--Tim

participants (2)
- Chuck Bearden
- Tim Arnold