[lxml-dev] examine XSD xml schema

Hello,

I would like to examine an XSD schema in Python. Currently I'm using lxml, which does its job very well when it only has to validate a document against the schema. But I also want to know what is inside the schema and access its elements through the usual lxml API.

The schema:

    <?xml version="1.0"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:include schemaLocation="worker_remote_base.xsd"/>
      <xsd:include schemaLocation="transactions_worker_responses.xsd"/>
      <xsd:include schemaLocation="transactions_worker_requests.xsd"/>
    </xsd:schema>

The lxml code to load the schema is (simplified):

    xsd_file_handle = open(self._xsd_file, 'rb')
    xsd_text = xsd_file_handle.read()
    schema_document = etree.fromstring(xsd_text, base_url=xmlpath)
    xmlschema = etree.XMLSchema(schema_document)

I'm then able to use schema_document (which is an etree._Element) to walk through the schema as an XML document. But since etree.fromstring (at least it seems so) treats its input as a plain XML document, the xsd:include elements are not processed.

The problem is currently solved by parsing the first schema document, then loading the included documents and inserting their contents into the main document one by one, by hand:

    import os
    from lxml import etree

    BASE_URL = "/xml/"
    schema_document = etree.fromstring(xsd_text, base_url=BASE_URL)
    tree = schema_document.getroottree()
    schemas = []
    # iterate over a copy, because include elements are removed in the loop
    for schemaChild in list(schema_document):
        if schemaChild.tag.endswith("include"):
            try:
                path = os.path.join(BASE_URL, schemaChild.get("schemaLocation"))
                with open(path, "rb") as h:
                    s = etree.fromstring(h.read(), base_url=BASE_URL)
                schemas.append(s)
            except Exception as ex:
                print("failed to load schema: %s" % ex)
            # remove the <xsd:include ...> element
            schema_document.remove(schemaChild)
    for s in schemas:
        # move everything inside <schema> into the main document
        for sChild in list(s):
            schema_document.append(sChild)

What I'm asking for is an idea of how to solve this problem in a more common way. I've already searched for other schema parsers in Python, but so far I have found nothing that fits this case.

Greetings,
Michael

On Tue, Apr 27, 2010 at 02:01:53PM +0200, Michael Konietzny wrote: [snip]
You might be interested in looking at process_includes.py. It is part of the generateDS.py package. It reads and parses an XML Schema (using lxml, of course) and replaces the include and import elements in that document with the included document.

process_includes.py is imported and used by generateDS.py itself, but you can also run it from the command line.

You can find the latest version of generateDS.py here:

    http://www.rexx.com/~dkuhlman/#generateds-py-generate-python-data-bindings-f...

or at the Python project index:

    http://pypi.python.org/pypi

If you try it, let me know if you have questions, suggestions, etc.

- Dave

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman
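
For readers who only want to see the idea, here is a minimal sketch of replacing xsd:include elements with the content of the included documents. It is not the code of process_includes.py; the function name inline_includes and the handling choices are assumptions made for illustration. It only follows xsd:include (not xsd:import or xsd:redefine), assumes schemaLocation is a local file path relative to the including file, and ignores targetNamespace differences.

```python
import os
from lxml import etree

XSD_NS = "http://www.w3.org/2001/XMLSchema"
INCLUDE_TAG = "{%s}include" % XSD_NS


def inline_includes(path, _seen=None):
    """Parse the schema at `path` and return its root element with every
    xsd:include replaced by the children of the included schema.

    Hypothetical helper: local files only, no xsd:import/xsd:redefine.
    """
    seen = _seen if _seen is not None else set()
    path = os.path.abspath(path)
    seen.add(path)

    root = etree.parse(path).getroot()
    base_dir = os.path.dirname(path)

    # list() takes a snapshot, so changing root inside the loop is safe
    for include in list(root.iterchildren(INCLUDE_TAG)):
        location = include.get("schemaLocation")
        position = root.index(include)
        root.remove(include)
        if location is None:
            continue
        target = os.path.abspath(os.path.join(base_dir, location))
        if target in seen:
            # already inlined (or a circular include): skip it
            continue
        included_root = inline_includes(target, seen)
        # move the included declarations to where the include element was
        for offset, child in enumerate(list(included_root)):
            root.insert(position + offset, child)
    return root
```

The returned element can be walked like any other lxml tree, for example with root.iterchildren("{%s}element" % XSD_NS), and it can still be passed to etree.XMLSchema() for validation.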
