[lxml-dev] examine XSD xml schema

April 27, 2010

Hello,

I would like to examine an XSD schema in Python. Currently I'm using lxml,
which does its job very well when it only has to validate a document
against the schema. But I want to know what is inside the schema and
access its elements the lxml way.

The schema:

|<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:include schemaLocation="worker_remote_base.xsd"/>
    <xsd:include schemaLocation="transactions_worker_responses.xsd"/>
    <xsd:include schemaLocation="transactions_worker_requests.xsd"/>
</xsd:schema>
|

The lxml code to load the schema is (simplified):

|xsd_file_handle = open(self._xsd_file, 'rb')
xsd_text = xsd_file_handle.read()
schema_document = etree.fromstring(xsd_text, base_url=xmlpath)
xmlschema = etree.XMLSchema(schema_document)
|

I'm then able to use schema_document (which is an etree._Element) to walk
the schema as an XML document. But since etree.fromstring (at least it
seems that way) treats its input as a plain XML document, the
|xsd:include| elements are not processed.
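
To illustrate what walking the schema as a plain XML tree looks like, here is a minimal sketch. The `element` and `complexType` declarations in the sample and the helper name `top_level_components` are made up for illustration; only the `xsd:include` line is taken from the schema above. The `ImportError` fallback to the standard library is there only so that the sketch also runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:  # assumption: fall back to the stdlib so the sketch still runs
    import xml.etree.ElementTree as etree

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical sample: the element and complexType lines are invented.
SAMPLE_XSD = b"""<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:include schemaLocation="worker_remote_base.xsd"/>
    <xsd:element name="request" type="xsd:string"/>
    <xsd:complexType name="WorkerType"/>
</xsd:schema>
"""


def top_level_components(schema_root):
    """Return (kind, name-or-location) pairs for the direct children of <xsd:schema>."""
    components = []
    for child in schema_root:
        tag = child.tag
        # Comments and processing instructions have a non-string tag in lxml.
        if not isinstance(tag, str) or not tag.startswith("{%s}" % XSD_NS):
            continue
        kind = tag.split("}", 1)[1]  # strip the "{namespace}" prefix
        label = child.get("name") or child.get("schemaLocation")
        components.append((kind, label))
    return components


root = etree.fromstring(SAMPLE_XSD)
for kind, label in top_level_components(root):
    print(kind, label)
```

The include shows up as an ordinary element with a `schemaLocation` attribute; nothing from `worker_remote_base.xsd` is visible in the tree.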

The problem is currently solved by parsing the first schema document,
then loading the files named by the include elements and inserting their
contents one by one into the main document by hand:

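|import os
from lxml import etree

BASE_URL = "/xml/"
schema_document = etree.fromstring(xsd_text, base_url=BASE_URL)
tree = schema_document.getroottree()

schemas = []
# iterate over a copy, because include elements are removed inside the loop
for schemaChild in list(schema_document):
    # comparing the qualified name also skips comments safely
    if schemaChild.tag == "{http://www.w3.org/2001/XMLSchema}include":
        try:
            path = os.path.join(BASE_URL, schemaChild.get("schemaLocation"))
            # read bytes so that the XML encoding declaration is honoured
            with open(path, "rb") as h:
                s = etree.fromstring(h.read(), base_url=BASE_URL)
            schemas.append(s)
        except Exception as ex:
            print("failed to load schema: %s" % ex)
        # remove the <xsd:include ...> element
        schema_document.remove(schemaChild)

for s in schemas:
    # move everything inside <schema> into the main document;
    # copy the child list first, because append() moves the elements
    for sChild in list(s):
        schema_document.append(sChild)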
|
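
The same manual merge can be written as a small recursive helper, so that includes nested inside included files are handled as well. This is only a sketch under some assumptions: the name `load_schema_inlined` is hypothetical, every `schemaLocation` is taken to be a local file path relative to the including file, and all files are assumed to share one target namespace. `xsd:import` and `xsd:redefine` are not handled. The `ImportError` fallback to the standard library is there only so that the sketch runs without lxml.

```python
import os

try:
    from lxml import etree
except ImportError:  # assumption: fall back to the stdlib so the sketch still runs
    import xml.etree.ElementTree as etree

XSD_INCLUDE = "{http://www.w3.org/2001/XMLSchema}include"


def load_schema_inlined(path, _seen=None):
    """Parse the schema file at `path` and replace every xsd:include with
    the top-level children of the included schema (hypothetical helper)."""
    seen = set() if _seen is None else _seen
    path = os.path.abspath(path)
    seen.add(path)

    root = etree.parse(path).getroot()
    base_dir = os.path.dirname(path)

    # Iterate over a copy, because the child list is modified in the loop.
    for child in list(root):
        if child.tag != XSD_INCLUDE:
            continue
        position = list(root).index(child)
        root.remove(child)
        included_path = os.path.abspath(
            os.path.join(base_dir, child.get("schemaLocation")))
        if included_path in seen:
            continue  # already merged once; also guards against include cycles
        included_root = load_schema_inlined(included_path, seen)
        # Put the included declarations where the include element used to be.
        for offset, item in enumerate(list(included_root)):
            root.insert(position + offset, item)
    return root
```

Calling the helper on the main schema file returns a single `<xsd:schema>` element that no longer contains any `xsd:include` children and can be walked like any other element tree.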

What I'm asking for is an idea for solving this problem in a more
standard way. I've already searched for other schema parsers in Python,
but so far I have found nothing that fits this case.

Greetings,
Michael

Michael Konietzny
