Minimal example of lxml writing an XML file and then reading and validating it against an XSD

Hi,

My name is Duane Kaufman, and I am new to this list (please be gentle :)

I am new to XML (not to Python), and I am wrestling with trying to get lxml to perform a task for me (I am on Windows XP, Python 2.7, lxml 2.3).

I want to:

1) Use lxml to create an XML file
2) (Manually) create an XML Schema (XSD) file for the created XML file
3) Use lxml to read the XML file, validating it against the XSD file from 2)

I have tried:

MyXMLWriter.py:
#-------------------------
ScriptRootDir = r'H:\My Documents\Manufacturing\my_python\XML_Test\XMLSchema_test'

def main():
    global ScriptRootDir
    # Script to test the use of package lxml to pass XML messages containing data
    # This script will write out an XML file which will be a message with data
    from lxml import objectify
    from lxml import etree
    import os

    root = objectify.Element("root")
    root.StationName = objectify.DataElement("BaseDispensing", _xsi="string")
    root.MessageType = objectify.DataElement("BasePartUnloadedData", _xsi="string")
    objectify.deannotate(root, xsi=False)
    print(etree.tostring(root, pretty_print=True, xml_declaration=True))
    et = etree.ElementTree(root)
    fp = open(os.path.join(ScriptRootDir, "Test.xml"), "w")
    et.write(fp, pretty_print=True, xml_declaration=True)
    fp.close()

if __name__ == '__main__':
    main()
#------------------------

Output from MyXMLWriter.py (Test.xml):
#------------------------
<?xml version='1.0' encoding='ASCII'?>
<root xmlns:py="http://codespeak.net/lxml/objectify/pytype" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <StationName xsi:type="xsd:string">BaseDispensing</StationName>
  <MessageType xsi:type="xsd:string">BasePartUnloadedData</MessageType>
</root>
#-----------------------

Test.xsd:
#------------------
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
</xsd:schema>
#------------------

MyXMLReader.py:
#------------------
ScriptRootDir = r'H:\My Documents\Manufacturing\my_python\XML_Test\XMLSchema_test'
XMLSchemaFilename = "Test2.xsd"

def main():
    global ScriptRootDir
    # Script to test the use of package lxml to pass XML messages containing data
    # This script will read in an XML file which will be a message with data
    # into a Python data structure
    from lxml import etree
    import os

    # read and parse XML Schema file
    xsd_doc = etree.parse(XMLSchemaFilename)
    xsd = etree.XMLSchema(xsd_doc)

    MessageObject = etree.parse(os.path.join(ScriptRootDir, "Test.xml"))
    xsd.validate(MessageObject)
    print xsd.error_log

if __name__ == '__main__':
    main()
#------------------------

When running MyXMLReader.py, I get the output:

file:///H:/My%20Documents/Manufacturing/my_python/XML_Test/XMLSchema_test/Test.xml:1:0:ERROR:SCHEMASV:SCHEMAV_CVC_ELT_1: Element 'root': No matching global declaration available for the validation root.

How do I get this to work from end-to-end?

Thanks in advance,
Duane

Duane Kaufman, 14.12.2011 22:02:
My name is Duane Kaufman, and I am new to this list (please be gentle :)
Welcome! Sorry for the late reply. I guess a lot of people on this list have just been busy before the holidays.
How do I get this to work from end-to-end?
Your code is perfectly ok. The problem is that your schema document is empty. The validator tries to find a match between the XML document (specifically, its root element) and the schema, and that fails. If you use a proper (non-empty) schema, it will work.

BTW, since you mentioned that you are new to XML, consider using RelaxNG as a schema language instead of XML-Schema. It's vastly more user friendly if you plan to write the schema manually (or even just want to read it). In case you prefer a non-XML notation, RelaxNG also has a readable textual syntax (RNC) that you can convert into the XML notation (RNG, which lxml/libxml2 can process) using a tool like trang. Personally, I find RNC quite nice to write, and I hugely prefer it over XML-Schema wherever I can.

Stefan
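
For illustration, here is a minimal sketch of what a non-empty schema for this message could look like, together with the validation call. The schema content is an assumption based on the two elements in Test.xml and is not taken from the thread. The names XSD_SOURCE, EMPTY_XSD, XML_SOURCE and the validate() helper are hypothetical, and both documents are inlined so that the sketch runs without any files.

```python
from io import BytesIO

from lxml import etree

# Assumed replacement for Test.xsd: declares 'root' as a global element
# with two string children, in document order.
XSD_SOURCE = b"""\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="root">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="StationName" type="xsd:string"/>
        <xsd:element name="MessageType" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

# The empty schema from the original post, kept for comparison.
EMPTY_XSD = b"""\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
</xsd:schema>
"""

# Same content as Test.xml, inlined instead of read from disk.
XML_SOURCE = b"""\
<root xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <StationName xsi:type="xsd:string">BaseDispensing</StationName>
  <MessageType xsi:type="xsd:string">BasePartUnloadedData</MessageType>
</root>
"""


def validate(xml_bytes, xsd_bytes):
    """Return (is_valid, error messages) for one document and one schema."""
    schema = etree.XMLSchema(etree.parse(BytesIO(xsd_bytes)))
    doc = etree.parse(BytesIO(xml_bytes))
    is_valid = schema.validate(doc)  # True or False, does not raise
    return is_valid, [entry.message for entry in schema.error_log]


ok, errors = validate(XML_SOURCE, XSD_SOURCE)
empty_ok, empty_errors = validate(XML_SOURCE, EMPTY_XSD)

print(ok)
print(empty_ok)
for message in empty_errors:
    print(message)
```

Note that validate() only returns a boolean. If an exception is preferred on failure, lxml schema objects also offer assertValid(doc).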
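
The RelaxNG route could look roughly like the following sketch. The schema is again an assumption derived from the two elements in Test.xml, and the names RNG_SOURCE, GOOD_XML and BAD_XML are hypothetical. RelaxNG has no built-in notion of xsi:type, so such attributes would have to be declared in the schema explicitly. To keep things short, this sketch assumes a document without them (for example, one written after calling objectify.deannotate() with its default xsi=True).

```python
from io import BytesIO

from lxml import etree

# Assumed RelaxNG schema in XML notation (RNG). The equivalent compact
# notation (RNC) would be:
#   element root { element StationName { text }, element MessageType { text } }
RNG_SOURCE = b"""\
<element name="root" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="StationName"><text/></element>
  <element name="MessageType"><text/></element>
</element>
"""

# Message without xsi:type annotations (an assumption for this sketch).
GOOD_XML = b"""\
<root>
  <StationName>BaseDispensing</StationName>
  <MessageType>BasePartUnloadedData</MessageType>
</root>
"""

# Same message with the required MessageType element missing.
BAD_XML = b"""\
<root>
  <StationName>BaseDispensing</StationName>
</root>
"""

relaxng = etree.RelaxNG(etree.parse(BytesIO(RNG_SOURCE)))

good_valid = relaxng.validate(etree.parse(BytesIO(GOOD_XML)))
bad_valid = relaxng.validate(etree.parse(BytesIO(BAD_XML)))

print(good_valid)
print(bad_valid)
# error_log describes the most recent validation failure
for entry in relaxng.error_log:
    print(entry.message)
```

A schema written in RNC would first need converting to RNG, for example with trang as mentioned above, before passing it to etree.RelaxNG in this way.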
participants (2)
- Duane Kaufman
- Stefan Behnel