ReportLab are proud to present pyRXP version 0.7 - probably(*) the fastest validating XML parser available for Python.

http://www.reportlab.com/xml/pyrxp.html

RXP is a very fast and fully compliant validating XML parser written by Richard Tobin of the University of Edinburgh, Language Technology Group. pyRXP is a wrapper around it which constructs a lightweight in-memory "tuple tree" in a single call. This structure is the lightest one we could define in Python, and it is constructed entirely in C code, resulting in unprecedented speed; the memory footprint is also several times more compact than DOM Node objects in either Python or Java. The deployment is a single Python extension module of approximately 100 KB.

pyRXP, like RXP, is licensed under the General Public License. Commercial licenses are available from ReportLab for situations where the GPL is not appropriate, such as embedding in closed source products.

This is not a full DOM implementation. But if you need to get XML data into memory, we think it will do what 90% of people want, in 10% of the time. And with validation.

Enjoy!

Andy Robinson
CEO/Chief Architect, ReportLab Inc.

(*) We have been informed that Daniel Veillard's Python wrapper for libxml2 may be a contender, but we have not been able to run a comparable benchmark. No parser using Python SAX events even comes close.
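
For readers unfamiliar with the "tuple tree" mentioned above, here is a minimal sketch of how such a structure can be walked in plain Python. It assumes the four-element layout described in the pyRXP documentation, (tagName, attributes, children, spare), where children is a list mixing strings and nested tuples; the announcement itself does not spell this out, so treat the layout as an assumption. The tree is built by hand so the example runs without pyRXP installed, and the helper names iter_elements and text_of are purely illustrative, not part of pyRXP.

```python
# Hand-built tree in the ASSUMED pyRXP layout:
#   (tagName, attributes-or-None, children-or-None, spare)
# With pyRXP installed, a tree like this would typically come from
# something like pyRXP.Parser().parse(xml_text) (not exercised here).
tree = ("order", {"id": "42"}, [
    "\n  ",
    ("item", {"sku": "A1"}, ["Widget"], None),
    "\n  ",
    ("item", {"sku": "B2"}, ["Gadget"], None),
    "\n",
], None)


def iter_elements(node, tag):
    """Yield every element tuple named `tag`, depth first (hypothetical helper)."""
    name, _attrs, children, _spare = node
    if name == tag:
        yield node
    for child in children or []:
        # Children are either plain strings (text) or nested element tuples.
        if isinstance(child, tuple):
            yield from iter_elements(child, tag)


def text_of(node):
    """Concatenate all text beneath an element (hypothetical helper)."""
    _name, _attrs, children, _spare = node
    return "".join(
        child if isinstance(child, str) else text_of(child)
        for child in children or []
    )


# Collect (sku, text) pairs for every <item> element.
items = [((n[1] or {}).get("sku"), text_of(n)) for n in iter_elements(tree, "item")]
print(items)
```

Because the whole document is just nested tuples, lists, dicts and strings, ordinary Python unpacking and recursion are enough to traverse it; no node classes or accessor methods are involved.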