RE: Python-Dev digest, Vol 1 #1637 - 11 msgs

Lucky I tuned in. Reportlab has had great success with RXP. We have a Python wrapper, pyRXP, with binaries available for several platforms.

RXP is GPLed at present. Its authors wish to keep the GPL just in case someone big comes along and wants their code for ten million set-top boxes or something. However, I persuaded them to grant a license that lets it be used through the Python binding under Python-like terms, as long as we invent the words and save them having to waste time on lawyers. They would even be happy for it to go into the Python distribution. And we're happy to maintain the wrapper and binaries for several platforms, which we have to do for our customers anyway.

If one of the core Python team, who I know have long and painful experience of this stuff, would like to drop me a line, we can probably sort this out in a night.

The other thing we found very useful was our representation. We make reports, and XML is a common data source, so our goal is typically to slurp XML into memory as fast as possible, with validation. We eventually hit on a 'tuple tree': each tag is represented as

    (tagname, attrs, list-of-children, spare)

We get there about 6x faster than the fastest alternative parser we know, because all the work is done in C; with typical use of other parsers you call back into Python on every tag. The tree structure takes a fraction of the memory of what gets created by models that use an object for every node.

It would be very easy to add this as an alternative interface to expat as well. Then Python users could have a choice of tree or events, and validating or non-validating, all done in C and in the standard distribution.

Andy Robinson
CEO/Chief Architect, Reportlab Inc.
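
P.S. For anyone who wants to see the shape of the 'tuple tree' described above, here is a minimal sketch. It is not pyRXP and does none of the work in C: it builds the same (tagname, attrs, list-of-children, spare) tuples from the standard library's xml.etree.ElementTree, so it shows the data structure only, not the speed or the validation. The helper names to_tuple_tree, parse and find_all are hypothetical, and the choices of a dict for attrs, plain strings for text children and None for the spare slot are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def to_tuple_tree(elem):
    """Convert an ElementTree element into (tagname, attrs, children, spare)."""
    children = []
    if elem.text:
        # Text before the first child element
        children.append(elem.text)
    for child in elem:
        children.append(to_tuple_tree(child))
        if child.tail:
            # Text between this child and the next sibling
            children.append(child.tail)
    # Assumption: the fourth ("spare") slot is left as None here
    return (elem.tag, dict(elem.attrib), children, None)


def parse(xml_text):
    """Parse an XML string into a tuple tree (non-validating sketch)."""
    return to_tuple_tree(ET.fromstring(xml_text))


def find_all(node, tagname):
    """Yield every tuple node (including the root) whose tag is tagname."""
    name, attrs, children, spare = node
    if name == tagname:
        yield node
    for child in children:
        if isinstance(child, tuple):  # skip plain text children
            yield from find_all(child, tagname)


tree = parse('<report title="Q1"><row id="1">alpha</row><row id="2">beta</row></report>')
rows = list(find_all(tree, "row"))
```

Walking the result needs nothing beyond tuple unpacking and indexing: tree[0] is the tag name, tree[1] the attributes, tree[2] the children, which is why the structure stays small compared with one object per node.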