[Python-Dev] Performance of various marshallers

Paul Prescod paul@ActiveState.com
Tue, 02 Oct 2001 18:11:47 -0700


Skip Montanaro wrote:
> 
>...
> 
> Bad analogy.  CGI scripts can contain the entire realm of "stuff" that goes
> into any other Python program.  XML-RPC encodings can't contain arbitrary
> XML tags or attributes.  A better analogy would have been (Martin's I think)
> hypothetical Swallow - a subset of Python that could be efficiently
> compiled.

But there is no evidence that this subset of XML can be parsed more
efficiently than any other. XML parsing consists primarily of recognizing
angle brackets and a few other characters, and passing around some extra
data. Any speed advantage a "simplified" XML parser has over a "full" one
will shrink as people submit bug reports that force the simplified parser
to conform to the XML spec (Unicode, CDATA, etc.).
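
To make the conformance point concrete, here is a minimal sketch in
Python. The payloads and both helper functions (naive_string_value,
expat_string_value) are hypothetical and written only for illustration;
they are not taken from pyxmlrpc or any other package. It contrasts a
regex-based "simplified" reader with Expat on input that is legal XML
but uses whitespace after the tag name, a CDATA section, and UTF-16.

```python
import re
import xml.parsers.expat

# Hypothetical payload: legal XML with whitespace after the tag name
# and a CDATA section inside the element.
DOC = "<string >a<![CDATA[<b>]]>c</string >"

# Hypothetical payload: the same kind of element encoded as UTF-16.
# Python's "utf-16" codec prepends a byte order mark.
UTF16_DOC = '<?xml version="1.0" encoding="utf-16"?><string>\u00e9</string>'.encode("utf-16")


def naive_string_value(doc):
    """A 'simplified' reader: assumes the tag is spelled exactly <string>."""
    match = re.search(r"<string>(.*?)</string>", doc, re.S)
    return match.group(1) if match else None


def expat_string_value(doc):
    """Collect the character data of the document using Expat."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    parser.Parse(doc, True)  # True marks this as the final chunk
    return "".join(chunks)


if __name__ == "__main__":
    print(repr(naive_string_value(DOC)))
    print(repr(expat_string_value(DOC)))
    print(repr(expat_string_value(UTF16_DOC)))
```

The naive reader finds nothing in DOC because of the space before ">",
and it would need further special cases for CDATA and for encodings.
Each such fix moves it closer to doing the work a full parser already
does.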

I strongly agree that a dedicated XML-RPC implementation written in C can
be faster than one built on Python and Expat. I haven't yet seen
evidence that you can both conform to the standards and get much of a
speedup over an implementation that is built on a fast XML parser, such
as Eric Kidd's XML-RPC C or xmlrpc-epi (both on SourceForge).

>...
> Paul, you have to stop looking at XML-RPC with your Elton John-style
> XML-colored glasses.  XML-RPC is not meant to be some sort of highly
> structured hierarchical data representation that you can sniff around in
> with arbitrary XML tools of one sort or another. That its on-the-wire
> representation happens to be XML is almost ridiculously unimportant.  

XML-RPC uses XML for exactly the same reason every other application of
XML uses XML: so that you do not have to write yet another
parser for it. That's the central reason *for* XML. That's the only
advantage XML has over cPickle -- that whatever language you use, you
can be sure an XML parser is already available for it.

> Fine.  I'm sure Shilad appreciates the input.  I think your 
> approach to bug detection and reporting could have been a bit 
> less heavy handed.

I'm not trying to embarrass Shilad. The software isn't at 1.0 yet. Maybe
he hasn't got around to choosing an XML parser. 

I'm trying to point out (more to you, than to him!) that there is a good
reason to build on the work other people have done. If pyxmlrpc is
faster today it is probably because it doesn't conform to the specs.
When it does conform, it won't be faster anymore.

> As for handling things like CDATA, UTF-16 and extra whitespace after tag
> names, I suspect some other XML-RPC packages would exhibit similar problems
> if they were exposed to a standards-toting XML gunslinger like yourself.
> That it's not a problem in practice is probably because the set of XML-RPC
> encoding and decoding software is fairly small and that the stuff that
> encodes into XML-RPC is fairly well-behaved.

Every XML-RPC implementation I have ever used (Python, Perl, C, C++,
PHP) is based upon one pure XML parser or another. Most use Expat.
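
As an illustration of what building on a real parser buys, the sketch
below assumes a current Python, where the standard library's XML-RPC
code lives in xmlrpc.client (it was xmlrpclib at the time of this
thread) and, as far as I know, parses with Expat by default. The
response payload is made up for the example; it puts whitespace after
tag names and wraps the string value in a CDATA section.

```python
import xmlrpc.client

# Hypothetical XML-RPC response: legal XML, but with whitespace after
# tag names and a CDATA section around the string value.
RESPONSE = (
    '<?xml version="1.0"?>\n'
    "<methodResponse >\n"
    "<params><param><value>"
    "<string ><![CDATA[1 < 2]]></string >"
    "</value></param></params>\n"
    "</methodResponse >\n"
)

# loads() returns a (params, methodname) pair; methodname is None
# for a response.
params, method = xmlrpc.client.loads(RESPONSE)

if __name__ == "__main__":
    print(params, method)
```

The unmarshalling code never sees the CDATA markers or the stray
whitespace, because the underlying XML parser has already dealt with
them.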

 Paul Prescod