[XML-SIG] speed question re DOM parsing

Greg Stein gstein@lyra.org
Fri, 23 Jun 2000 04:14:11 -0700

On Fri, Jun 23, 2000 at 12:12:08PM +0200, Juergen Hermann wrote:
> On Thu, 22 Jun 2000 19:41:53 -0700, Greg Stein wrote:
> >Exactly. Bjorn solved this with StringIO. A timing comparison against
> >string.join is an important test before using either approach.
> The two runs I gave it (on Win/NT)...
> Length of testtext is 1292
>               adding    39.687
>               format    189.71
>                 join    47.034
>            chararray    67.323
>             stringio    33.011
> Length of testtext is 1292
>               adding    40.573
>               format    191.327
>                 join    47.09
>            chararray    65.256
>             stringio    32.65
> The result is obvious, and also what I expected.

well... not so obvious. You're appending characters. I commented out all but
the join and stringio tests, cut the iterations down some, and changed
testtext to read:

testtext = ['x'*1000] * 100

That produced the following numbers:

                join    3.42
            stringio    4.67

Changing testtext to "testtext = ['x'*100] * 1000" produced:

                join    12.52
            stringio    10.35

In other words, the fastest mechanism depends on the length of the input
pieces: join wins with 1000-character pieces, stringio wins with
100-character pieces. The break-even point seems to be right around 500
characters per piece in my off-the-cuff tests.
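
For anyone who wants to repeat the comparison, here is a minimal sketch of
such a timing run. It is not the original test script (which isn't shown in
this thread). The function names, the iteration count, and the use of
io.StringIO as a stand-in for cStringIO on current Pythons are all
assumptions. Absolute numbers will differ by machine and interpreter, so
only the relative ordering is of interest.

```python
import timeit
from io import StringIO  # assumption: modern stand-in for cStringIO


def build_with_join(pieces):
    """Concatenate the pieces with a single join call."""
    return "".join(pieces)


def build_with_stringio(pieces):
    """Concatenate the pieces by writing each one into a StringIO buffer."""
    buf = StringIO()
    for piece in pieces:
        buf.write(piece)
    return buf.getvalue()


def compare(pieces, number=200):
    """Return {name: seconds} for both approaches over `number` runs."""
    results = {}
    for name, func in (("join", build_with_join),
                       ("stringio", build_with_stringio)):
        results[name] = timeit.timeit(lambda: func(pieces), number=number)
    return results


if __name__ == "__main__":
    # The two input shapes used in the message: few long pieces, many short ones.
    for testtext in (["x" * 1000] * 100, ["x" * 100] * 1000):
        timings = compare(testtext)
        print("pieces=%d, piece length=%d" % (len(testtext), len(testtext[0])))
        for name, seconds in timings.items():
            print("%20s    %.4f" % (name, seconds))
```

Varying the piece length between the two shapes is one way to look for the
crossover on a given machine.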

I think that I'd choose cStringIO when present; otherwise choose .join().

Unfortunately, the code would get ugly for that, so it really means going
with one pattern. Assuming that cStringIO is always present is probably best
(it is enabled by default). The plain StringIO package uses .join, so that
is a nice fallback.
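
One way to keep a single pattern while still preferring cStringIO is to
hide the choice in the import. A rough sketch follows. The concat() helper
is hypothetical, and the final fallback to io.StringIO is an assumption
added so that the snippet also runs on Python 3, where neither older module
exists.

```python
try:
    from cStringIO import StringIO        # C implementation, when available
except ImportError:
    try:
        from StringIO import StringIO     # pure-Python fallback
    except ImportError:
        from io import StringIO           # assumption: Python 3 equivalent


def concat(pieces):
    """Hypothetical helper: join string pieces through whichever StringIO loaded."""
    buf = StringIO()
    for piece in pieces:
        buf.write(piece)
    return buf.getvalue()
```

The calling code only ever sees the write()/getvalue() interface, so it
does not need to know which implementation was picked up.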

oh... and regarding the patch: adding a __getattr__ to the element seems
wrong. I'd recommend instantiating a StringIO in start() and placing it into
the elem instance as _buf. On a call to end(), do a getvalue(), store the
value into first_cdata, and toss the object. (It has to be tossed, since
there isn't a common way to "reset and truncate" a StringIO.)
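
Roughly, the start()/end() suggestion above could look like the following
sketch. The Element and Builder classes and the data() method are
hypothetical stand-ins for the real code in the patch. Only start(), end(),
_buf and first_cdata come from the discussion. The sketch simply collects
all text written while an element is the innermost open one; the exact
semantics of first_cdata in the real code may differ.

```python
from io import StringIO  # assumption: stand-in for cStringIO.StringIO


class Element:
    """Hypothetical element node; only the fields needed for this sketch."""

    def __init__(self, name):
        self.name = name
        self.first_cdata = ""
        self._buf = None  # live only between start() and end()


class Builder:
    """Hypothetical builder driven by start/data/end parser callbacks."""

    def __init__(self):
        self.stack = []

    def start(self, name):
        elem = Element(name)
        elem._buf = StringIO()            # fresh buffer per element
        self.stack.append(elem)

    def data(self, text):
        if self.stack:
            self.stack[-1]._buf.write(text)

    def end(self, name):
        elem = self.stack.pop()
        elem.first_cdata = elem._buf.getvalue()
        elem._buf = None                  # toss the buffer rather than reuse it
        return elem
```

With this shape no __getattr__ hook is needed on the element: first_cdata
is an ordinary attribute once end() has run.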


Greg Stein, http://www.lyra.org/