[Expat-discuss] [ANN] VTD-XML Version 1.5 Released

Karl Waclawek karl at waclawek.net
Mon Mar 6 21:41:06 CET 2006


Jimmy Zhang wrote:
> VTD is not DOM compressed...
> VTD-XML significantly outperforms Expat with NULL content handler,
Which benchmark shows that result?
I ran three test files through benchmark.java and benchmark_BR.java,
and through Expat's own benchmark (no handlers, 256k working buffer).
Environment: Windows XP, JRE 1.5.0_04 (client VM, I don't have the
server VM), Athlon 2500, 1GB RAM.

File 1: 1.1 MB, UTF-8, lots of markup:

benchmark.java: 67ms
benchmark_BR.java: 51ms
Expat: 53ms (54ms with namespace processing on)

File 2: 1.7MB, UTF-16, lots of PCDATA:

benchmark.java: 33ms
benchmark_BR.java: 32ms
Expat: 17.5ms (17.8ms with namespace processing on)

File 3: 198KB, ISO-8859-1 (REC-xml-20001006.xml, the XML specification
as an XML file):

VTD-XML failed to parse the XML specification (which is itself an XML
document), with this error:

  Not wellformed -->com.ximpleware.ParseException: Other Error:
  Unrecognized ch after <! Line Number: 16 Offset: 3

The Expat benchmark allocates a new parser for each iteration. Reusing
a single parser via XML_ParserReset() would optimize this further, but
it won't make much of a difference for files of that size.
Also, at this point, Expat has already converted all input characters
to UTF-8.
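
For anyone who wants to repeat this kind of measurement, here is a
rough sketch of a handler-free timing loop. It is not the benchmark
program that ships with Expat: it goes through Python's
xml.parsers.expat binding, so absolute numbers will not be comparable
to the ones above. The function name time_parse and the sample
document are made up for illustration, the 256k working buffer is not
modelled, and the sketch creates a fresh parser per iteration rather
than attempting XML_ParserReset()-style reuse.

```python
import time
import xml.parsers.expat


def time_parse(data, iterations=100, namespaces=False):
    """Return the average seconds per parse with no handlers registered."""
    start = time.perf_counter()
    for _ in range(iterations):
        # A new parser per iteration, like the benchmark described above.
        if namespaces:
            # Passing a separator switches on namespace processing.
            parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
        else:
            parser = xml.parsers.expat.ParserCreate()
        # Feed the whole buffer in one call and mark it as the final chunk.
        parser.Parse(data, True)
    return (time.perf_counter() - start) / iterations


# Hypothetical stand-in document; substitute the bytes of a real test file.
sample = (
    b"<?xml version='1.0'?><root xmlns:a='urn:x'>"
    + b"<a:item>text</a:item>" * 1000
    + b"</root>"
)

plain = time_parse(sample)
with_ns = time_parse(sample, namespaces=True)
print(f"no namespaces: {plain * 1000:.3f} ms")
print(f"namespaces on: {with_ns * 1000:.3f} ms")
```

Input that is not well-formed makes Parse() raise
xml.parsers.expat.ExpatError, so a file that fails to parse shows up
as an exception rather than as a misleadingly fast timing.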

I did not see VTD-XML outperform Expat at all, let alone in a
*significant* way.
Still, its performance is quite good (in a benchmark). Practical
applications may show a different picture once you actually have to
access the content.

Of course there are still questions:
- How many test cases in the XML Test-Suite does VTD-XML pass?
- How complete is DTD processing?

Karl


