[XML-SIG] Replacing a Java tool-chain with 4Suite?
Mike C. Fletcher
mcfletch@rogers.com
Wed, 15 Jan 2003 21:40:04 -0500
Well, I eventually tracked down some of the problems (with Uche's help).
There were missing /'s in the tags of a set of files (which were included
about 4 times via entity references). This gave "unbalanced tag"
errors when running the 4xslt transform, but the (included) file name was
not reported correctly (the error was attributed to the root
file, so the line number was just weird, pointing into the middle of the
DTD). A simple Python script that loaded each file manually gave me the
names of the files that were causing the problems. If I'm correct,
these missing ETAGs were being ignored by Saxon because they were
reducible markup (in SGML parlance, don't know what it's called with
XML), while the Python parsers required well-formed markup.
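For anyone hitting the same thing, the per-file check described above can be sketched roughly as follows. This is not the script actually used; it is a minimal sketch in present-day Python using the standard library's expat binding. The names check_fragment and report are made up for illustration, and it assumes the included files are bare fragments (no text declaration, and no references to entities defined only in the DTD, which expat would flag as undefined here).

```python
from xml.parsers import expat


def check_fragment(text):
    """Return None if text is well-formed XML content, else (line, column, message)."""
    parser = expat.ParserCreate()
    try:
        # Hypothetical wrapper element: it lets a fragment with several
        # top-level elements parse as one document. It sits on the same
        # line as the fragment's first line, so line numbers match the file.
        parser.Parse("<wrapper>" + text + "</wrapper>", True)
    except expat.ExpatError as err:
        return err.lineno, err.offset, expat.ErrorString(err.code)
    return None


def report(paths):
    """Check each file on its own and return {path: (line, column, message)}."""
    problems = {}
    for path in paths:
        # Assumes UTF-8 files; adjust the encoding if yours differ.
        with open(path, encoding="utf-8") as handle:
            result = check_fragment(handle.read())
        if result is not None:
            problems[path] = result
    return problems
```

Calling report() with the list of included files gives the name of each offending file together with a line number inside that file, rather than one inside the root document.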
When presented with the (opaque) error reports from 4xslt, I'd enabled
validation in the hope that the error report would kick out more
information, and in doing so tickled the DocBook-validation problem, and
thus got side-tracked. (xmlproc_val does report errors when processing the
DocBook DTD; I've sent a message to the DocBook list to that effect, as
requested.)
Regarding speed, I must be misusing something, or there may be a
problem with our XSLT stylesheet. I let the command:
4xslt -o test.xml doc\manual\manual.xml doc\xsl\merge.xsl
run for close to three hours before killing it. It was running in
constant memory (about 45MB), with 100% CPU utilisation for the entire
run. Simply loading and pretty-printing the Python-specific template
files (which are smaller than the original API docs, but not 100s of
times smaller) only takes seconds (well, maybe a minute), so I'm
thinking there's something getting into an infinite loop during the
processing.
Given the simplicity of the transformation in this case (just a merge by
section name!) I may write the darned thing in Python to save time (I
started out around 22 hours ago saying to myself "oh, guess I should
regenerate the manual before I start working on the web-site" :) ).
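A merge by section name could look something like the sketch below. It is only an illustration under stated assumptions, not the behaviour of merge.xsl: the function name merge_by_section_name, the "section" element, and the "id" attribute used as the section name are all hypothetical, and sections are assumed not to be nested.

```python
import xml.etree.ElementTree as ET


def merge_by_section_name(base_root, extra_root, tag="section", key="id"):
    """Append the children of each extra section to the base section of the same name."""
    # Index the base document's sections by name (assumed to be unique).
    by_name = {
        sec.get(key): sec
        for sec in base_root.iter(tag)
        if sec.get(key) is not None
    }
    for sec in extra_root.iter(tag):
        target = by_name.get(sec.get(key))
        if target is not None:
            # Sections with no counterpart in the base document are skipped.
            target.extend(list(sec))
    return base_root
```

Both roots would come from ET.parse(...).getroot(), and the merged tree can be written back out with ET.ElementTree(base_root).write(...).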
Anyway, thanks all for your efforts,
Mike
Mike Brown wrote:
>Dieter Maurer wrote:
>
>>About 18 months ago, I switched from 4Suite to Saxon, because:
>>
>> * 4Suite processed DocBook/XML documents with Norman Walsh's XSLT
>> stylesheets more than 10 times slower than Saxon.
>>
>>
>
>This was resolved in July 2002, in time for the 0.12.0a3 release.
>http://lists.fourthought.com/pipermail/4suite/2002-July/003918.html
>
>4xslt is now faster than Saxon in some cases, slower in others. For a quick
>check I ran just now, using DocBook XSLT 1.55, 4Suite took 18 secs while Saxon
>took 11, on my machine. In current CVS we've got a couple of things that are
>still in flux and that are inflating our processing time, so I expect the gap
>will narrow a bit by the time 0.12.0b1 is released.
>
>
...
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/