[XML-SIG] Bug in exception handling?
Rob Hooft
r.hooft@euromail.net
Thu, 24 Jun 1999 14:37:23 +0200 (MZT)
>>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:
FL> Rob Hooft <r.hooft@euromail.net> wrote:
>> Bypassing sax altogether and using pyexpat directly reduces parsing
>> time by 40%. 45 seconds on a "moderately sized" file (some of my
>> clients have files that are going to be 20 times bigger still,
>> i.e. 60MB of XML) is still quite long, so I'll need to speed it
>> up a bit more to make it really usable.
FL> with a little luck, you might be able to use sgmlop instead
FL> (it cannot handle all possible XML constructs yet, but it
FL> might work on your material).
FL> here's a simple benchmark, run on an old 200 MHz pentium
FL> box, under NT:
FL> > dir big.xml
FL> 99-06-24 13:47 62 078 532 big.xml
FL> > python benchxml.py big.xml
FL> sgmlop/null parser: 8.567 seconds; 7246131 bytes per second
FL> sgmlop/dummy parser: 51.943 seconds; 1195134 bytes per second
FL> ^C
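A rough way to get a comparable "null parser" figure with the expat binding is sketched below. It is only a guess at the shape of such a benchmark: benchxml.py is not shown in this thread, the sketch uses the standard library's xml.parsers.expat rather than sgmlop, and `bench_null_parser` is a made-up name.

```python
import time
import xml.parsers.expat


def bench_null_parser(data):
    """Parse `data` (bytes) with no handlers installed.

    Returns (elapsed_seconds, bytes_per_second). With no handlers set,
    this measures tokenizing only, not any Python callback overhead.
    """
    parser = xml.parsers.expat.ParserCreate()
    start = time.perf_counter()
    parser.Parse(data, True)  # True marks this as the final chunk
    elapsed = time.perf_counter() - start
    rate = len(data) / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        seconds, rate = bench_null_parser(f.read())
    print("expat/null parser: %.3f seconds; %d bytes per second" % (seconds, rate))
```

Installing even empty start/end handlers on the parser would show how much of the time goes into calling back into Python, which is presumably what the "dummy parser" line measures.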
I'm using a 200MHz pentium as well, but I think the biggest problem
is the kind of data I'm handling. It is mostly numerical. We're
still working on the DTD, but I can show you a typical fragment:
...
<REFLECTION NR="14" BATCH="1">
  <INDEX H="-7" K="-3" L="7"/>
  <INTENSITY I="8384.55" SIGMA="25.05"/>
  <IMPACT HOR="-5.24" VER="-20.09" ROT="-163.146"/>
</REFLECTION>
<REFLECTION NR="15" BATCH="1">
  <INDEX H="-9" K="-3" L="8"/>
  <INTENSITY I="40.61" SIGMA="4.05"/>
  <IMPACT HOR="0.608" VER="-23.893" ROT="-163.24"/>
  <FLAG>
    <WEAK/>
  </FLAG>
</REFLECTION>
<REFLECTION NR="16" BATCH="1">
  <INDEX H="-4" K="5" L="2"/>
  <INTENSITY I="66.57" SIGMA="2.5"/>
  <IMPACT HOR="-9.787" VER="10.048" ROT="-163.12"/>
</REFLECTION>
...
I think that with any parser, a large part of my time will be spent in
atof() and atoi(), converting the attribute values. I'll try sgmlop as
soon as I can.
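To make the conversion cost concrete, here is a minimal sketch of reading the fragment above through the expat binding and turning every attribute into a number. The flat per-reflection dict, the `INT_ATTRS` set, and the name `parse_reflections` are assumptions made for illustration (the DTD is not final), and the sketch uses today's xml.parsers.expat module with int() and float() standing in for atoi() and atof().

```python
import xml.parsers.expat

# Assumed split between integer and floating-point attributes,
# guessed from the sample fragment only.
INT_ATTRS = {"NR", "BATCH", "H", "K", "L"}


def parse_reflections(data):
    """Return one dict per <REFLECTION>, with numeric attribute values."""
    reflections = []
    current = None    # dict for the reflection being read, if any
    in_flag = False   # True while inside a <FLAG> element

    def start(name, attrs):
        nonlocal current, in_flag
        if name == "REFLECTION":
            current = {"FLAGS": []}
        elif current is None:
            return  # element outside any reflection, e.g. the root
        elif name == "FLAG":
            in_flag = True
            return
        elif in_flag:
            current["FLAGS"].append(name)  # e.g. "WEAK"
            return
        # One conversion per attribute: this is where the time would go.
        for key, value in attrs.items():
            current[key] = int(value) if key in INT_ATTRS else float(value)

    def end(name):
        nonlocal current, in_flag
        if name == "REFLECTION":
            reflections.append(current)
            current = None
        elif name == "FLAG":
            in_flag = False

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(data, True)
    return reflections
```

Each reflection in the sample carries ten or so numeric attributes, so a file of this kind triggers a conversion call for nearly every attribute it contains. Timing the same run with the conversion line replaced by a plain assignment would show how much of the total the conversions account for.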
Rob
--
===== R.Hooft@EuroMail.net http://www.xs4all.nl/~hooft/rob/ =====
===== R&D, Nonius BV, Delft http://www.nonius.nl/ =====
===== PGPid 0xFA19277D ========================== Use Linux! =========