[XML-SIG] Pyana 0.2.0 released
Brian Quinlan
brian@sweetapp.com
Mon, 17 Dec 2001 18:41:10 -0800
Uche Ogbuji wrote:
> > PIRXX is focused on providing Xerces XML services to Python. The current
> > release of PIRXX provides SAX2 interfaces but I believe that Jürgen is
> > working on DOM support.
> >
> > So, right now, Pyana is probably your best bet for high-performance XSLT
> > processing in Python while PIRXX offers Xerces SAX2 interfaces.
>
> Are you basing this on actual benchmarks? In particular, I'd be surprised
> if Pyana was faster overall than current CVS of 4XSLT, since Xalan isn't,
> as I measure it.
I am basing this on the timings of largish transformations that I was
doing around 2 months ago. Since then I haven't really compared them and
I have never run any formal benchmarks.
Note that one of the big problems with timing Xalan from the command
line is that it is very slow to load, especially on Windows. I just
timed "import Pyana" on my PIV 1.7GHz and it took 0.74s. But the beauty
of using Pyana instead of something like "popen('xalan ..." is that the
load time becomes a one-time cost for the application.
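A rough way to see the one-time cost described above, written as a Python 3 sketch rather than the Python 2.1 of this release. The `timed_import` helper is hypothetical, and the standard-library `json` module is only a stand-in for Pyana so the snippet runs without Pyana installed. The second call finds the module already in `sys.modules`, which is why a long-running application pays the load time once.

```python
import importlib
import sys
import time


def timed_import(module_name):
    """Import module_name and return (module, elapsed_seconds)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


# "json" is a stand-in for Pyana (assumption: Pyana may not be installed here).
_, first = timed_import("json")
_, second = timed_import("json")  # served from sys.modules, nothing is reloaded
print("first import:  %0.6fs" % first)
print("second import: %0.6fs" % second)
```

By contrast, spawning a command-line processor with popen for every transformation would pay the process start-up and library load on each call.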
For fun, I just downloaded:
http://www.datapower.com/XSLTMark/download/XSLTMark_2_1_0.zip
And wrote the attached script. I did this without expending any effort
trying to understand the benchmark suite; I just test each .xml/.xsl
pair. Notice that all of the source/stylesheet documents are small so
the advantage should go to 4suite.
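The pairing step ("test each .xml/.xsl pair") can be sketched independently of Pyana. This is a Python 3 variant that uses set intersection instead of walking a sorted directory listing; the function name `find_test_pairs` and the sample file names are made up for illustration.

```python
import os


def find_test_pairs(filenames):
    """Return sorted base names that have both a .xml and a .xsl file."""
    xml_names = set()
    xsl_names = set()
    for filename in filenames:
        name, ext = os.path.splitext(filename)
        if ext == ".xml":
            xml_names.add(name)
        elif ext == ".xsl":
            xsl_names.add(name)
    # Only names present in both sets form a runnable test case.
    return sorted(xml_names & xsl_names)


# Hypothetical directory listing; "orphan.xsl" has no matching source document.
files = ["axis.xml", "axis.xsl", "README.txt", "orphan.xsl",
         "queens.xsl", "queens.xml"]
print(find_test_pairs(files))  # prints ['axis', 'queens']
```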
I don't want to get 4suite from CVS so why don't you get Pyana:
http://prdownloads.sourceforge.net/pyana/Pyana-0.2.0.win32-py2.1.exe
(very easy Windows installer for Python 2.1 [you can probably figure out
how to transform that URL for Python 2.0 ;-)])
Then you can run this script against Pyana and (with a few tweaks)
against 4suite. Here are Pyana's results on my machine:
C:\Dev\Me\Pyana\pyana\Test>python benchmark.py
time to import Pyana: 0.0785s # Cached by Windows?
time to execute axis: 0.0080s (675 bytes of output)
time to execute bottles: 0.0145s (12075 bytes of output)
time to execute brutal: 0.0156s (4191 bytes of output)
time to execute chart: 0.0110s (3837 bytes of output)
time to execute current: 0.0060s (320 bytes of output)
time to execute game: 0.0080s (457 bytes of output)
time to execute html: 0.0066s (504 bytes of output)
time to execute identity: 0.0043s (218 bytes of output)
time to execute inventory: 0.0088s (2070 bytes of output)
time to execute metric: 0.0132s (640 bytes of output)
time to execute number: 0.0074s (788 bytes of output)
time to execute oddtemplate: 0.0071s (173 bytes of output)
time to execute priority: 0.0083s (587 bytes of output)
time to execute products: 0.0085s (439 bytes of output)
time to execute queens: 0.0900s (1772 bytes of output)
time to execute tower: 0.1555s (70729 bytes of output)
time to execute trend: 0.0513s (8382 bytes of output)
time to execute union: 0.0058s (128 bytes of output)
time to execute xpath: 0.0062s (225 bytes of output)
time to execute xslbench1: 0.0204s (7011 bytes of output)
Cheers,
Brian
Attachment: benchmark.py
import time
import os

# Find all .xsl/.xml pairs
testcase_directory = r'C:\Dev\Me\Pyana\xsltmark\testcases'
testcase_files = os.listdir(testcase_directory)
testcase_files.sort()

# After sorting, name.xml comes immediately before name.xsl, so a pair
# is detected when a .xsl file follows the .xml file with the same name.
last_name = None
tests = []
for file in testcase_files:
    name, ext = os.path.splitext(file)
    if ext == '.xsl':
        if last_name == name:
            tests.append(name)
        last_name = None
    elif ext == '.xml':
        last_name = name
    else:
        last_name = None

# Time the import separately so load time is not charged to the first test
print 'time to import Pyana:',
startTime = time.clock()
import Pyana
print "%0.4fs" % (time.clock() - startTime)

# Time one complete transformation (parse, compile, transform) per pair
for test in tests:
    print 'time to execute %s:' % test,
    startTime = time.clock()
    length = len(Pyana.transformToString(
        source = Pyana.URI(os.path.join(testcase_directory, test + '.xml')),
        style = Pyana.URI(os.path.join(testcase_directory, test + '.xsl'))))
    print "%0.4fs (%d bytes of output)" % (time.clock() - startTime, length)