[Python-Dev] Where the speed is lost! (was: 1.6 speed)
Christian Tismer
tismer@trixie.triqs.com
Mon, 24 Apr 2000 16:01:26 +0200
> Christian Tismer wrote:
> >
> > "A.M. Kuchling" wrote:
> > >
> > > Python 1.6a2 is around 10% slower than 1.5 on pystone.
> > > Any idea why?
...
> > Stackless 1.5.2+ is 10 percent faster than Stackless 1.6a2.
> >
> > Claim:
> > This is not related to ceval.c .
> > Something else must have introduced a significant speed loss.
I guess I can explain now what's happening, at least
for the Windows platform.
Python 1.5.2's .dll was about 512K, slightly more.
If I remember correctly, 512K is a common size for the
secondary cache.
Now, linking with the MS linker does not give you any
particularly useful order of modules. When I look into
the map file, the modules appear sorted by name.
This surely does not give optimum performance.
As I read the docs, explicit ordering of the linkage
would only make sense for C++ and wouldn't work out
for C, since we could order the exported functions but
not the private ones, which would put even more distance
between related code.
To test whether I might be right, I did this:
I ripped out almost all builtin extension modules
and compiled/linked without them. This shrunk
the dll size down from 647K to 557K, very close
to the 1.5.2 size.
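For reference, the size arithmetic can be sketched like this.
The two dll sizes are the ones quoted above; the 512K cache
size is only the assumption made earlier in this mail, and
the variable names are made up for illustration:

```python
# Sizes in K, as quoted in the text above.
full_dll_kb = 647        # 1.6 dll with all builtin extension modules
stripped_dll_kb = 557    # 1.6 dll with almost all of them ripped out
assumed_cache_kb = 512   # ASSUMPTION: common secondary cache size

# How much the dll shrank, absolutely and relative to the full build.
saved_kb = full_dll_kb - stripped_dll_kb
saved_fraction = saved_kb / float(full_dll_kb)

# How far each build still is above the assumed cache size.
over_before_kb = full_dll_kb - assumed_cache_kb
over_after_kb = stripped_dll_kb - assumed_cache_kb

print("saved: %dK (%.1f percent)" % (saved_kb, saved_fraction * 100.0))
print("above assumed cache size before: %dK" % over_before_kb)
print("above assumed cache size after:  %dK" % over_after_kb)
```

Note that the file size includes data and headers as well as
code, so this is only a rough indication of what actually
competes for the cache.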
Now I get the following figures:
Python 1.6, with stackless patches:
D:\python\spc\Python-slp\PCbuild>python /python/lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 1.95468
This machine benchmarks at 5115.92 pystones/second
Python 1.6, from the dist:
D:\Python16>python /python/lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 2.09214
This machine benchmarks at 4779.8 pystones/second
That means my optimizations pay off again, now that
the overall code size has gone below about 512K.
I think these 10 percent are quite valuable.
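To make the comparison easy to check, here is a small sketch
that recomputes the pystone rates and the relative speedup
from the two timings quoted above. The helper names are
hypothetical; only the numbers come from the runs. For this
particular pair of runs the ratio works out to roughly 7
percent:

```python
# Timings copied from the two pystone runs quoted above.
passes = 10000
stackless_seconds = 1.95468   # Python 1.6 with stackless patches
dist_seconds = 2.09214        # Python 1.6 from the dist


def pystones_per_second(passes, seconds):
    """Benchmark rate as pystone reports it: passes divided by time."""
    return passes / seconds


def speedup(slow_seconds, fast_seconds):
    """Ratio of the slower time to the faster time (1.0 means no gain)."""
    return slow_seconds / fast_seconds


ratio = speedup(dist_seconds, stackless_seconds)

print("stackless: %.2f pystones/second"
      % pystones_per_second(passes, stackless_seconds))
print("dist:      %.2f pystones/second"
      % pystones_per_second(passes, dist_seconds))
print("speedup:   %.1f percent" % ((ratio - 1.0) * 100.0))
```

A single run on one machine is of course noisy, so the exact
percentage should not be taken too seriously.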
These options come to my mind:
a) Try to do optimum code ordering in the oversized .dll.
This seems to be hard to achieve.
b) Split the dll into two dlls in such a way that all the
necessary internal stuff sits close together in one of them.
c) Try to split the library as above, but use
a static library layout for one of them, and link the
static library into the final dll. This would hopefully
keep related things together.
I don't know if c) is possible, but it might be tried.
Any thoughts?
ciao - chris
--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com