[Python-Dev] shared data (was: Some thoughts on the codecs...)
Greg Ward
gward@cnri.reston.va.us
Tue, 16 Nov 1999 09:10:33 -0500
On 16 November 1999, Greg Stein said:
> This is the reason Python starts up so slow and has a large memory
> footprint. There hasn't been any concern for moving stuff into shared data
> pages. As a result, a process must map in a bunch of vmem pages, for no
> other reason than to allocate Python structures in that memory and copy
> constants in.
>
> Go start Perl 100 times, then do the same with Python. Python is
> significantly slower. I've actually written a web app in PHP because
> another one that I did in Python had slow response time.
> [ yah: the Real Man Answer is to write a real/good mod_python. ]
I don't think this is the only factor in startup overhead. Try looking
into the number of system calls for the trivial startup case of each
interpreter:

    $ truss perl -e 1 2> perl.log
    $ truss python -c 1 2> python.log

(This is on Solaris; I did the same thing on Linux with "strace", and on
IRIX with "par -s -SS". Dunno about other Unices.) The results are
interesting, and useful despite the platform and version disparities.
(For the record: Python 1.5.2 on all three platforms; Perl 5.005_03 on
Solaris, 5.004_05 on Linux, and 5.004_04 on IRIX. The Solaris box runs 2.6,
using the Official CNRI Python Build by Barry, and the ditto Perl build
by me; the Linux system is starship, using whatever Perl and Python the
Starship Masters provide us with; the IRIX box is an elderly but
well-maintained SGI Challenge running IRIX 5.3.)
Also, this is with an empty PYTHONPATH. The Solaris build of Python has
different prefix and exec_prefix, but on the Linux and IRIX builds, they
are the same. (I think this will reflect poorly on the Solaris
version.) PERLLIB, PERL5LIB, and Perl's builtin @INC should not affect
startup of the trivial "1" script, so I haven't paid attention to them.
First, the size of log files (in lines), i.e. number of system calls:

              Solaris   Linux   IRIX[1]
    Perl           88      85        70
    Python        425     316       257

    [1] after chopping off the summary counts from the "par" output,
        i.e. these really are the number of system calls, not the
        number of lines in the log files

Next, the number of "open" calls:

              Solaris   Linux   IRIX
    Perl           16      10      9
    Python        107      71     48

(It looks as though *all* of the Perl 'open' calls are due to the
dynamic linker going through /usr/lib and/or /lib.)
And the number of unsuccessful "open" calls:

              Solaris   Linux   IRIX
    Perl            6       1      3
    Python         77      49     32

Number of "mmap" calls:

              Solaris   Linux   IRIX
    Perl           25      25      1
    Python         36      24      1

...nope, guess we can't blame mmap for any Perl/Python startup
disparity.
How about "brk":

              Solaris   Linux   IRIX
    Perl            6      11     12
    Python         47      39     25

...ok, looks like Greg's gripe about memory holds some water.
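
For anyone who wants to reproduce counts like the ones above, here is a rough sketch of a tallying script. It is not the method used for the tables; the function name summarize_trace and the two regular expressions are my own assumptions about what strace lines ("= -1 ENOENT ...") and truss lines ("Err#2 ENOENT") look like, and "par" output would likely need a different pattern.

```python
import re
from collections import Counter

# Assumed line shape: 'open("/lib/libc.so", O_RDONLY) = 3', optionally
# preceded by a pid column such as '1234:' (truss -f) or '1234 ' (strace -f -o).
_CALL = re.compile(r'^\s*(?:\d+:?\s+)?([A-Za-z_][A-Za-z0-9_]*)\(')

# Assumed failure markers: strace prints '= -1 ERRNO', truss prints 'Err#N ERRNO'.
_FAILED = re.compile(r'=\s*-1\b|\bErr#\d+')


def summarize_trace(lines):
    """Return (calls, failed): Counters keyed by system call name."""
    calls = Counter()
    failed = Counter()
    for line in lines:
        match = _CALL.match(line)
        if match is None:
            # Skip signal notices, exit banners, and summary lines.
            continue
        name = match.group(1)
        calls[name] += 1
        if _FAILED.search(line):
            failed[name] += 1
    return calls, failed
```

Feeding it an open log file, e.g. summarize_trace(open("python.log")), gives the total as sum(calls.values()), the "open" count as calls["open"], and the unsuccessful opens as failed["open"].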
Rerunning "truss" on Solaris with "python -S -c 1" (which skips the
implicit "import site") drastically reduces the startup overhead as
measured by "number of system calls". Some quick timing experiments
also show a substantial speedup (in wall-clock time) from adding "-S":
about 37% faster under Solaris, 56% faster under Linux, and 35% faster
under IRIX. These figures should be taken with a large grain of
salt, as the Linux and IRIX systems were fairly well loaded at the time,
and the wall-clock results I measured had huge variance. Still, it gets
the point across.
Oh, also for the record, all timings were done like:

    perl -e 'for $i (1 .. 100) { system "python", "-S", "-c", "1"; }'

because I wanted to guarantee no shell was involved in the Python
startup.
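
The same shell-free timing loop can be sketched in Python, in case that is more convenient than the Perl one-liner. The helper names time_startups and percent_less_time are hypothetical, and reading "N% faster" as "N% less elapsed time" is my assumption, not something the figures above spell out.

```python
import subprocess
import sys
import time


def time_startups(argv, runs=100):
    """Return total wall-clock seconds for `runs` sequential launches of argv."""
    start = time.perf_counter()
    for _ in range(runs):
        # Passing a list with the default shell=False runs the program
        # directly, so no shell startup is included in the measurement.
        subprocess.run(argv, check=True)
    return time.perf_counter() - start


def percent_less_time(baseline, variant):
    """Percentage reduction in elapsed time of variant relative to baseline."""
    return 100.0 * (baseline - variant) / baseline
```

A comparison would then look like percent_less_time(time_startups([sys.executable, "-c", "1"]), time_startups([sys.executable, "-S", "-c", "1"])). On a loaded machine the result will vary a lot between runs, so it is worth repeating the measurement several times.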
Greg
--
Greg Ward - software developer gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive voice: +1-703-620-8990
Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913