[Python-Dev] shared data (was: Some thoughts on the codecs...)

Greg Ward gward@cnri.reston.va.us
Tue, 16 Nov 1999 09:10:33 -0500


On 16 November 1999, Greg Stein said:
> This is the reason Python starts up so slow and has a large memory
> footprint. There hasn't been any concern for moving stuff into shared data
> pages. As a result, a process must map in a bunch of vmem pages, for no
> other reason than to allocate Python structures in that memory and copy
> constants in.
> 
> Go start Perl 100 times, then do the same with Python. Python is
> significantly slower. I've actually written a web app in PHP because
> another one that I did in Python had slow response time.
> [ yah: the Real Man Answer is to write a real/good mod_python. ]

I don't think this is the only factor in startup overhead.  Try looking
into the number of system calls for the trivial startup case of each
interpreter:

  $ truss perl -e 1 2> perl.log
  $ truss python -c 1 2> python.log

(This is on Solaris; I did the same thing on Linux with "strace", and on
IRIX with "par -s -SS".  Dunno about other Unices.)  The results are
interesting, and useful despite the platform and version disparities.

(For the record: Python 1.5.2 on all three platforms; Perl 5.005_03 on
Solaris, 5.004_05 on Linux, and 5.004_04 on IRIX.  The Solaris box runs
2.6, using the Official CNRI Python Build by Barry and the equally
official Perl build by me; the Linux system is starship, using whatever
Perl and Python the Starship Masters provide us with; the IRIX box is an
elderly but well-maintained SGI Challenge running IRIX 5.3.)

Also, this is with an empty PYTHONPATH.  The Solaris build of Python has
different prefix and exec_prefix, but on the Linux and IRIX builds, they
are the same.  (I think this will reflect poorly on the Solaris
version, since a separate exec_prefix gives Python extra directories to
search at startup.)  PERLLIB, PERL5LIB, and Perl's builtin @INC should
not affect startup of the trivial "1" script, so I haven't paid
attention to them.

First, the size of log files (in lines), i.e. number of system calls:

               Solaris     Linux    IRIX[1]
  Perl              88        85      70
  Python           425       316     257

[1] after chopping off the summary counts from the "par" output -- i.e.
    these really are the number of system calls, not the number of
    lines in the log files
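
For anyone who wants to reproduce the counts, here is a minimal sketch
in present-day Python (not 1.5.2) that tallies system calls by name from
a trace log.  It assumes strace/truss-style lines that begin with the
call name followed by "(", optionally preceded by a PID; "par" output is
laid out differently and would need its own pattern.  The function name
count_syscalls is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape: optional PID, then the syscall name, then "(".
# Example: open("/etc/ld.so.cache", O_RDONLY) = 3
SYSCALL_RE = re.compile(r"^\s*(?:\d+\s+)?([A-Za-z_][A-Za-z0-9_]*)\(")


def count_syscalls(lines):
    """Return a Counter mapping syscall name -> number of trace lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:  # skip signal notices, summaries, and blank lines
            counts[match.group(1)] += 1
    return counts


# Possible usage against the logs produced above:
#     with open("python.log") as log:
#         counts = count_syscalls(log)
#     print(sum(counts.values()), counts["open"], counts["brk"])
```

Summing the counter gives the total number of calls; indexing it by
"open", "mmap", or "brk" gives the per-call figures.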

Next, the number of "open" calls:

               Solaris     Linux    IRIX
  Perl              16        10       9
  Python           107        71      48

(It looks as though *all* of the Perl 'open' calls are due to the
dynamic linker going through /usr/lib and/or /lib.)

And the number of unsuccessful "open" calls:

               Solaris     Linux    IRIX
  Perl               6         1       3
  Python            77        49      32
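
To see which paths account for the failed opens, something like the
following sketch could be run over the same logs.  It assumes a failed
call is marked either "= -1 ..." (strace style) or "Err#N ..." (truss
style); the name failed_opens is made up for illustration, and other
tracers may mark failures differently.

```python
import re

# Assumed shapes of a failed open() in the trace output:
#   strace: open("/some/path", O_RDONLY) = -1 ENOENT (No such file ...)
#   truss:  open("/some/path", O_RDONLY)          Err#2 ENOENT
OPEN_RE = re.compile(r'^\s*(?:\d+\s+)?open\("([^"]*)"')
FAIL_RE = re.compile(r"=\s*-1\b|\bErr#\d+")


def failed_opens(lines):
    """Return the paths of open() calls that the trace reports as failed."""
    paths = []
    for line in lines:
        match = OPEN_RE.match(line)
        # Only look for the failure marker after the quoted path.
        if match and FAIL_RE.search(line[match.end():]):
            paths.append(match.group(1))
    return paths


# Possible usage:
#     with open("python.log") as log:
#         for path in failed_opens(log):
#             print(path)
```

Listing the paths rather than just counting them shows where the
interpreter is looking for files that are not there.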

Number of "mmap" calls:

               Solaris     Linux    IRIX
  Perl              25        25       1
  Python            36        24       1

...nope, guess we can't blame mmap for any Perl/Python startup
disparity.

How about "brk":

               Solaris     Linux    IRIX
  Perl               6        11      12
  Python            47        39      25

...ok, looks like Greg's gripe about memory holds some water.

Rerunning "truss" on Solaris with "python -S -c 1" (-S skips the
implicit "import site" at startup) drastically reduces the startup
overhead as measured by "number of system calls".  Some quick timing
experiments show a big speedup in wall-clock time from adding "-S":
about 37% faster under Solaris, 56% faster under Linux, and 35% faster
under IRIX.  Take these figures with a large grain of salt: the Linux
and IRIX systems were fairly well loaded at the time, and the wall-clock
results I measured had huge variance.  Still, it gets the point across.

Oh, also for the record, all timings were done like:

   perl -e 'for $i (1 .. 100) { system "python", "-S", "-c", "1"; }'

because I wanted to guarantee no shell was involved in the Python
startup.
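
The same kind of measurement can be sketched in present-day Python,
again avoiding the shell by passing an argument list straight to the
child process.  This is an illustration, not the script used for the
figures above; time_startup and compare_site_cost are made-up names, and
sys.executable stands in for whichever "python" is being measured.

```python
import subprocess
import sys
import time


def time_startup(argv, runs=100):
    """Return total wall-clock seconds to run argv `runs` times."""
    start = time.perf_counter()
    for _ in range(runs):
        # A list argv with the default shell=False runs the program
        # directly, so no shell startup is included in the timing.
        subprocess.run(argv, check=True)
    return time.perf_counter() - start


def compare_site_cost(runs=100):
    """Time the trivial script with and without the site import."""
    with_site = time_startup([sys.executable, "-c", "1"], runs)
    without_site = time_startup([sys.executable, "-S", "-c", "1"], runs)
    return with_site, without_site


# Possible usage:
#     with_site, without_site = compare_site_cost()
#     print("with site: %.2fs  with -S: %.2fs" % (with_site, without_site))
```

On a loaded machine the totals will vary a lot from run to run, so
repeating the comparison several times is advisable.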

        Greg
-- 
Greg Ward - software developer                    gward@cnri.reston.va.us
Corporation for National Research Initiatives    
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913